Re: Dublex (was: Washing-machine words (was: Futurese, Chinese,

From: And Rosta <a-rosta@...>
Date: Friday, May 17, 2002, 2:01
Jeffrey:
> And Rosta comunu:
>
> > (1) Compounding is not the only alternative to creating a new and totally unanalysable root. There are various alternatives, including
> > * arbitrary or quasi-systematic modification of an existing semantically related root or stem
> > * derivational affixes
> > * having very many roots, but organizing them into paradigms such that roots with related meaning have similar forms, possibly in a relatively systematic way
>
> I've decided that one of the most difficult aspects of learning a language is mastering sufficient vocabulary. Contemplating this led to my decision to design Dublex and to have the goal of a lexicon that was as easy to learn as possible with the primary way of achieving that goal being keeping the lexicon as small as feasible.
Fair enough. Though, as you must be aware, the size of the lexicon is equivalent to the number of meanings encoded by words. The smaller the lexicon, the fewer concepts are named by the language. So in my view, a small lexicon is pointless. The smaller the lexicon, the more circumlocution is required. Is it really so helpful to, say, have no word for 'platypus' and instead have to say 'egg-laying mammal with duck-like mouth and otter-like body'?
> I decided compounding was the simplest derivational morphology to learn.
Again, fair enough. But I think the considerations of aptness that I mentioned override this. Furthermore, a relatively opaque compound would be a positive hindrance to learning, if it leads the learner up the garden path.
> Before designing the Dublex phonology, I studied a number of books on memorization, and these works said CVC forms were easier to learn than CV or VC forms. While the original goal had been 400 CVC forms, it is even easier to memorize roots with different shapes, which led me to expand the morphology to five forms (CVC, CVCC, CVCVC, CVCVCC, CVCVCVC). This gives the brain another point of difference in memorization.
More generally, I think it would be the case that having roots differing in several points of contrast would help memorization, or at least recognition. (Do vocab recognition and memorization work alike? Probably not.)
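The five root shapes quoted above, and the idea of "points of contrast" between roots, can be made concrete with a small Python sketch. It is only an illustration: the vowel set `aeiou` is an assumption (the thread does not give the Dublex inventory), and the function names are hypothetical. The sample roots `cat` and `voc` are taken from the 'catvoc' example quoted later in this message.

```python
from itertools import zip_longest

# Assumption: the Dublex vowel inventory is not stated in this thread.
VOWELS = set("aeiou")

# The five root shapes listed in the quoted text.
ROOT_SHAPES = ("CVC", "CVCC", "CVCVC", "CVCVCC", "CVCVCVC")


def cv_shape(form: str) -> str:
    """Return the consonant/vowel skeleton of a form, e.g. 'cat' -> 'CVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in form.lower())


def has_root_shape(form: str) -> bool:
    """True if the form's skeleton is one of the five listed shapes."""
    return cv_shape(form) in ROOT_SHAPES


def points_of_contrast(a: str, b: str) -> int:
    """Count positions where two forms differ; a missing position counts as a difference."""
    return sum(x != y for x, y in zip_longest(a, b))


print(cv_shape("cat"), has_root_shape("cat"))
print(cv_shape("catvoc"), has_root_shape("catvoc"))
print(points_of_contrast("cat", "voc"))
```

A compound such as 'catvoc' has the skeleton CVCCVC, which is not itself one of the five root shapes, so under these assumptions a reader could tell a root from a compound by shape alone.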
> > -- Sometimes, these alternatives yield apter stems than compounding does. A compound X+Y is apt if the denotatum is X and is Y, or if (in a head-final compound) it is a Y a salient characteristic of which is saliently associated with X. But not all new concepts can be expressed by such apt compounds.
>
> Yes, compounds have their strengths and weaknesses. I think you and I and Ray agree that what makes language engineering so challenging and rewarding is balancing the constant trade-offs. One of the goals of Dublex is to develop the best vocabulary for compounding; i.e., to develop the lexicon which forms more "apt compounds" than any other compounding (or even affixing) lexicon.
Okay. But if the goal was maximal learnability, such heavy reliance on compounding might be excessive.
> > (2) Compounding yields unnecessarily long stems...
> > If you care about concision -- and almost all language users do -- then you want words to be as short as possible.
>
> I would disagree, but just slightly. I would say, "If you care about concision -- and almost all *fluent* users of a language do -- then you want the *most common* words to be as short as possible."
Not just fluent users. I observe that even though concision was never a goal of Lojban -- and indeed the Lojban programme in principle requires users to ignore concision -- actual Lojban users are very very strongly influenced by considerations of concision. And nobody is fluent in Lojban -- well, maybe a small handful, but the concern for concision is not at all restricted to them. And it's not so much a case of wanting the most common words to be as short as possible as wanting the average sentence/clause/proposition to be as short as possible.
> Myself I probably care least about concision of any language *designer* I know.
Surprisingly, it isn't one of the major preoccupations of conlangers or even of engelangers. For example, there's far more discussion of self-segmentation than of concision, even though the latter is of much greater practical utility. (Yet on the rare occasions when people have been exposed to Livagian text, it is the concision that has attracted comment.)
> The heck with gemütlichkeit for experienced language users -- I'm designing for intermediate language users who are constantly trying to remember words and unravel texts that interest them. Longer forms offer more clues. :-)
Ah, well this is a different criterion: how easy it is to get from having zero knowledge to being able to puzzle out the meaning of a text. And by this criterion, I think your approach is near-optimal. It's a language designed for beginner readers.
> Now that said, Dublex does have methods of shortening words once introduced: acronyming a long compound for quick reference; abbreviating subsequent appearances of the word. The first person to write an article in English on "what-you-see-is-what-you-get graphical-user interfaces" quickly stumbled on "WYSIWYG GUI" as a shorthand for the rest of the article. Dublex texts may have 'catvoc' (cut+words) that are shortened forms of introduced terms. So 'viclavmahin' might be referred to as 'viclav' without confusion later and repeatedly.
How do these two devices work?
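One hypothetical reading of the two devices, sketched in Python purely for illustration. Nothing here is confirmed by the quoted description: the segmentation of 'viclavmahin' into `vic` + `lav` + `mahin` is an assumption, as are the function names and the rule that a clipped form keeps the leading roots. The acronym rule (first letter of each component) is modelled on the English "WYSIWYG GUI" example.

```python
def acronym(parts: list[str]) -> str:
    """Acronym device (assumed rule): first letter of each component, upper-cased."""
    return "".join(p[0] for p in parts).upper()


def clip(parts: list[str], keep: int = 2) -> str:
    """'catvoc' device (assumed rule): keep only the first `keep` components."""
    return "".join(parts[:keep])


def shorten_after_first_use(tokens: list[str], full: str, short: str) -> list[str]:
    """Leave the first occurrence of `full` intact and replace later ones with `short`."""
    seen = False
    out = []
    for tok in tokens:
        if tok == full:
            out.append(short if seen else full)
            seen = True
        else:
            out.append(tok)
    return out


# Assumed segmentation of the compound quoted above.
parts = ["vic", "lav", "mahin"]
full = "".join(parts)

print(acronym(parts))
print(clip(parts))
print(shorten_after_first_use([full, "x", full, "y", full], full, clip(parts)))
```

Under this reading the full compound is introduced once, which keeps the extra clues for a beginner reader, and only the repeats are shortened.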
> > Moving on to the general Dublex experiment, I don't really see anything magically special about roots. The inventory of a language's morphological or etymological roots tends to be rather accidental -- accidents of history.
>
> Exactly. And with an engineered language you don't want accidents but you want to systematically develop the most productive system you can.
>
> > They don't represent semantic primitives or anything truly elemental to the cognitive structures underlying language.
>
> No, that's not a goal at all, and I would shudder that someone in the future would ever claim Dublex's efficiency on this score somehow reflective of how our brains work.
>
> > Hence although your Dublex goal interests me by virtue of being an engelanging exercise, its specific goal is not one I myself think worthwhile to the world at large.
>
> Well, I did ask you for your opinion. :-)
I hadn't properly apprehended the goal of Dublex. You normally just present it as a 400-root compound-rich language. But now that I realize its real goal is to be an optimal language for beginner readers, I can appreciate it much better.
> It is certainly not of interest to the world at large. Reasons I think it is of interest to the *conlang* world at large:
> - For artlangers, with very little work the system can suggest thousands of new compounds, which they can adopt on a case by case basis into their language.
> - For engelangers, they can expand or contract the system. So if improving concision is a goal, they can identify the 200 most used compounds in the lexicon and expand the root system to have 600 roots, which would yield shorter words in their language. If minimalizing the lexicon is a goal (there are a number of ~100-word languages out there) they can work on that as well by rephrasing Dublex roots as compounds in their reduced set. I think the system could be very helpful for developing a briefscript, as I e-mailed Ray Brown earlier.
> - For auxlangers, well, I know that one auxlanger is using the lexicon to introduce the most apt of its compounds into his IAL, which primarily uses derivational affixes of very broad ranges of meaning.
> - Conlangers of all sorts looking to generate word lists often turn to WordNet, Roget's Thesaurus or the Universal Language Dictionary for creating unique roots. Up until now there hasn't been a useful list for generating derivational lexicons.
I agree with all of this, especially if a high degree of quality-control is pre-applied to the lexicon, to ensure a high degree of aptness. For my own conlanging, the 400 roots are an irrelevance, but a list of high quality compounds could be quite useful.

--And.