
Re: Dublex (was: Washing-machine words (was: Futurese, Chinese,

From:Jeffrey Henning <jeffrey@...>
Date:Monday, May 20, 2002, 3:12
And Rosta <a-rosta@...> comunu:

> How does the Dublex programme go about doing this? For example, are there
> different candidate sets of 400? -- E.g. you might start with 1000 and
> choose the 400 best. Or you might start with no upper number, but take
> words from English and replace them by compounds, recursively replacing
> the constituents of compounds by further compounds, until you end up
> with 400 'atomic' morphemes. And how is the usefulness of a word
> measured?
Good questions! Basically, I first developed the 400-root word list by studying the Universal Language Dictionary (the most comprehensive short wordlist around, IMO), the Lojban gismu, Basic English and Esperanto. I added a few words that I wanted to make sure were included so that I could describe the language in the language (e.g., 'nomin' and 'verb' for "noun" and "verb"). The initial 400 was my *subjective* take on the 400 roots that would be most productive.

While I have locked in the idea of using 400 roots,* I want the morpheme list to evolve and improve over time. So I apply the concept of survival of the fittest to the 400 morphemes. The weakest morphemes of the herd can be killed off by new, stronger morphemes.

The strength of a morpheme equals the number of two-root compound words that can be formed from it. The strongest possible morpheme would form 399 words with it as the modifying morpheme and 399 words with it as the base morpheme, for a strength of 798. In practice the average root currently has a strength of 24, meaning each root forms 24 two-root compounds on average (but the median strength is 14).

Here's how this evolution works in practice:

Suhvoclete Repfaba Sist [Root Revision System]

1. Choose 40 roots at random (10% of the roots).
2. Coin compound words from these, once using the root you want to deprecate and once using the root you want to add.
3. Multiply the productivity of each by 10 to estimate the productivity with all 400 roots.
4. If the new root wins, add all of its coined words. Re-coin all compounds that use the old root.

I just did this for "door" vs. "noun", and came up with an estimated productivity of 60 compounds for "door" vs. 10 for "noun". So "noun" will be culled from the herd (the only compounds from it were for "grammar" and "pronoun").

(*The number 400 itself was subjectively chosen, and one of the points of the Dublex experiment is to generate some statistics on the effect of morpheme count on word length.)
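The strength measure and the four revision steps above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure's actual tooling: the function names are hypothetical, compounds are assumed to be stored as (modifier, base) pairs, and coining the words remains a human judgment, so the code takes the resulting counts as input. The sample counts of 6 and 1 are inferred by dividing the reported estimates of 60 and 10 by 10.

```python
import random

ROOT_COUNT = 400       # size of the root inventory, from the post
SAMPLE_FRACTION = 10   # sample one root in ten (40 of 400)


def max_strength(root_count=ROOT_COUNT):
    """Upper bound: a root pairs with every other root, as modifier and as base."""
    return 2 * (root_count - 1)


def strength(root, compounds):
    """Count the two-root compounds in which `root` is the modifier or the base.

    `compounds` is assumed to be an iterable of (modifier, base) pairs.
    """
    return sum(1 for modifier, base in compounds if root in (modifier, base))


def sample_roots(roots, rng=random):
    """Step 1: choose 10% of the roots at random."""
    return rng.sample(roots, len(roots) // SAMPLE_FRACTION)


def estimate_productivity(coined_with_sample):
    """Step 3: scale the count from the 10% sample up to the full inventory."""
    return coined_with_sample * SAMPLE_FRACTION


def choose_root(old_root, old_count, new_root, new_count):
    """Step 4: the new root wins only if its estimated productivity is higher."""
    if estimate_productivity(new_count) > estimate_productivity(old_count):
        return new_root
    return old_root


# Hypothetical usage with English glosses standing in for the roots.
winner = choose_root("noun", 1, "door", 6)
```

With these inputs, `estimate_productivity(6)` gives 60 and `estimate_productivity(1)` gives 10, matching the "door" vs. "noun" figures, and `max_strength()` gives the 798 ceiling mentioned above.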
Best regards,
Jeffrey
http://jeffrey.henning.com
http://www.langmaker.com

