Conlang: Re: Most developed conlang (Henrik Theiling, Apr 25 '07, 11:36)

From:	Henrik Theiling <theiling@...>
Date:	Wednesday, April 25, 2007, 11:36

Hi! Jim Henry writes:

> > > el- don -ej -o
> > > 0.95, 0.20, 0.0
> ....
> > > or thereabouts. How would we combine these to
> > > get an overall opacity score for the word?
>
> > The total score should of course be the product of those values, since
> > from the core pieces, each level of opaqueness influences the
> > opaqueness of the whole by its morpheme boundary level.
>
> But multiplying the nonzero values would give a lower opacity
> score for "eldonejo" than for "eldoni", when in
> fact "eldonejo" is slightly more opaque than "eldoni".
> And if we multiply all values then any word that has at least
> one perfectly transparent morpheme boundary
> would get a perfectly-transparent opacity score of 0!

Oops! That's not what I wanted.

> Maybe it would be better to multiply the
> _transparency_ scores rather than _opacity_ scores,
>
> (1 - n_0) * (1 - n_1) * (1 - n_2) ....
> in this case,
> (1 - 0.95) * (1 - 0.20) * (1 - 0)
> = 0.05 * 0.80
> = 0.04 (transparency)
>
> and then subtract that from 1 to get its
> opacity score, = 0.96.

You are absolutely right, that's much more sensible. I had actually mixed up the two levels. But we might agree that this type of math may be mainly for fun anyway. :-)
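Just to make the arithmetic concrete, here is a minimal Python sketch of the combination Jim describes. The function name `combined_opacity` is made up for illustration, and the boundary scores are the rough guesses quoted above, not measured values.

```python
from math import prod

def combined_opacity(boundary_opacities):
    """Combine per-boundary opacity scores (each in [0, 1]) into one word score.

    Multiplies the transparencies (1 - n_i) and subtracts the product from 1.
    """
    transparency = prod(1.0 - n for n in boundary_opacities)
    return 1.0 - transparency

# el- don -ej -o, with the rough scores quoted above
print(round(combined_opacity([0.95, 0.20, 0.0]), 2))  # 0.96
```

With this rule a perfectly transparent boundary (score 0) leaves the result unchanged, and the word can never come out less opaque than its most opaque boundary.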

> > > "eldoni" and "eldonejo" in the lexicon to inflate
> > > the count too much since the latter builds on
> > > the former and is almost transparent if you already
> > > know "eldoni". ...
> >
> > This is more tricky, yes. In the lexicon of Þrjótrunn, I have an
> > operation that cuts off parts of an existing entry for construction of
> > a new one. Maybe that would be feasible?
>
> Can you clarify further?

Well, it is currently a simple string operation -- not linguistically founded, but still helpful for linguistics: you could chop off the last three characters of 'eldonejo' and use the stub 'eldon' for further operations.
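As a rough illustration (this is not the actual Þrjótrunn lexicon code; the helper name `chop` is hypothetical), the operation amounts to plain string slicing:

```python
def chop(entry, n):
    """Drop the last n characters of a lexicon entry (purely a string operation)."""
    if not 0 <= n <= len(entry):
        raise ValueError("cannot chop more characters than the entry has")
    return entry[:len(entry) - n]

stub = chop("eldonejo", 3)  # 'eldon'
verb = stub + "i"           # 'eldoni'
```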

> I think Alex Fink's suggestions were probably
> along the right lines, at least vis-a-vis lexicon
> counting: count only the outermost branching.

But when the results of previous branching steps are not part of the lexicon, e.g. because two morphemes are added to form a new word while adding only the first one leaves you with garbage, then it's not the best way, I think. Instead, I would propose to multiply the scores of all boundaries that do not result in anything already in the lexicon, so that you get a recursive derivation tree. E.g. if you have ABC in the lexicon already and want to add ABCDE, and if ABCD does not exist, then either assign the operation +DE one score and use this for a lexicon entry, or multiply the scores of +D and +E. What this will give you is a score for deriving this word from some shorter word in the lexicon. Yes, this is probably what you want for lexicon counting. And multiplying all scores will give you the score for deriving that word from scratch, i.e., from roots only.

**Henrik
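The ABC / ABCDE case can be sketched in Python as follows. This is only a sketch under several assumptions: "multiplying the scores" is taken to mean the transparency product Jim suggested, words are treated as a simple left-to-right chain of morphemes, the base has to be in the lexicon as a bare concatenation of its morphemes, and the name `derivation_opacity` and all the numbers are invented for illustration.

```python
from math import prod

def derivation_opacity(morphemes, boundary_opacities, lexicon):
    """Opacity of deriving a word from the longest shorter form in the lexicon.

    boundary_opacities[i] is the score of the boundary between morphemes[i]
    and morphemes[i + 1]. Boundaries inside the known base are not counted.
    With no known base, all boundaries count (derivation from roots only).
    """
    if len(boundary_opacities) != len(morphemes) - 1:
        raise ValueError("need exactly one score per morpheme boundary")
    start = 0  # index of the first boundary that still has to be counted
    for k in range(len(morphemes) - 1, 0, -1):  # longest proper prefix first
        if "".join(morphemes[:k]) in lexicon:
            start = k - 1
            break
    return 1.0 - prod(1.0 - n for n in boundary_opacities[start:])

word = ["A", "B", "C", "D", "E"]
scores = [0.5, 0.5, 0.2, 0.1]  # hypothetical scores for A|B, B|C, C|D, D|E

# ABC is known, ABCD is not: only +D and +E are counted.
print(round(derivation_opacity(word, scores, {"ABC"}), 2))  # 0.28
# Empty lexicon: derivation from scratch, all four boundaries are counted.
print(round(derivation_opacity(word, scores, set()), 2))    # 0.82
```

The other option, giving +DE a single score of its own, would just mean storing one number for that step instead of combining two.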
