Re: Most developed conlang
From: Jim Henry <jimhenry1973@...>
Date: Monday, April 23, 2007, 14:35
On 4/23/07, Alex Fink <a4pq1injbok_0@...> wrote:
> On Sun, 22 Apr 2007 18:47:58 -0400, Jim Henry <jimhenry1973@...> wrote:
> >On the other hand, transparency/opacity is a
> >continuous rather than a boolean quality. Some
> >"transparent" compounds are more transparent
> >than others, some "opaque" compounds are more
> >opaque than others; and the same is true of
> >idiomatic phrases. So maybe the semantic transparency
> >field gets real numbers ranging from 0.0 to 1.0, and
> >the overall word count for the language would probably
> >be non-integral.
....
> >rather than the word level. For instance, in E-o
> >"el-don-ej-o" there are three morpheme boundaries,
> >one perfectly transparent (ej-o), one somewhat
> >transparent (between el-don and -ej), and one
> >almost completely opaque (el-don). We might
> >assign them transparency (or rather opacity)
> >scores of
> >
> >el- don -ej -o
> > 0.95, 0.20, 0.0
> >
> >or thereabouts. How would we combine these to
> >get an overall opacity score for the word?
......
> What's the problem here? Only the outermost opacity should count, if you
> assume the branching is binary so that there is an outermost derivational
> operation. In this case I gather the base of <eldonejo> is <eldoni>; so
> <eldon-> counts for 0.95 of a lexical item, <eldonej-> for 0.2, and
> <eldonejo> for none (if you reckon it in your count at all, which is a moot
> question).
OK, that makes sense.
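As a rough sketch of that counting rule in Python: each derived stem contributes only the opacity of its outermost derivational step, and the contributions are summed into a non-integral item count. The scores are the illustrative ones from above; the names outermost_opacity and fractional_item_count are made up for this example, and how the bare roots themselves are counted is left out.

```python
# Outermost-step opacity for each stem in the "eldonejo" example.
# The values are the illustrative guesses from this thread, not measurements.
outermost_opacity = {
    "eldon-": 0.95,    # el- + don-: almost completely opaque
    "eldonej-": 0.20,  # eldon- + -ej: somewhat transparent
    "eldonejo": 0.0,   # eldonej- + -o: perfectly transparent
}


def fractional_item_count(opacities):
    """Sum the outermost opacities; each stem counts as that fraction of a lexical item."""
    return sum(opacities.values())


total = fractional_item_count(outermost_opacity)
print(f"These three stems add about {total:.2f} lexical items to the count")
```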
> Overall, though, I like this idea of non-integral counting, making opacity ~
> compositionality of a derivation, or listemicity of an item, a fuzzy
> concept. Now if only there were some way to systematically make statements
> like "the opacity of the derivation 'speak' > '(loud)speaker' is 0.6931"...
Here's an impractical method that seems theoretically valid to me:
1. Give 100 people who aren't speakers of the language in question
a set of definitions of root morphemes.
2. Give them a list of compounds using those morphemes. Ask
them to guess what the compound words mean.
3. The opacity score of each compound word is
(100 - number of people who correctly guessed its meaning) / 100
So if nobody guessed it right it is 1.0, completely opaque.
If only 5 out of 100 people guessed its meaning, it would
get a score of 0.95. If 80 people guessed it, 0.20; if everyone
guessed it, 0.0 (perfectly transparent).
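The scoring in step 3 can be sketched in a few lines of Python. This is only an illustration: the function name opacity_score and its parameters are invented here, and the panel size is left adjustable on the assumption that one might not poll exactly 100 people.

```python
def opacity_score(correct_guesses, respondents=100):
    """Return the fraction of respondents who did NOT guess the meaning correctly.

    1.0 means completely opaque, 0.0 means perfectly transparent.
    """
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    if not 0 <= correct_guesses <= respondents:
        raise ValueError("correct_guesses must be between 0 and respondents")
    return (respondents - correct_guesses) / respondents


# The four cases mentioned above: 0, 5, 80 and 100 correct guesses out of 100.
for correct in (0, 5, 80, 100):
    print(correct, opacity_score(correct))
```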
I said it's impractical, thinking it would be too expensive to
poll as many people as necessary to evaluate every compound
word in a natlang or even a sizable conlang; but maybe you
could make it a game like the Google Image Labeler and actually
get a statistically significant number of people to participate.
--
Jim Henry
http://www.pobox.com/~jimhenry/gzb/gzb.htm