Re: Avoiding near-collisions in vocabulary coinage

From:	Jim Henry <jimhenry1973@...>
Date:	Tuesday, August 5, 2008, 14:56


On Mon, Aug 4, 2008 at 7:18 PM, Eldin Raigmore <eldin_raigmore@...> wrote:

> One is counting Wickelphones.  (A Wickelphone is a set of three consecutive
> phonemes that occur in a word; plus one for the first two phonemes of the
> word and one for the last two phonemes of the word.  So "haplology" would
<snip>

That sounds useful.  Do you have or know of a script
to apply this wickelphonic-similarity calculation to a set of words?

However, the vast majority of gzb roots being only 2-4 phonemes long,
it might be less useful for gzb than for other languages.
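
To make the short-root concern concrete, here is a rough Python sketch of
just the extraction step as defined in the quoted text above. The actual
similarity calculation was snipped, so nothing here should be read as
Eldin's scoring; the function names, the "#" edge marker, and the
one-character-per-phoneme assumption are all mine.

```python
def wickelphones(word):
    """Return the set of Wickelphones of a word.

    Assumes one character per phoneme (a simplification) and uses "#"
    as a hypothetical word-boundary marker for the two edge units.
    """
    p = list(word)
    if len(p) < 2:
        return set()
    # Every run of three consecutive phonemes.
    units = {tuple(p[i:i + 3]) for i in range(len(p) - 2)}
    # Plus one unit for the first two phonemes and one for the last two.
    units.add(("#", p[0], p[1]))
    units.add((p[-2], p[-1], "#"))
    return units


def shared_wickelphones(a, b):
    """Wickelphones two words have in common (not a full similarity score)."""
    return wickelphones(a) & wickelphones(b)
```

A two-phoneme root yields only its two edge units, and a four-phoneme
word like "kalu" yields four, so there is not much to compare.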

> Another is based on the fact that words with the same sounds in the first four
> or last four phonemes of the word are likely to be confused with each other.

In gzb nearly all morphemes are one syllable, and no syllable
is more than five phonemes (the average is 3.36 phonemes per
root morpheme).  So for nearly all words, all phonemes
would be relevant for this kind of similarity calculation.  (There's only one
root in the lexicon with more than 8 phonemes, {θrî'sě'kjurn} "ibis".)

My present technique just looks for words that have similar phonemes
in each or any slot in the word.  So (simplifying, and using Kalusa
phonology instead of gzb phonology), if I were checking to see if
a potential word "kalu" were too similar to an existing word, I would run it
through a script that turns it into the regex

/^[kg][ae][lr][uo]$/

and then searches the lexicon for words matching that regex, which
would turn up (if they existed) "galu", "gero", "karu", etc., along
with their glosses, and I could decide if any of them really
sounded too similar to "kalu" and also had too-similar meanings.
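
A minimal Python sketch of that procedure, not my actual script: the
four confusable pairs are just the ones from the "kalu" example, the
function names are made up for illustration, it assumes one character
per phoneme, and the "?" glosses are placeholders.

```python
import re

# Groups of phonemes treated as confusable. Only these four pairs come
# from the "kalu" example; a real table would cover the whole inventory.
SIMILAR_SETS = ["kg", "ae", "lr", "uo"]


def similarity_regex(word):
    """Build an anchored regex with one character class per phoneme slot."""
    parts = []
    for ph in word:
        # Use the confusable group containing this phoneme, if any.
        group = next((s for s in SIMILAR_SETS if ph in s), None)
        parts.append("[" + group + "]" if group else re.escape(ph))
    return re.compile("^" + "".join(parts) + "$")


def near_collisions(candidate, lexicon):
    """Return (word, gloss) pairs from the lexicon that match the regex."""
    pattern = similarity_regex(candidate)
    return [(w, g) for w, g in lexicon.items()
            if w != candidate and pattern.match(w)]


# Placeholder lexicon; the glosses are deliberately left as "?".
lexicon = {"galu": "?", "gero": "?", "karu": "?", "miso": "?"}
for word, gloss in near_collisions("kalu", lexicon):
    print(word, gloss)
```

The script only narrows the list; deciding whether a match is really
too close in both sound and meaning is still done by eye.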

> So, here's my suggestion (and I'm just going to assume that both words have
> the first and last syllables stressed):
> Give the pair 14 points if they have the same first phoneme.
> Give the pair 13 more points if they have the same last phoneme......<snip>

Do you have a script to apply this calculation you describe?

--
Jim Henry
http://www.pobox.com/~jimhenry/
