Re: Using word generators (was Re: Semitic root word list?)
From: Jim Henry <jimhenry1973@...>
Date: Wednesday, January 10, 2007, 19:06
On 1/9/07, David J. Peterson <dedalvs@...> wrote:
> The only problem I have found with this approach is that it can
> lead to an unbalanced phonology. For example, especially with
> my language Njaama, the bilabial and palatal click (which, admittedly,
> were not in the phonology from the beginning) rarely make an
> appearance (this became glaringly apparent when I participated
...
> As a result, if I haven't got a good idea how I want a word to
> sound, I tend to look around and see what phonemes are
> underrepresented, and make sure they pop up in the word I'm
> creating. It's not perfect, but it does help to prevent the same
> phonemes from being used over and over again.
I don't use auto-generated words for gzb, but I regularly use
a script to generate a histogram of the most common
initial phonemes, onsets, and rimes in the dictionary, and
use that as a reference when coining new words, looking
to use the so-far less common phonemes and combinations
when that fits the sense of the word.
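A rough sketch of what such a histogram script could look like, in Python. This is not the actual gzb script. It assumes a toy lexicon whose words are already syllabified with "." and written one character per phoneme, and it splits each syllable into onset and rime at the first vowel. The vowel set, the sample words, and the function names are all made up for illustration.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: toy inventory, one character per phoneme


def split_syllable(syllable):
    """Split a syllable into (onset, rime) at its first vowel."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[:i], syllable[i:]
    return syllable, ""  # no vowel found: treat the whole thing as onset


def tally(lexicon):
    """Count word-initial phonemes, onsets, and rimes over syllabified words."""
    initials, onsets, rimes = Counter(), Counter(), Counter()
    for word in lexicon:
        if not word:
            continue  # skip empty entries
        syllables = word.split(".")
        initials[syllables[0][0]] += 1
        for syl in syllables:
            onset, rime = split_syllable(syl)
            onsets[onset] += 1
            rimes[rime] += 1
    return initials, onsets, rimes


def histogram(counter):
    """Render a counter as text bars, most common item first."""
    lines = []
    for item, n in counter.most_common():
        lines.append(f"{item or '(none)':>6} {'#' * n} {n}")
    return "\n".join(lines)


# Hypothetical sample lexicon, not real gzb words.
lexicon = ["ka.ta", "kra.mi", "ta.ku", "a.ki", "pli.ka"]
initials, onsets, rimes = tally(lexicon)
for title, counter in (("initials", initials), ("onsets", onsets), ("rimes", rimes)):
    print(title)
    print(histogram(counter))
```

Reading the output from the bottom up shows which onsets and rimes are so far underused and so are candidates for the next coinage.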
For a certain engelang that's been on the back burner for a
while, use of auto-generated vocabulary is an essential
design feature, though I've manually tweaked some parts
of it. The final version of it may have more manually
selected word forms and fewer auto-generated ones, but
the general plan is to regenerate the vocabulary each
time the size of the corpus passes some new
milestone, giving the shortest forms to the most
frequent words.
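The regeneration step might be sketched like this in Python. It is only an illustration under stated assumptions, not the engelang's actual generator: the consonant and vowel inventories, the plain CV syllable shape, and the names forms_by_length and assign_forms are all hypothetical. Candidate forms are enumerated shortest first, concepts are ranked by corpus frequency, and the two lists are paired off.

```python
from collections import Counter
from itertools import count, product

CONSONANTS = "ptk"  # assumption: toy inventory
VOWELS = "ai"       # assumption: toy inventory


def forms_by_length():
    """Yield word forms built from CV syllables, shortest first."""
    syllables = [c + v for c in CONSONANTS for v in VOWELS]
    for n in count(1):
        for combo in product(syllables, repeat=n):
            yield "".join(combo)


def assign_forms(corpus_tokens):
    """Give the most frequent concepts the shortest available forms."""
    freq = Counter(corpus_tokens)
    # Descending frequency; ties broken alphabetically so reruns are stable.
    ranked = sorted(freq, key=lambda w: (-freq[w], w))
    # zip stops when the ranked list runs out, so the endless generator is safe.
    return dict(zip(ranked, forms_by_length()))


# Hypothetical corpus of concept glosses.
corpus = "water see water person see water stone".split()
vocab = assign_forms(corpus)
for gloss, form in vocab.items():
    print(gloss, form)
```

Rerunning assign_forms on a larger corpus can hand an existing concept a different form, which is why regenerating only at milestones, rather than continuously, keeps the churn manageable.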
--
Jim Henry
http://www.pobox.com/~jimhenry/gzb/gzb.htm