
Re: Phonologically redundant vocabulary

From:Jim Henry <jimhenry1973@...>
Date:Thursday, April 13, 2006, 14:22
On 4/13/06, Henrik Theiling <theiling@...> wrote:

> Jim Henry writes:
> > A while ago there was a thread about using phonologically
> > redundant vocabulary (no minimal pairs). I've been working
> > (intermittently) on methods and scripts to generate lists
> > of such words. I started writing something which turned
> > out to be a bit long for a listgroup post, so here it is as
> > an article on my website:
> >
> >
> Nice stuff. :-)
>
> At the time of the thread, I was also thinking of an engelang with a
> self-segregating morphology plus redundant word building. These
> are nice tools for implementing a beast like that.
Cool. I'm sure our languages will
> Thanks for sharing!
I'll be updating the article, and most likely the scripts, again in a few days -- there are points that I forgot to cover re: the format of the input files, an algorithm I haven't yet implemented for efficiently searching for strings with three or more points of difference, the need for stricter redundancy criteria for longer words, etc.

For instance, the basic requirement to have no minimal pairs can result in sets of 3-syllable CV(n)CV(n)CV(n) words that include very similar subsets like:

nokunpun jakunpun kikunpun tukunpun lunkunpun sonkunpun
sipunpun nanpunpun jenpunpun unpunpun linpunpun

All differ by two phonemes, but the words in each subset share two syllables. John E Clifford, in offlist correspondence, has suggested that maybe one should require words to have no entire syllables in common. I'm not yet sure how to restate that in terms of a minimum number of characters different; maybe for strings of 9 characters, a minimum of 6 characters different would be equivalent to a minimum of 2 characters different for strings only 3 characters long.

I have a vague idea how to search more efficiently for strings with 3 or more characters different, but I suspect it will be exponentially slower than the 2-character search script.

Basically, if for a 2-character redundancy search you block off all the cells in the same row, column and stack as the cell representing a word you've picked, and then move diagonally in the same plane to look for another open space -- for a 3-character redundancy search you would block off all cells in all the _planes_ that intersect at the chosen cell, and then move diagonally (meta-diagonally?) into another plane in the same cube...

Writing the code for that search in an arbitrary number of dimensions will be hairy, and writing code for searching for an arbitrary minimum number of characters different will be even worse.

-- Jim Henry
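
The two criteria discussed above (a minimum number of differing characters, and a limit on shared syllables) can be illustrated with a small Python sketch. This is not the code from the article. It is a brute-force greedy filter that compares every candidate against every word already kept, so it sidesteps the row/column/plane blocking idea instead of implementing it. The function names, the toy CVC inventory, and the regex for splitting CV(n) syllables (which assumes the vowels are a, e, i, o, u) are all assumptions made for illustration.

```python
import re
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("hamming() needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))

def pick_redundant(candidates, min_diff=2):
    """Greedily keep each candidate that differs from every word kept so
    far in at least min_diff positions (brute force, O(n^2) comparisons)."""
    chosen = []
    for word in candidates:
        if all(hamming(word, kept) >= min_diff for kept in chosen):
            chosen.append(word)
    return chosen

# Assumed syllable shape: optional consonant, vowel, optional coda "n".
# The lookahead stops an "n" from being taken as a coda when a vowel
# follows it, so that "nanu" splits as na-nu rather than nan-u.
SYLLABLE = re.compile(r"[^aeiou]?[aeiou]n?(?![aeiou])")

def syllabify(word):
    """Split a CV(n)CV(n)... word into a list of syllables."""
    return SYLLABLE.findall(word)

def shared_syllables(a, b):
    """Count syllables that two syllabified words share in the same position."""
    return sum(x == y for x, y in zip(a, b))

# Hypothetical toy inventory: every CVC string over three consonants
# and three vowels (27 candidates, i.e. a 3 x 3 x 3 cube of cells).
cvc = ["".join(p) for p in product("ptk", "aiu", "ptk")]

no_minimal_pairs = pick_redundant(cvc, min_diff=2)
three_apart = pick_redundant(cvc, min_diff=3)

print(no_minimal_pairs)
print(three_apart)
print(shared_syllables(syllabify("nokunpun"), syllabify("jakunpun")))
```

With `min_diff=3` on this toy cube the greedy pass keeps only words lying along the main diagonal, which matches the intuition of moving diagonally to find the next open cell. The syllable check could be added as a second condition inside `pick_redundant` to enforce the stricter criterion, at the cost of fixing a syllabification rule in advance.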


Henrik Theiling <theiling@...>