
Re: New Word Generator

From:Petr Mejzlík <imploder@...>
Date:Saturday, September 29, 2007, 20:21
Hi there! Posts from LISTSERV have been arriving since I signed up, so it seems to
finally work. Now I'll try to repost the two replies that Yahoo rejected
yesterday. This is the first:

--- In conlang@yahoogroups.com, Benct Philip Jonsson <conlang@...>
wrote:

> A great generator, and very similar in its external to one I
> wrote in perl but never put online :-)
I'm glad that so many people like it. Maybe you should put your program online: there apparently are conlangers who would have a use for it and are looking for a good generator. I don't like generators much because they usually give rather awkward results and need precise instructions. One that could learn automatically would be amazing, but I guess that's far-fetched and really complicated to make. OTOH, people can pick up a new phonotactics intuitively, so a computer should be able to do it as well (and since it's not about meaning, it shouldn't require human knowledge). It would have to identify and classify the phonemes with some really neat statistical algorithm. Not that I know how to do it, though.
> I'd like to know how the nested parentheses thing works...
It works very simply, like parentheses in maths. Whatever is inside a pair of brackets is treated as a single unit, as if it were a single letter. You can combine these units further and build larger units that contain them. A block enclosed in [ and ] will always behave like one letter, even if it is rendered as ten letters or more. Round brackets do the same, except that what's inside them may or may not appear in the word.
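
The bracket behaviour described above can be sketched as a small recursive expander. This is only an illustration in Python, not the generator's actual code: the function name `generate`, the `classes` table, and the 50% chance of keeping an optional group are all assumptions, and any character that is not a class name or bracket is simply copied into the word.

```python
import random

# Hypothetical class table, taken from the settings quoted in this thread.
classes = {
    "V": "a/i/u/ya/yu".split("/"),
    "C": "p/t/k/kw/b/d/g/gw/f/s/x/m/n/ñ/r/w/y".split("/"),
    "F": "s/t/n/r".split("/"),
}

def generate(pattern, classes, rng=random):
    """Expand one word: (...) is an optional unit, [...] an obligatory unit."""
    pos = 0

    def parse_seq(closer):
        nonlocal pos
        out = []
        while pos < len(pattern) and pattern[pos] != closer:
            ch = pattern[pos]
            pos += 1
            if ch == "(":
                inner = parse_seq(")")   # the whole group is one unit
                pos += 1                 # skip the closing ')'
                if rng.random() < 0.5:   # assumed: kept half of the time
                    out.append(inner)
            elif ch == "[":
                inner = parse_seq("]")   # same unit, but always kept
                pos += 1                 # skip the closing ']'
                out.append(inner)
            elif ch in classes:
                out.append(rng.choice(classes[ch]))
            else:
                out.append(ch)           # literal letter
        return "".join(out)

    return parse_seq(None)

print(generate("((s)C(r))V(F)(CV(F))", classes))
```

Because each group is expanded by a recursive call, nesting to any depth falls out for free: the outer group only sees the finished string of the inner one.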
> The only two things I really miss are
> * to have a check so that the same word isn't generated more
> than once.
I think it's not a big problem when the words are longer and the number of permutations is higher. Not all words are equally likely to be generated; it depends on the pattern. Usually the shorter ones are generated more often, which is good, since otherwise almost all words would be very long: long words make up the vast majority of all possible permutations.
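
To put rough numbers on this, here is a small Python calculation for a hypothetical pattern CV(CV(CV)), using the class sizes from the settings quoted in this thread (17 consonants, 5 vowels). It assumes each optional group is kept with probability 1/2, which is only a guess about how a generator might behave, and it counts combinations of class choices rather than checking whether two combinations happen to spell the same string.

```python
from fractions import Fraction

n_c, n_v = 17, 5        # sizes of the C and V classes in the quoted settings
syllable = n_c * n_v    # number of distinct C+V choices

# shape -> (number of combinations, probability of drawing that shape),
# assuming every optional group is kept with probability 1/2
shapes = {
    "CV":     (syllable,      Fraction(1, 2)),
    "CVCV":   (syllable ** 2, Fraction(1, 4)),
    "CVCVCV": (syllable ** 3, Fraction(1, 4)),
}

total = sum(count for count, _ in shapes.values())

for shape, (count, prob) in shapes.items():
    share = Fraction(count, total)
    print(f"{shape}: {count} combinations, "
          f"{float(share):.4%} of all combinations, "
          f"drawn {float(prob):.0%} of the time")
```

Under these assumptions the one-syllable words are a tiny fraction of everything the pattern can produce, yet they come up half of the time, while picking uniformly from all combinations would give three-syllable words almost exclusively.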
> * to have a list of forbidden combinations.
> E.g. my conlang Kijeb doesn't allow any of the combinations
> in the list
> yi/iy/wu/uw/nm/ñn/ñm/ñy/nn/ss/tt/td/rr/yy/ww/yr/kwr/gwr
> which may arise from the simple settings
> V:a/i/u/ya/yu
> C:p/t/k/kw/b/d/g/gw/f/s/x/m/n/ñ/r/w/y
> F:s/t/n/r
> r:((s)C(r))V(F)(CV(F))
> n:100
> nle
> Rather than making a complicated pattern to avoid such a
> short list of exceptions it would be better to simply list
> them and test the output against the list as a regex
> alternation. In perl simply add a test
> unless( $word =~
> m{(yi|iy|wu|uw|nm|ñn|ñm|ñy|nn|ss|tt|td|rr|yy|ww|yr|kwr|gwr)})
> around the line putting the word into the hash.
Checking the whole word for forbidden combinations and regenerating it if it contains any is a good idea. I've considered making the program recognize some kind of anti-pattern, but that would perhaps require a completely new algorithm for checking words against patterns, which seems more difficult than generating them. A simple list is easy and perhaps more practical, thanks!
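
A minimal sketch of the check-and-regenerate idea, in Python rather than Perl, which also covers the duplicate check asked for earlier. The names `collect_words` and `make_word` are hypothetical: `make_word` stands for whatever function produces one candidate word, and `max_tries` is an assumed safety limit so the loop ends even if the pattern cannot yield enough acceptable words.

```python
import re

# The forbidden combinations from the Kijeb example quoted above.
FORBIDDEN = "yi/iy/wu/uw/nm/ñn/ñm/ñy/nn/ss/tt/td/rr/yy/ww/yr/kwr/gwr".split("/")

# One regex alternation; re.escape keeps it safe if an entry ever
# contains a character that is special in regular expressions.
forbidden_re = re.compile("|".join(re.escape(f) for f in FORBIDDEN))

def collect_words(make_word, count, max_tries=100_000):
    """Gather up to `count` distinct words with no forbidden combination."""
    seen = set()    # fast duplicate lookup
    words = []      # accepted words, in generation order
    for _ in range(max_tries):
        if len(words) == count:
            break
        word = make_word()
        if word in seen or forbidden_re.search(word):
            continue    # reject and simply generate another candidate
        seen.add(word)
        words.append(word)
    return words
```

Rejected words are just thrown away and a new one is drawn, so the pattern itself can stay as simple as in the quoted settings.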

Replies

taliesin the storyteller <taliesin-conlang@...>
Benct Philip Jonsson <conlang@...>