Avoiding near-collisions in vocabulary coinage

From: Jim Henry <jimhenry1973@...>
Date: Monday, August 4, 2008, 16:21
How do y'all avoid coining new words in your conlangs that are
too similar to existing ones?  In a naturalistic artlang, whether a
posteriori or a priori but derived diachronically from an earlier a priori
artlang, you'll have homonyms and near-homonyms result from sound change
mergers, of course; it's expected and desired.  But for an engelang
or a more abstract, less naturalistic artlang, what methods have you
found useful to avoid creating new words that sound too similar to existing
ones?

I've used three methods for various conlangs:

1. With most of my conlangs, I simply search the lexicon for
various substrings of a potential new word before
deciding definitely on its form.

2. With säb zjeda, I generated a large list of potential word forms
with a Perl script that produces phonologically redundant forms
(no two are minimal pairs), and then partly automatically, partly
manually assigned meanings to those forms.

3. With gzb, I've been using (for the last couple of years)
a Perl script that takes a potential new wordform, converts
it into a regex to search for similar words, and then searches
the lexicon with said regex to display potential conflicts.
I can then decide if those potential conflicts are really
similar enough to worry about (which depends partly
on the phonological similarity, partly on their semantics,
i.e. whether they're apt to occur in the same context).
I also use a couple of scripts that tell me the most and
least common onsets and rimes in the lexicon, to give
me ideas for potential new wordforms that are
unlikely to conflict with existing ones.

That script is in
if anyone wants to adapt it to their own conlang's
orthography and their lexicon's format.
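
For anyone who wants to experiment without that script, here is a rough
sketch of the general idea behind methods 2 and 3. It is written in Python
rather than Perl and is not the script described above. The function names
are made up for illustration, and it assumes a deliberately crude notion of
"similar": two forms conflict if they are identical or differ by one
substituted, deleted, or inserted character. It also assumes one character
per segment, so an orthography with digraphs would need tokenizing first.

```python
import re


def near_regex(word):
    """Compile a regex matching any form within one edit of `word`.

    One edit = one character substituted, deleted, or inserted.
    Assumption: one character is one segment.
    """
    alternatives = {re.escape(word)}
    for i in range(len(word)):
        head, tail = re.escape(word[:i]), re.escape(word[i + 1:])
        alternatives.add(head + "." + tail)  # substitution at position i
        alternatives.add(head + tail)        # deletion of position i
    for i in range(len(word) + 1):
        # insertion of any one character before position i
        alternatives.add(re.escape(word[:i]) + "." + re.escape(word[i:]))
    return re.compile("(?:" + "|".join(sorted(alternatives)) + ")")


def find_conflicts(word, lexicon):
    """Return the lexicon entries that the near-match regex accepts."""
    pattern = near_regex(word)
    return [entry for entry in lexicon if pattern.fullmatch(entry)]


def filter_no_near_pairs(candidates):
    """Greedily keep candidates so that no two kept forms are near-matches."""
    kept = []
    for form in candidates:
        if not find_conflicts(form, kept):
            kept.append(form)
    return kept


if __name__ == "__main__":
    lexicon = ["kato", "kata", "kat", "katos", "miru", "kota", "tako"]
    print(find_conflicts("kato", lexicon))
    print(filter_no_near_pairs(["pat", "bat", "pit", "bid", "pa"]))
```

The regex only flags candidates. Whether a flagged pair really matters
still depends on semantics and context, so that judgment stays manual. A
finer-grained version might allow only substitutions between similar
phonemes (say, voiced and voiceless stops) instead of the catch-all ".".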

Have any of you other ideas for how to manage this?

Jim Henry
Conlang fluency survey -- there's still time to participate before
I analyze the results and write the article

