Re: A Self-segregating morphology (was: Guinea pigs invited)
From: Gary Shannon <fiziwig@...>
Date: Friday, December 16, 2005, 18:23
--- Jim Henry <jimhenry1973@...> wrote:
<snip>
> This is similar to Jeff Prothero's "Plan B"
> and the projects discussed by Ray Brown
> and Jörg Rhiemeier in the thread "brz, or Plan B
> revisited"
> back in September -- except there, it was
> the initial consonant that determines the
> number of phonemes the listener is
> to expect in the word.
I missed that. I was on no-mail status in August and
September.
<snip>
> > Vowels are irrelevant to the identity of the word,
<snip>
> Hm... So if you have a word "kanu" you
> could not also have a word "kani"?
Given 20 consonants (using "C" for "CH" and "X" for
"SH", and discarding "Q") and vowels treated as
irrelevant, there are 8,000 3-consonant words and
160,000 4-consonant words. Longer than that and the
words number in the millions and billions. That should
be adequate so that if, for example, "kanu" were used
it would not be necessary (or even desirable) to have
a word as similar as "kani".
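As a rough check of the counts above, here is a minimal Python sketch. It assumes the 20 consonants are the 21 Latin consonant letters minus "Q" (the post does not list them), and the names `CONSONANTS`, `skeleton`, and `count_skeletons` are made up for illustration.

```python
# Assumed inventory: the 21 Latin consonant letters with Q discarded.
# "C" stands for "CH" and "X" for "SH", as described in the post.
CONSONANTS = "BCDFGHJKLMNPRSTVWXYZ"


def skeleton(word):
    """Return the consonant skeleton of a word, ignoring its vowels."""
    return "".join(ch for ch in word.upper() if ch in CONSONANTS)


def count_skeletons(length):
    """Number of distinct consonant skeletons of the given length."""
    return len(CONSONANTS) ** length


print(count_skeletons(3))  # 3-consonant words
print(count_skeletons(4))  # 4-consonant words
print(skeleton("kanu") == skeleton("kani"))  # both reduce to "KN"
```

Under this scheme "kanu" and "kani" share the skeleton KN, so they would count as the same word.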
<snip>
> Alternatively, you might (allowing free choice of
> vowels
> in morphemes after the first syllable) set a
> _default_ vowel to follow each consonant, which
> would save space while writing but still give you
> a lot more potential morphemes at any
> given length than your original scheme.
That could be done, or some decoration placed above
the consonant to indicate a different vowel (as in
Devanagari, for example). But with so many possible
words to choose from that might only be necessary in
transcribing proper names or foreign words. Another
mark, perhaps under the consonant, might indicate
"-l", "-n", "-r", or "-s" suffixed to the vowel so
that PKT would be "pekato", but with a certain
decoration under the "K" might become "pekanto", or
with a different decoration, "pekalto", "pekarto", or
"pekasto".
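The marking scheme can be sketched in Python as follows. This is only an illustration: the default vowels for P, K, and T are read off the "pekato" example, the defaults for the other consonants are not given in the post, and the tuple layout and the names `DEFAULT_VOWEL`, `CODAS`, and `render` are hypothetical.

```python
# Default vowels inferred from the "pekato" example only; the post does
# not say what the other consonants would default to.
DEFAULT_VOWEL = {"P": "e", "K": "a", "T": "o"}

# Marks under the consonant add one of these sounds after the vowel.
CODAS = {"l", "n", "r", "s"}


def render(glyphs):
    """Spell out a word given (consonant, vowel_mark, under_mark) triples.

    vowel_mark is None when the consonant keeps its default vowel;
    under_mark is None when nothing is written below the consonant.
    """
    parts = []
    for consonant, vowel_mark, under_mark in glyphs:
        vowel = DEFAULT_VOWEL[consonant] if vowel_mark is None else vowel_mark
        if under_mark is not None and under_mark not in CODAS:
            raise ValueError("unknown under-mark: " + under_mark)
        parts.append(consonant.lower() + vowel + (under_mark or ""))
    return "".join(parts)


plain = [("P", None, None), ("K", None, None), ("T", None, None)]
marked = [("P", None, None), ("K", None, "n"), ("T", None, None)]
print(render(plain))   # pekato
print(render(marked))  # pekanto
```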
That would provide for a huge increase in number of
possible words. The inventory of symbols would consist
of 20 consonant symbols, 4 vowel marks, any one of
which could optionally be placed above any consonant
and 4 modifier marks that could be placed below. That
would make for 25 written forms of each consonant,
generating 500 single syllables, 250,000 words of 2
syllables, 125 million words of 3 syllables, and
billions and billions of words 4 syllables long.
That's probably WAY overkill!
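The arithmetic behind those figures, as a small Python sketch. The variable names are made up for illustration; the only assumption is that "no mark" counts as one extra choice both above and below the consonant, which is what gives 5 x 5 = 25 forms.

```python
CONSONANT_SYMBOLS = 20
VOWEL_MARKS = 4      # optional mark above the consonant
MODIFIER_MARKS = 4   # optional mark below: -l, -n, -r, -s

# "No mark" is one more option in each position.
forms_per_consonant = (VOWEL_MARKS + 1) * (MODIFIER_MARKS + 1)
syllables = CONSONANT_SYMBOLS * forms_per_consonant


def word_count(n_syllables):
    """Number of distinct written words with the given syllable count."""
    return syllables ** n_syllables


print(forms_per_consonant)  # 25
print(syllables)            # 500
for n in (2, 3, 4):
    print(n, word_count(n))  # 250000, 125000000, 62500000000
```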
(That system could almost be a syllabary for English.
E.g. "BY" with appropriate modifier marks could be
pronounced "BilYon", a good approximation of the
English "billion". Or "HPX" with the right marks would
be "HaP'Xun", not a bad approximation of "option".)
<snip>
> I'm not sure, but I suspect that self-segregation
> at the morpheme level is more important than
> self-segregation at the word level.
That certainly applies to inflected and agglutinating
languages, but I had a "pure" isolating language in
mind when I dreamt this up. In that case each word is
learned as an unvarying unit which has only one form
and does not need to be parsed internally. While
knowing the derivation of a word from its roots might
be an interesting sidebar, it would not hinder
comprehension for a speaker of the language to be
completely unaware of a word's roots. How many
ordinary English speakers would think to parse
"excavate" into its Latin roots? They know the word,
but as a unique unit in its own right, not as the
compound it actually is.
In practice I suspect word boundaries are seldom a
problem, and so the self-segregating feature of the
system is probably superfluous, but it's an
interesting theoretical consideration.
--gary
<snip>