Re: Self-segregating Semitic Morphology
From: Logan Kearsley <chronosurfer@...>
Date: Monday, September 8, 2008, 16:13
Having slept for a while, I think I've got some good answers to my own
question. Sleeping is good for solving a lot of problems.
It's still good to have alternate suggestions, though; the stuff my
subconscious came up with definitely doesn't constitute the only
option.
On Mon, Sep 8, 2008 at 9:01 AM, Jim Henry <jimhenry1973@...> wrote:
> On Mon, Sep 8, 2008 at 12:53 AM, Logan Kearsley <chronosurfer@...> wrote:
>> Thought 1- building vocabulary based on consonantal roots allows for a
>> large and powerful derivational system without having to resort to
>> long strings of agglutinating affixes.
>> Thought 2- self-segregating morphology is kinda cool.
>>
>> It would be neat if these two ideas could be combined. Unfortunately,
>
> I see a couple of obvious ways to do it.
>
> 1. Use one set of consonants for your triliteral roots, and another
> set of consonants that can occur in suffixes. There are no prefixes.
> Any time you encounter a consonant from the set of root consonants
> following one or more suffix consonants and vowels, you've found the
> beginning of a word, and any time you find a consonant from the set
> of suffix consonants, you've found the beginning of a suffix. If you find
> a bunch of root consonants in a row separated only by vowels,
> then the first, fourth, seventh etc. indicate the start of a new word.
>
> 2. Or allow prefixes too, and have a third set of consonants used
> only in prefixes.
If the roots in the class that can have vowel-pattern derivations
applied are all the same length (which is a basic assumption for how
this sort of thing works), I don't think you need three sets of
consonants. Say that Set 1 is for prefixes, and Set 2 is for roots and
suffixes. Then, a transition from S1 to S2 marks the beginning of a
root, which continues for a known number of consonants, and everything
after that until the next word must be suffixes. A transition from S2
to S1 marks the boundaries between words, but that could fail if there
are no prefixes on the next word; that could be solved by mixing the
consonant types in roots- requiring that the first one be from S1 and
the later ones from S2.
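A rough sketch of how that two-set scheme could be segmented mechanically. Everything concrete here is an assumption made for illustration: the inventories `S1`, `S2`, and `VOWELS` are made up, and so is the rule that a word starts on a consonant (so any vowel after the last S2 consonant stays with the word before the boundary).

```python
VOWELS = set("aiu")  # hypothetical vowel inventory
S1 = set("ptk")      # hypothetical: prefix consonants and the root-initial consonant
S2 = set("mnl")      # hypothetical: later root consonants and suffix consonants

def split_words(stream):
    """Split a stream at every S2 -> S1 consonant transition.

    Assumes each word has the consonant skeleton
    (S1 prefixes)* S1 S2 S2 (S2 suffixes)* and begins with a consonant.
    """
    words, start, prev = [], 0, None
    for i, ch in enumerate(stream):
        if ch in VOWELS:
            continue  # vowels never decide a boundary in this scheme
        if ch not in S1 and ch not in S2:
            raise ValueError(f"unknown segment: {ch!r}")
        cls = 1 if ch in S1 else 2
        if prev == 2 and cls == 1:
            # An S1 consonant right after an S2 consonant opens a new word.
            words.append(stream[start:i])
            start = i
        prev = cls
    words.append(stream[start:])
    return words

print(split_words("kamilupatunami"))
print(split_words("kamilutunami"))
```

The second call is the case with no prefix on the following word: the boundary is still found because the root itself opens with an S1 consonant. Within a word, the root would start at the last S1 consonant before the first S2 consonant.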
> That limits your options on root vowel patterns; you couldn't have
> vowels before the first root consonant, unless they were part
> of a prefix. You could have e.g.
>
> CaCiCu
> CCuCa
> CiaCCi
> CaCauC
>
> etc., with suffixes of form CV+, but not root patterns like
>
> aCCiC
> uCiCaC
>
> etc.
Mm... I'm not seeing why.
It definitely *does* limit the number of consonantal roots, which
could be annoying depending on how many consonants you have to
work with; but that's a different concern from limiting the
derivation patterns available.
> Suppose you have 20 consonants and 5 vowels, with
> 15 consonants allowed in roots and the other 5 in suffixes;
> that gives you 3375 possible roots and 150 CV and CVV
> suffixes. Not sure offhand how to calculate the possible
> root vowel patterns, but there should be scads of them too.
Number of patterns = (n+1)^(m+1)-1, where n is the number of vowels
available for use in derivation patterns, m is the number of
consonants in a root, all consonant clusters are allowed, and we
assume that there are no strings of multiple vowels and that there
must be at least one vowel somewhere in a word. There are scads even
for fairly small numbers of vowels.
However, modifications are in order to account for additional
restrictions (like, root words can't start with vowels, or root words
must start with vowels, or 3-consonant clusters aren't allowed, etc.),
and each one of those tends to drastically reduce exactly how many
scads you get. Relaxing the assumption that there are no strings of
vowels actually doesn't matter much, because it's equivalent to just
increasing the vowel inventory.
Assume a 4-vowel system with triliteral roots, for illustrative purposes.
(n+1)^(m+1)-1 = 5^4-1 = 624 derivational patterns, more than Arabic, I think.
If you require that root words don't start with vowels, then it becomes:
(n+1)^m-1 = 5^3-1 = 124 derivational patterns, a lot fewer than Arabic.
If 3-consonant clusters aren't allowed:
n*(n+2)*(n+1)^2 = 4*6*5^2 = 600, which is pretty good (I think; I'm not
absolutely sure that I derived that last expression correctly).
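The counts above can be checked by brute force. This is only a sketch under the same assumptions as the formulas: each of the m+1 slots around the root consonants holds either nothing or exactly one vowel, and at least one slot is filled. The function name and its parameters are made up for the illustration.

```python
from itertools import product

def count_patterns(n, m=3, allow_initial_vowel=True, max_cluster=None):
    """Count vowel patterns around an m-consonant root by enumeration.

    Each of the m+1 slots (before, between, and after the consonants)
    holds either no vowel (None) or one of n vowels. max_cluster, if
    given, caps the length of any run of adjacent consonants.
    """
    options = [None] + list(range(n))
    total = 0
    for slots in product(options, repeat=m + 1):
        if all(s is None for s in slots):
            continue  # a word needs at least one vowel
        if not allow_initial_vowel and slots[0] is not None:
            continue
        if max_cluster is not None:
            # Two consonants are adjacent when the slot between them is empty.
            run, longest = 1, 1
            for s in slots[1:m]:
                run = run + 1 if s is None else 1
                longest = max(longest, run)
            if longest > max_cluster:
                continue
        total += 1
    return total

n = 4
print(count_patterns(n), (n + 1) ** 4 - 1)
print(count_patterns(n, allow_initial_vowel=False), (n + 1) ** 3 - 1)
print(count_patterns(n, max_cluster=2), n * (n + 2) * (n + 1) ** 2)
```

Each line prints the enumerated count next to the closed-form value; for n = 4 the pairs agree at 624, 124, and 600, so the last expression does hold up.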
On Mon, Sep 8, 2008 at 9:53 AM, R A Brown <ray@...> wrote:
[...]
> ...and a third method might be along the lines John Cowan outlined for
> xuxuxi:
>
> {quote}
> xuxuxi uses vowel harmony/disharmony to resolve the problem.
> All multi-syllable words are stressed on the first syllable,
> and then the other syllables of the word, except the last,
> have vowel harmony. The last syllable of the word has disharmony.
> Any remaining syllables before the next stressed syllable are
> monosyllabic.
That's the sort of thing that I would count as drastically reducing
the number of patterns available, because it restricts the vowel
inventory available for use in any particular word.
> Here's the harmony/disharmony table:
>
> first   medial    last
> a       a, e, o   i, u
> e       a, e, i   o, u
> i       a, e, i   o, u
> o       a, o, u   i, e
> u       a, o, u   i, e
Time for more math to see if this is really as restricting as I think it is.
You've got 5 initial vowels, 3 internal vowels, and 2 final vowels for
any case. If there's only 1 vowel it must be initial, if there're 2
vowels they must be an initial and a final, if there are three or four
vowels, you get all three choices. Vowels can appear in four positions
around a triliteral root.
1 vowel:  VCCC CVCC CCVC CCCV                  4*5     =  20
2 vowels: VCVCC VCCVC VCCCV CVCVC CVCCV CCVCV  6*5*2   =  60
3 vowels: VCVCVC VCCVCV VCVCCV CVCVCV          4*5*3*2 = 120
4 vowels: VCVCVCV                              5*3*3*2 =  90
Total: 20 + 60 + 120 + 90 = 290
Barely more than the 255 you get with 3 unrestricted vowels, and less
than half of the 624 you get with 4, let alone the 1295 you get with 5.
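The same total falls out of enumerating the patterns directly from the quoted table. This sketch assumes the reading used above: the first vowel is free, the last vowel (when there are two or more) comes from the "last" column for that first vowel, and anything in between comes from the "medial" column. The names `TABLE` and `harmonic_patterns` are illustrative.

```python
from itertools import combinations, product

# Harmony/disharmony table as quoted above: first vowel -> (medial, last).
TABLE = {
    "a": ("aeo", "iu"),
    "e": ("aei", "ou"),
    "i": ("aei", "ou"),
    "o": ("aou", "ie"),
    "u": ("aou", "ie"),
}

def harmonic_patterns(root="CCC"):
    """Enumerate every vowel pattern around the root that the table allows."""
    slots = len(root) + 1
    found = set()
    for k in range(1, slots + 1):
        for positions in combinations(range(slots), k):
            for first, (medials, lasts) in TABLE.items():
                if k == 1:
                    fillings = [(first,)]
                else:
                    fillings = [
                        (first, *mid, last)
                        for mid in product(medials, repeat=k - 2)
                        for last in lasts
                    ]
                for vowels in fillings:
                    filled = dict(zip(positions, vowels))
                    # Slot i sits before consonant i; the final slot trails the root.
                    word = "".join(
                        filled.get(i, "") + (root[i] if i < len(root) else "")
                        for i in range(slots)
                    )
                    found.add(word)
    return found

print(len(harmonic_patterns()))
```

For a triliteral root this prints 290, matching the hand count.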
On Mon, Sep 8, 2008 at 9:57 AM, Lars Mathiesen <thorinn@...> wrote:
[...]
> Looking up self-segregating morphology on Conlang Wikia, it looks like
> the accepted definition is that morpheme or word boundaries should be
> immediately obvious without full knowledge of the lexicon.
>
> In a semitic style language, the morphemes of a word aren't combined
> in sequence, so they don't have boundaries as such. You may be
> thinking that you need to make self-segregating _syllables_, but I
> don't think that serves any purpose in this context. You will probably
> have to come up with another definition of self-segregating to be able
> to play.
I was mainly thinking of self-segregating words. Yes, obvious internal
segregation can't easily be applied to derivation patterns that may
contain multiple morphemes, so I wasn't worrying about that. It would
be nice, in addition to segregating words, if you could segregate
affix morphemes from the roots as well, but that's a secondary
consideration for me.
> And if you want self-segregating phonological words, you're actually
> better off than with sequential morphemes. Self-segregation (or
> self-synchronization in general coding theory) needs redundancy, which
> is the same as 'populating the pattern space sparsely'. And since your
> words are known to be built on tri-consonantal roots, marking your
> word boundaries will only need about the same redundancy per word that
> syllable-based schemes do per syllable, so your pattern space can be
> more densely populated.
That's a good point. In the limit where there are no affixes and *all*
words in the language have the same root structure, segregation is
trivially easy; every set of three consonants is a new word, and you
just need some way of occasionally disambiguating whether a stray
vowel goes with the last word or the next (although that system is
fragile; it depends on knowing exactly where the speech-stream starts;
if you come in in the middle, you'll be out of synch, and we'd like to
have some way of fixing that). However, that's a really ridiculous
limit. We'd probably like to have the occasional grammatical particle
or anaphor or something that has its own form distinct from the root
derivation system.
One option became Exceedingly Obvious just after I woke up this
morning- mark word boundaries with successive vowels. If you've
already got the assumption that derivation patterns only use solitary
vowels, then it's natural to think that two vowels in a row must
belong to separate roots.
This requires the restriction that every word must begin and end with
a vowel, which means that the only single-syllable words will be
single vowels, and the number of possible derivations is restricted
to:
n^2*(n+1)^(m-1), assuming that all possible consonant clusters are
allowed so you don't need any internal vowels.
With 4 vowels and triliteral roots, that results in 400 derivation
patterns. Not bad. If we require at least 1 internal vowel, then we
get:
n^3*(n+2), using triliteral roots, which comes out to 384. Still not
bad. Using 5 vowels gets us up to 875 (out of a total possible of
1295... which might be overkill), which is quite good.
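A brute-force check of those figures, as a sketch under the stated assumptions (one optional vowel per slot, all consonant clusters allowed). The function name and the `require_internal` flag are made up for the illustration.

```python
from itertools import product

def count_vowel_bounded(n, m=3, require_internal=False):
    """Count patterns in which an m-consonant word begins and ends with a vowel."""
    options = [None] + list(range(n))
    total = 0
    for slots in product(options, repeat=m + 1):
        if slots[0] is None or slots[-1] is None:
            continue  # both edge slots must hold a vowel
        if require_internal and all(s is None for s in slots[1:-1]):
            continue  # at least one vowel between the consonants
        total += 1
    return total

print(count_vowel_bounded(4))                         # n^2*(n+1)^(m-1)
print(count_vowel_bounded(4, require_internal=True))  # n^3*(n+2)
print(count_vowel_bounded(5, require_internal=True))
```

The three calls give 400, 384, and 875, in line with the formulas.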
I'm not sure how I like the aesthetic of every word beginning and
ending with a vowel, but it does work nicely. And it allows for the
use of shorter roots mixed in to the language as well.
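A minimal sketch of the segmentation rule itself, assuming every word begins and ends with a vowel and never contains two vowels in a row. The vowel inventory and the sample words are invented for the example.

```python
VOWELS = set("aeiu")  # hypothetical 4-vowel inventory

def split_on_vowel_pairs(stream):
    """Cut the stream between every pair of adjacent vowels."""
    words, start = [], 0
    for i in range(1, len(stream)):
        if stream[i - 1] in VOWELS and stream[i] in VOWELS:
            words.append(stream[start:i])
            start = i
    words.append(stream[start:])
    return words

# Three triliteral words plus one shorter word ("anu") run together.
print(split_on_vowel_pairs("aktabuiksuleanuemlati"))
# Joining the stream partway through a word still recovers later boundaries.
print(split_on_vowel_pairs("buiksuleanu"))
```

The first call yields aktabu, iksule, anu, emlati. The second shows that a listener who comes in mid-word loses only the fragment before the first vowel pair, so this scheme does not depend on knowing where the speech stream starts.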
This restricts the form of prefixes to VC{C}, and suffixes to {C}CV
(although, if the clusters are allowed, you could have
single-consonant infixes which hijack the already-present initial and
terminal vowels as well). And it very nearly requires that you only
use one or the other, but that is fixable by designating a consonant
(or class of consonants) to mark the boundaries of an affix list (as
discussed above); pick the clusters right, and that doesn't even
require adding an extra syllable.
-l.