Re: Self-segregating morphology again - in simpler terms, with list of methods
From: And Rosta <and.rosta@...>
Date: Friday, April 21, 2006, 0:26
Jim Henry, On 17/04/2006 19:53:
> On 4/17/06, And Rosta <and.rosta@...> wrote:
>> My conlang, Livagian, has unambiguous syntax parsed
>> incrementally with no lookahead, and it cuts the
>> Gordian knot of self-segregating morphology by extending
>> the input to the syntactic parser to the level of the
>> syllable (or potentially the segment). As each syllable
>> is read in, the syllable is looked up in the lexicogrammar.
>> The lexicon necessarily contains instructions for how to
>> deal with every string of syllables.
>> The upshot is that a sentence can't necessarily be parsed
>> into words or morphemes on the basis of its phonological
>> form alone, but a sentence can be fully parsed on the
>> basis of its phonological form and the lexicogrammar,
>> without there being a need for self-segregating morphology
>> or for the complexities or constraints on morpheme shapes
>> that self-segregation schemes impose.
> Let me make sure I understand. You don't constrain
> the shapes of the individual morphemes so that they're
> inherently self-segregating -- but it seems you must
> constrain them relative to each other so that no morpheme
> looks like a prefix or suffix of another morpheme.
> So e.g., if there were a monosyllabic word "vek",
> there couldn't be any two or more syllable word that
> starts with "vek", but there's no reason /v/ or /e/ shouldn't
> occur in the middle or end of any word.
This is quite correct. But the rules can be sensitive to
grammatical environment. So in one part of a sentence /vek/
might parse as a whole word, and in another part it might parse
as the initial portion of a word, and in yet another part it
might parse as /ve/+/k/. To give a concrete example, in
Livagian any word that can begin a number expression kicks
into local operation a completely new lexicon (one that
contains only entries corresponding to mathematical
entities); in this lexicon, you get single-segment words.
In principle, there are no phonological generalizations to
be made about word/morpheme boundary placement. (In practice,
Livagian does in fact use phonological (tonal) criteria to
guide the parser in boundary insertion, but this is simply
for the sake of convenience/simplicity/regularity, and is
not a necessary element of the scheme.)
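
To make the scheme concrete, here is a minimal Python sketch of lexicon-driven boundary placement with no lookahead. Everything in it is invented for illustration and none of it is actual Livagian: the lexicon names ("main", "alt", "num"), the entries, and the convention that each word names the lexicon to use for whatever follows it. It reads one segment at a time, and within each lexicon no word is an initial portion of another word, so a boundary can be placed the moment the buffer matches an entry.

```python
# Hypothetical toy lexicons (not Livagian). Each lexicon maps a word to the
# name of the lexicon that governs what follows that word. Within any one
# lexicon, no word is a proper prefix of another word.
LEXICONS = {
    "main": {"vek": "main", "sa": "alt", "nu": "num"},
    "alt": {"veku": "main", "mi": "main"},
    # Assumed number lexicon with single-segment words; "o" is an invented
    # word that returns the parser to the main lexicon.
    "num": {"ve": "num", "k": "num", "o": "main"},
}


def segment(text, mode="main"):
    """Split text into words, reading one segment (character) at a time."""
    words = []
    buffer = ""
    for seg in text:
        buffer += seg
        lexicon = LEXICONS[mode]
        if buffer in lexicon:
            # Complete word in the current environment: place a boundary
            # and switch to whichever lexicon this word selects.
            words.append(buffer)
            mode = lexicon[buffer]
            buffer = ""
        elif not any(word.startswith(buffer) for word in lexicon):
            # No entry tells us how to go on: plain scheme has no fallback.
            raise ValueError(f"no entry for {buffer!r} in lexicon {mode!r}")
    if buffer:
        raise ValueError(f"input ended inside a word: {buffer!r}")
    return words


print(segment("veksavekunuvekovek"))
```

In this toy input the string "vek" turns up in three environments: as a whole word under "main", as the initial portion of "veku" under "alt", and as "ve" + "k" under "num". Nothing about its phonological shape decides between them; only the lexicon currently in force does.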
> I'm thinking that I might probably impose a constraint
> like this on my next conlang over and above a
> self-segregation rule -- or perhaps instead of such.
> Self-segregating morphology is probably of benefit
> primarily to beginning learners of a language, whose
> vocabulary is still small.
IMO, the only respect in which self-segregation is useful
is that it is a necessary condition for unambiguity. I
don't believe it could have a positive impact on ease of use.
> But your rule would be
> helpful to more experienced speakers when
> looking up the occasional unknown word. They
> would not encounter words that look like they might
> be a compound of one word they already know
> and another word they don't know, which turn out
> to be irreducible roots instead (this occasionally
> happens to me in Esperanto).
The basic scheme I outlined -- in which nothing can be
predicted from phonological shape and boundary placement
is completely reliant on the lexicon -- can't deal with
unknown words. (For proper names, you could have a
proper name marker that kicks into local operation a set
of rules guided by phonological shape.) It might
therefore be prudent to have a default fallback rule,
such as "If the current word has reached three syllables
and there is no instruction in the lexical entry for
this three-syllable string to parse in a further
syllable, then assume that the three syllables constitute
a complete but unknown word". Of course, you'd also need
default rules about how to handle unknown words
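
A fallback of that kind might look like the following Python sketch. It is only one reading of the rule, under stated assumptions: the input is already divided into syllables, the lexicon is a toy set of invented words (not Livagian), the threshold of three syllables is taken from the example rule above, and leftover syllables at the end of the input are simply reported as an unknown word.

```python
# Hypothetical toy lexicon of known words, each a tuple of syllables.
# No known word is an initial portion of another known word.
KNOWN = {("ta",), ("ve", "ka"), ("lo", "mi", "na", "su")}

# Assumed threshold, taken from the example rule: three syllables.
MAX_UNKNOWN = 3


def segment_with_fallback(syllables):
    """Return (word, is_known) pairs, reading one syllable at a time."""
    out = []
    buf = []
    for syl in syllables:
        buf.append(syl)
        key = tuple(buf)
        if key in KNOWN:
            # Known word: the lexicon places the boundary.
            out.append(("".join(buf), True))
            buf = []
        elif any(word[:len(key)] == key for word in KNOWN):
            # The lexicon says a further syllable may be parsed in.
            continue
        elif len(buf) >= MAX_UNKNOWN:
            # Three or more syllables and no instruction to continue:
            # assume a complete but unknown word.
            out.append(("".join(buf), False))
            buf = []
    if buf:
        # Assumption: trailing material is reported as an unknown word.
        out.append(("".join(buf), False))
    return out


print(segment_with_fallback(["ta", "ve", "ka", "zu", "pi", "ro", "ta"]))
```

Note that an unknown word whose first syllables happen to match a known word would still be cut at the known word, so the fallback only rescues strings that the lexicon cannot parse at all.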