THEORY: What's a "word"? What's "Polysynthetic"? Self-Segregating Morphology, and Milewski
From: Eldin Raigmore <eldin_raigmore@...>
Date: Wednesday, May 17, 2006, 20:58
I have read "The Conception of the Word in the Languages of North American
Natives" by Tadeusz Milewski, Lingua Posnaniensis III  pp. 248-267.
[WHAT'S A "WORD"]
As many on-list know, the question "what is a word?" is one which, in some
languages, has no single answer; and it has no universally accepted cross-
linguistically adequate answer.
Milewski takes the point of view that some languages have words and some
don't.
He divides the meaningful elements of a language, by size, into "morphemes"
(sorry, David, that's what the man does), "clauses" (he didn't investigate
anything bigger than a clause in this article), and "syntactic groups", by
which he means polymorphemic parts of clauses.
He says that a language has "words" if it has a class of syntactic groups
whose boundaries are established by exactly the same means regardless of
where they appear in the clause.
Thus he is open to the possibility that this means of establishing word-
boundaries may vary from language to language. Some are established by
pitch-tone; some by volume-stress; some by length; some by other
suprasegmental phonological means; some by other phonological means; some
Among the interesting conclusions he reaches is that Modern French doesn't
have words. According to Milewski, within the predicate or within the VP,
the ancient, inherited system of dividing utterances into words still
applies in MF; but outside the predicate and outside the VP, those methods
are not used to divide the utterance into words. So _spoken_ (oral/aural)
Modern French is not a "word" language.
Does everyone's conlang have "words"?
If not, whose doesn't, and which conlangs?
Does anyone have a conlang that does have "words", but whose "definition"
of what exactly a "word" is, is "weird"?
According to Milewski, there's nearly a segue from agglutinating languages
into polysynthetic ones.
The dividing line, he says, is that, in agglutinating languages, each word
may contain _only_ _one_ lexically "full" morpheme -- the root. That is,
in agglutinating languages, in each word, at most _one_ morpheme
has "content"; the others (if any) are "grammatical", or "light",
or "empty".
By contrast, in polysynthetic languages, a word may contain _any_ _number_
of "content" or "full" morphemes.
As a result, a word may grow to enormous size, and approximate an entire
clause in its meaning.
Whose conlangs are polysynthetic?
And which conlangs?
For those conlangs, what exactly did their conlangers mean
by "polysynthetic"?
Naturally, Milewski mentions several of the methods languages use to decide
which strings of morphemes constitute a "syntactic group".
(Jim Henry, you might be especially interested.)
1. Only one "content" morpheme per word -- the root -- and it must be the
first morpheme in the word. A word begins with a root and continues until
either the end of the clause, or just before the next root, whichever comes
first. I suppose if there are morphemes in a clause before the first root,
they form, together, a grammatical, "light" or "empty" word.
Milewski mentions Turkish as one of these languages.
2. There may be a primary stress on the first or second or third, or last
or next-to-last or third-from-last, syllable of each word. Each particular
language picks a syllable-position, and has this primary stress fall on the
same syllable of each word, unless the word is too short.
2a. In sufficiently long words, some of these languages have a secondary
stress. If the language puts the primary stress on the first or second
or third syllable, it puts the secondary stress on the ultimate or
penultimate or antepenultimate syllable; if it puts the primary stress on
the ultimate or penultimate or antepenultimate syllable, it puts the
secondary stress on the first or second or third syllable. Milewski lists
Zuni as an example of such a language.
3. There may be lexical tone on the last syllable of a word but not on the
4. In Quileute, for instance, there are three kinds of morphemes; those
that can be the first morpheme of a polymorphemic word, but never a non-
first morpheme (and not often be independent words); those that can be
independent words, but never a non-first morpheme (and not often a first
morpheme of a polymorphemic word); and those that can be subsequent
morphemes of polymorphemic words, but never be first morphemes, nor be
independent words.
(BTW another sometimes-difference between languages like Turkish that
Milewski calls "agglutinative" and those languages which Milewski
calls "polysynthetic", some of which follow the Quileute system of word-
boundary-marking, is that the latter may have _two_ semantically-equivalent
lexically-"full" morphemes for a given semantic content -- one a "first"
morpheme, and the other a "subsequent" morpheme -- so that the same idea
can be expressed at any place in the word. [The lack of a
suitable "subsequent" morpheme is often the reason limiting the length of a
word.] In "agglutinative" languages, on the other hand, the root morpheme
has to occupy a single position -- first, in the case of Turkish.)
5. Kwakiutl, e.g., makes use of both the Quileute-like system mentioned in
(4) and of the stress system mentioned in (2).
6. Tonkawa is very weird. In the root -- but only in the root -- the _odd_-
positioned vowels appear in "full grade", while the _even_-positioned
vowels appear in "weak grade". However, to determine parity (oddness-or-
evenness) of the position, one counts from the beginning of the _word_
rather than from the beginning of the root. So if the prefix(es) have, or
add up to, an even number of syllables (for instance zero syllables or two
syllables), the root is pronounced one way; but if it/they have an odd
number of syllables (in sum), the root is pronounced another way. There
may also be suffixes. He gives the example of the root "yamaxo" "paint
someone's face", which appears as "yamx-o?" "he paints his face" and as "ke-
ymax-o?" "he paints my face".
7. Yokuts is very nearly a triconsonantal language. Most roots are
bisyllabic. Most roots are of type CVCVC; several are of type CVCV. Most
suffixes are of type CVC or VC. Several three-letter roots appear to have
been formed by lexicalizing the combination of a two-letter root followed
by a suffix.
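
For anyone who wants to play with these rules mechanically, here is a rough
Python sketch of two of them: the root-initial rule in (1) and the Tonkawa
parity rule in (6). It is only a toy under stated assumptions, not
Milewski's own formalization. The function names and the abstract morpheme
labels are made up for illustration. For Tonkawa I am assuming
that "weak grade" simply means the vowel is dropped, which is what the two
cited forms suggest, and I only model the shape of the root; how the
suffix "-o?" combines with the root's last vowel is not modeled.

```python
# Toy sketches of two word-boundary rules; all names here are hypothetical.

VOWELS = set("aeiou")


def segment_root_initial(morphemes):
    """Rule (1): a word starts at each root and runs up to the next root.

    `morphemes` is a list of (form, is_root) pairs for one clause.
    Any morphemes before the first root are grouped into one "light" word.
    """
    words = []
    current = []
    for form, is_root in morphemes:
        # A root closes whatever word was being built and opens a new one.
        if is_root and current:
            words.append(current)
            current = []
        current.append(form)
    if current:
        words.append(current)
    return words


def tonkawa_root_shape(root_syllables, prefix_syllable_count):
    """Rule (6), toy version: parity is counted from the start of the word.

    Root syllables landing in even positions lose their vowel (this reading
    of "weak grade" is an assumption); odd positions keep it.
    """
    out = []
    for i, syllable in enumerate(root_syllables):
        position = prefix_syllable_count + i + 1  # 1-based position in word
        if position % 2 == 0:
            syllable = "".join(ch for ch in syllable if ch not in VOWELS)
        out.append(syllable)
    return "".join(out)


if __name__ == "__main__":
    clause = [("a", False), ("R1", True), ("x", False),
              ("y", False), ("R2", True), ("z", False)]
    print(segment_root_initial(clause))
    print(tonkawa_root_shape(["ya", "ma", "xo"], 0))  # no prefix
    print(tonkawa_root_shape(["ya", "ma", "xo"], 1))  # after "ke-"
```

With no prefix the root "yamaxo" comes out as "yamxo", and after the
one-syllable prefix "ke-" it comes out as "ymax"; compare the "yamx-"
and "ymax-" in the forms cited in (6).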
Everything "ObConLang" I could say on this topic has been said before; I
think it's pretty obvious how these preceding paragraphs apply to
The same sorts of techniques are used to separate syntactic groups of
morphemes from each other in some of the non-word languages; but, in them,
one technique is used in the VP or predicate; a different technique is used
in some of them before the VP or predicate; and some of them use a
different technique after the VP or predicate. So Milewski doesn't
consider these languages to have words.
For example in Tunica, each group before the "predicate" (I think he might
mean "verb" in this case) ends on a rising tone; each group after
the "predicate" ends on a falling tone; and in the VP itself, the verb can
end on any of four tones depending on its mood. (So "mood" is marked by
a "suprafix" in Tunica, instead of a prefix or suffix or infix.)
Do any conlangers have conlangs whose systems the above two paragraphs
remind them of?
Tell us about them.