USAGE: a possible IE clade (family tree)
From: John Cowan <jcowan@...>
Date: Friday, May 30, 2003, 17:54
I've now read the papers written by the UPenn folks, and I'm much
less skeptical than before. For fun, here is the IE cladogram they've
assembled:
IE
    Anatolian
    Non-Anatolian
        Tocharian
        Non-Tocharian
            Italo-Celtic
            |   Italic
            |   Celtic
            Non-Italo-Celtic
                Greco-Armenian
                |   Greek
                |   Armenian
                Satem Core
                    Indo-Iranian
                    |   Indic
                    |   Iranian
                    European Satem
                        Germanic
                        Balto-Slavic
                            Baltic
                            Slavic
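For anyone who wants to poke at the tree programmatically, here is a
minimal Python sketch of it as nested data. The node names are the ones
in the cladogram; the nesting is my reading of it (each "Non-X" node is
the sister of X), and the helpers `leaves` and `clade` are hypothetical
names made up for this illustration.

```python
# Internal nodes are (name, [children]); leaves are plain strings.
# The nesting is inferred from the node names in the cladogram.
TREE = ("IE", [
    "Anatolian",
    ("Non-Anatolian", [
        "Tocharian",
        ("Non-Tocharian", [
            ("Italo-Celtic", ["Italic", "Celtic"]),
            ("Non-Italo-Celtic", [
                ("Greco-Armenian", ["Greek", "Armenian"]),
                ("Satem Core", [
                    ("Indo-Iranian", ["Indic", "Iranian"]),
                    ("European Satem", [
                        "Germanic",
                        ("Balto-Slavic", ["Baltic", "Slavic"]),
                    ]),
                ]),
            ]),
        ]),
    ]),
])


def leaves(node):
    """Return the leaf names under a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [leaf for child in children for leaf in leaves(child)]


def clade(node, name):
    """Return the leaves of the subtree called `name`, or None if absent."""
    if isinstance(node, str):
        return [node] if node == name else None
    label, children = node
    if label == name:
        return leaves(node)
    for child in children:
        found = clade(child, name)
        if found is not None:
            return found
    return None


print(clade(TREE, "Satem Core"))
# ['Indic', 'Iranian', 'Germanic', 'Baltic', 'Slavic']
```

Asking for "Satem Core" shows Germanic sitting next to Balto-Slavic and
Indo-Iranian.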
As usual, God (or the Devil) is in the details. Of the 229 shared
characters they looked at, 148 were disregarded as being consistent
with every possible tree. Another 17 had to be dropped because they
involved multiple root words in IE differently preserved in the different
branches, or else parallel sound changes or semantic shifts (typically
of a "natural" kind).
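To make the bookkeeping concrete, here is a small Python sketch. It
assumes that "consistent with every possible tree" corresponds to the
usual notion of an uninformative character (fewer than two states are
each shared by two or more languages), and that the two exclusions do
not overlap. The function name `is_informative` and the two codings are
invented for illustration; they are not taken from the papers.

```python
from collections import Counter


def is_informative(states):
    """True if at least two states are each shared by two or more languages."""
    counts = Counter(states.values())
    return sum(1 for n in counts.values() if n >= 2) >= 2


# Hypothetical codings, invented for illustration only.
fits_any_tree = {"Hittite": 1, "Greek": 2, "Latin": 2, "Gothic": 2, "Sanskrit": 3}
constrains_tree = {"Hittite": 1, "Greek": 1, "Latin": 2, "Gothic": 2, "Sanskrit": 3}

print(is_informative(fits_any_tree))    # False
print(is_informative(constrains_tree))  # True

# Characters left to shape the tree, assuming the exclusions are disjoint.
total, fit_every_tree, dropped = 229, 148, 17
remaining = total - fit_every_tree - dropped
print(remaining)  # 64
```

On those assumptions, only 64 of the 229 characters actually constrain
the shape of the tree.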
The Indo-Hittite hypothesis depends on only a single shared character (the
form of the aorist in thematic verbs), and if that character were to be
impugned, the top-level split could leave us with Anatolian-Italo-Celtic
or Anatolian-Italo-Celtic-Tocharian.
Italo-Celtic is supported by three characters, Greco-Armenian by five,
and Indo-Iranian and Balto-Slavic by more.
What is truly weird and unexpected is the presence of Germanic within
the Satem Core, alongside Balto-Slavic and Indo-Iranian. When they ran
the original data through the tree-building algorithm, all possible
trees were inconsistent, in the sense that no tree could confine each
state of every character to a single connected subtree. With Germanic removed,
all inconsistencies disappeared!
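The kind of consistency check involved can be sketched in Python. This
is not the UPenn team's algorithm, just a standard stand-in: Fitch
parsimony counts the minimum number of state changes a character needs
on a given binary tree, and a character with k states fits the tree
without conflict exactly when that minimum is k - 1. The topology is my
reading of the posted tree, and both characters below are invented toy
data, not real codings.

```python
# Posted topology as nested pairs (my reading of the cladogram).
balto_slavic = ("Baltic", "Slavic")
european_satem = ("Germanic", balto_slavic)
satem_core = (("Indic", "Iranian"), european_satem)
non_italo_celtic = (("Greek", "Armenian"), satem_core)
non_tocharian = (("Italic", "Celtic"), non_italo_celtic)
TREE = ("Anatolian", ("Tocharian", non_tocharian))


def leaf_names(node):
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaf_names(child)]


def prune(node, name):
    """Return the tree with leaf `name` removed (None if nothing is left)."""
    if isinstance(node, str):
        return None if node == name else node
    kept = [p for p in (prune(child, name) for child in node) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


def fitch(node, states):
    """Return (candidate state set, minimum number of changes) for a subtree."""
    if isinstance(node, str):
        return {states[node]}, 0
    (lset, lcost), (rset, rcost) = fitch(node[0], states), fitch(node[1], states)
    common = lset & rset
    if common:
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1


def compatible(tree, states):
    """True if every state of the character occupies one connected subtree."""
    used = {states[leaf] for leaf in leaf_names(tree)}
    _, changes = fitch(tree, states)
    return changes == len(used) - 1


def character(ones):
    """Binary toy character: state 1 for the listed languages, 0 elsewhere."""
    return {leaf: int(leaf in ones) for leaf in leaf_names(TREE)}


# Invented characters: one groups Germanic eastward, one westward.
eastern = character({"Indic", "Iranian", "Germanic", "Baltic", "Slavic"})
western = character({"Italic", "Celtic", "Germanic"})

print(compatible(TREE, eastern))                     # True
print(compatible(TREE, western))                     # False
print(compatible(prune(TREE, "Germanic"), western))  # True
```

These two toy characters show all four state combinations (in Germanic,
Indic, Italic, and Greek), so no tree at all can fit both while Germanic
is present; with Germanic pruned, the conflict goes away.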
Of the various ideas they looked at, the best seemed to be that Germanic
split from its neighbors in the East, and then migrated to the West,
where Germanic-speakers came into contact with Italo-Celtic speakers
and *borrowed much of their vocabulary*, while retaining the Satem-type
morphology. The comparative method of course cannot tell the difference
between this sort of borrowing and ordinary sound-change, because the
borrowing extends across almost the entire language. Other explanations did not
admit such a straightforward solution.
The position of Albanian is still open to question. The data are
consistent with its being a sister of Non-Tocharian, Non-Italo-Celtic,
Italo-Celtic, Greco-Armenian, or the Satem Core. The UPenn team used
20th-century data for the language, and Albanian is not only not very
conservative, it has layers and layers of Iranian borrowings.
Disclaimers: As I said before, the choice of characters makes or breaks
such trees, and AFAIK they have not published their list of characters,
so if some of them are factually bogus or turn out to be shared primitive,
these results can be overturned. However, they do state that most characters
are founded on the comparative method, rather than on Greenberg-style
mass comparison, which is very encouraging.
In addition, the algorithm generates trees with no root. The position
of the root was determined by adding assumptions, inherently more shaky,
about the natural directionality of sound changes and morphological shifts.
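What "no root" means can be shown with a short Python sketch: an
unrooted tree is fully described by the splits (two-way divisions of
the languages) that its branches induce, and different choices of root
leave those splits untouched. The five-way tree below collapses each
major group of the cladogram to a single leaf for brevity, and the
alternative rooting and the function names are made up purely for
illustration.

```python
def leaf_set(node):
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree):
    """Non-trivial two-way divisions of the leaves induced by the branches."""
    everything = leaf_set(tree)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        for child in node:
            below = leaf_set(child)
            rest = everything - below
            # Splits that isolate a single leaf occur in every tree; skip them.
            if len(below) > 1 and len(rest) > 1:
                found.add(frozenset([below, rest]))
            walk(child)

    walk(tree)
    return found


# Rooting as posted: Anatolian splits off first.
posted = ("Anatolian", ("Tocharian",
          ("Italo-Celtic", ("Greco-Armenian", "Satem Core"))))

# Same unrooted tree with a hypothetical root placed elsewhere.
rerooted = ((("Anatolian", "Tocharian"), "Italo-Celtic"),
            ("Greco-Armenian", "Satem Core"))

# A genuinely different topology, for contrast.
different = (("Anatolian", "Italo-Celtic"),
             ("Tocharian", ("Greco-Armenian", "Satem Core")))

print(splits(posted) == splits(rerooted))   # True
print(splits(posted) == splits(different))  # False
```

The character data fix the splits; which end counts as "up" has to
come from somewhere else.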