Re: The Language Code
From: Dirk Elzinga <dirk_elzinga@...>
Date: Thursday, May 22, 2003, 20:10
On Thursday, May 22, 2003, at 09:44 AM, H. S. Teoh wrote:
> On Wed, May 21, 2003 at 04:46:53PM -0600, Dirk Elzinga wrote:
>> Hey there.
>>
>> Here is the Language Code. The goal of the code is to provide a quick
>> typological profile of a language.
>
> Nice :-) Let me try this on Ebisedian and see what I get ...
>
> [snip]
>> -=[ The Language Code ]=-
>>
>> T type
>> f fictional
>> l logical
>> x auxiliary
>> p personal
>> n natural
>
> Tf
>
>> P phonology
>> t tonal
>> c contour tones
>> r register
>> # number of tones
>> l level tones
>> ! downstep/downdrift
>> # number of tones
>> p phonemes
>> +/- allophony
>> # consonant phonemes
>> # vowel phonemes
>> s syllable template {c,v}
>
> Ptl2p30,36scv(c)
>
> Hmm, this doesn't seem adequate to express what happens in Ebisedian...
> although there are only two phonemic pitches, they are variously realized
> as low, high, low rising, high falling, or even rising-falling.
That's okay; this attribute/value pair is meant to indicate the number
of distinct tonemes; what happens to them later can be part of the +/-
allophony value for phonemes. BTW, I'm still not satisfied with the
syllable template value. I'd like to be able to indicate whether
complex onsets and codas are allowed and whether vowels can be long or
short (usually a property of syllable structure, or at least tied up
with syllable structure). Simply listing Cs and Vs with parens doesn't
quite do it for me. Ideas?
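To make the limitation concrete, here is a minimal sketch of what the current parenthesis notation can express, assuming that parentheses mark an optional element and are never nested. The function name expand_template is hypothetical and not part of the Language Code. It lists the syllable shapes a template licenses, and it shows that nothing in the output distinguishes a long vowel from a short one.

```python
import re
from itertools import product

def expand_template(template):
    """Expand a template such as 'cv(c)' into the syllable shapes it allows.

    Assumption: parentheses mark optional material and are not nested.
    """
    # Each match is either a parenthesized (optional) group or one required symbol.
    parts = re.findall(r"\(([^()]*)\)|([^()])", template)
    choices = []
    for optional, required in parts:
        if required:
            choices.append([required])
        else:
            choices.append(["", optional])  # group absent, or group present
    shapes = {"".join(combo) for combo in product(*choices)}
    # Shortest shapes first, then alphabetical, for a stable listing.
    return sorted(shapes, key=lambda s: (len(s), s))

print(expand_template("cv(c)"))  # ['cv', 'cvc']
```

A template like (c)v(c) expands the same way to v, cv, vc and cvc, so complex onsets and codas can be spelled out with extra optional Cs, but length would still need some additional symbol.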
Looking at your figure for vowels reminds me of another problem.
There are obviously some vowel features which cross-cut an otherwise
smaller system (I believe I remember you saying that Ebisedian has 9
basic vowels which can then be lengthened, nasalized, etc). It is
probably these features which are important rather than their
aggregations as segments, but I don't know how to work that out without
making the code really cumbersome. If I have remembered the Ebisedian
vowel situation correctly, I would probably be inclined to give the
vowel number as 9. But that's just me; your figure is perfectly
legitimate and gives a better reflection of the surface variety.
>> M morphology
>> a agglutinating (+/-)
>> i isolating (+/-)
>> f inflecting (+/-)
>> h head-marking (+/-)
>> d dependent-marking (+/-)
>> t# number of distinct tenses
>> a# number of distinct aspects
>> m# number of distinct moods
>> t/a# number of distinct tense/aspect combinations (where a
>> meaningful distinction between tense and aspect cannot be
>> made) (also t/m, a/m, etc.)
>> c# number of distinct cases
>
> Ma+i-f++h-d+t0a/m9c9
>
> How about number of noun numbers and genders?
Okay; so we'd add:
g# gender/noun class
n# number distinctions
Or would we want to make explicit the kinds of number distinctions made
-- like this:
n number
s singular
d dual
t trial
p plural
f paucal ("just a Few things")
a distributive ("things All over the place")
g collective ("things Grouped together")
...
That would make English
M ... nsp
and Shoshoni
M ... nsdp
and Tepa
M ... nsfal
etc.
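As a rough sketch of how the explicit version could be read back, assuming the value is the letter n followed by one letter per distinction from the list above: the names NUMBER_VALUES and decode_number are hypothetical, and any letter not in the list is reported as unknown rather than guessed at.

```python
# Letters taken from the proposed list of number distinctions.
NUMBER_VALUES = {
    "s": "singular",
    "d": "dual",
    "t": "trial",
    "p": "plural",
    "f": "paucal",
    "a": "distributive",
    "g": "collective",
}

def decode_number(value):
    """Decode a number value such as 'nsdp' into the distinctions it names."""
    if not value.startswith("n"):
        raise ValueError("expected a value starting with 'n'")
    # Letters outside the list are flagged instead of being interpreted.
    return [NUMBER_VALUES.get(ch, f"unknown ({ch})") for ch in value[1:]]

print(decode_number("nsp"))   # ['singular', 'plural']
print(decode_number("nsdp"))  # ['singular', 'dual', 'plural']
```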
>> S syntax
>> b basic word order {v,s,o} (may substitute dots when the
>> terms s = 'subject' and o = 'object' are not
>> meaningful)
>> arg argument alignment
>> n nominative/accusative
>> e ergative/absolutive
>> a active/stative
>> h hierarchical
>> t topic/focus
>> s split/mixed system
>
> Hmm, how to indicate free word order?
We could change the "." convention to indicate free word order as well.
So Ebisedian would be Sb... Workable? Or is there a better solution?
> Also, I have NO idea how to describe Ebisedian's case system within this
> framework... based on previous discussion, it might make sense to have a
> 'semantic alignment' category?
I confess that I don't usually follow case/argument structure threads.
Do you mean that Ebisedian's alignment system is based purely on
semantic roles? If so, we could add it as "r" for "semantic Role". If
it is a completely different system altogether, how about "o" for
"other"?
>> L lexicon
>> c compounding/incorporation (+/-)
>> d derivation (+/-)
>> # number of words so far
> [snip]
>
> Lc++d++#461
>
> Maybe we need a corpus field as well? (Remember the LeGraTeC ratings?)
I don't. And I just Googled "LeGraTeC" and didn't come up with any
hits. What would a corpus field encode -- the number of texts, or the
total number of words in the corpus?
Thanks for the comments.
Dirk
--
Dirk Elzinga
Dirk_Elzinga@byu.edu
"I believe that phonology is superior to music. It is more variable and
its pecuniary possibilities are far greater." - Erik Satie