Notes on UPSID phoneme inventory

From:Ed Heil <edheil@...>
Date:Saturday, July 3, 1999, 0:14
David Crystal's fun & interesting _Cambridge Encyclopedia of Language_
has a section which might be of interest to anyone who wonders how
"typical" language phonologies look.  For those of us without a
linguistics degree and a wide exposure to the different permutations
and combinations of sounds in languages, this kind of question comes

This is a transcription of some notes I took on Crystal's report on
the UPSID language survey, a database of phoneme inventories from a
sample of 317 languages from a wide variety of language families and

UPSID language survey notes, shamelessly borrowed from David Crystal's
_Cambridge Encyclopedia of Language_.

Mura (?) and Rotokas had the fewest segments: 11.
!Xu~ had the most: 141.
70% of languages have 20-37 segments.

Consonants and Vowels:
UPSID languages had from 6 to 95 consonants, averaging 22.8.
They had from 3 to 46 vowels, averaging 8.7.
Vowel to consonant ratio ranged from 0.065 to 1.308, averaging about

/p/ implies /k/ implies /t/.
/g/ implies /d/ implies /b/.
/m/ implies /n/.

A nasal at an articulatory location implies a stop at the same location.
A voiceless labial or approximant implies its voiced counterpart.
Mid vowels imply the existence of high and low vowels.
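
For anyone who wants to run an invented inventory past the three segment
chains above, here is a rough Python sketch. The function name and the way
the rules are encoded are my own assumptions, not anything from Crystal or
UPSID, and it leaves out the class-level rules (nasals, approximants, mid
vowels) entirely.

```python
# Implicational chains from the notes above: each (a, b) pair means
# "a language that has a is expected to have b as well".
# The encoding and names are illustrative only, not part of UPSID.
IMPLICATIONS = [
    ("p", "k"), ("k", "t"),   # /p/ implies /k/ implies /t/
    ("g", "d"), ("d", "b"),   # /g/ implies /d/ implies /b/
    ("m", "n"),               # /m/ implies /n/
]


def broken_implications(inventory):
    """Return the (a, b) pairs where the inventory has a but lacks b."""
    segments = set(inventory)
    return [(a, b) for a, b in IMPLICATIONS
            if a in segments and b not in segments]


# A made-up inventory with /p/ and /t/ but no /k/, and /m/ but no /n/.
example = ["p", "t", "m", "s", "a", "i", "u"]
print(broken_implications(example))  # [('p', 'k'), ('m', 'n')]
```

An empty list only means none of these particular chains is broken; it
says nothing about how typical the inventory is otherwise.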

The 20 Most Common Phonemes:

p,b   t,d   tS   k,g   ?
 f     s     S
 m     n     n^  N         <--those last two are palatal n, and "eng"
 w    l,r    j

Most languages have 14-16 of these 20 phonemes.
No UPSID language had exactly this set.

Typical Phoneme Inventories:
The typical language has 5-11 stops, 1-4 fricatives, 2-4 nasals, and
4 others.
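
The ranges above can be turned into a quick check on a consonant
inventory. This is only a sketch: the dictionary layout and function name
are made up for illustration, "4 others" is treated as exactly 4 (my
assumption), and the counts fed in are invented.

```python
# Typical ranges from the notes above, as (low, high) pairs.
# Treating "4 others" as exactly 4 is an assumption.
TYPICAL_RANGES = {
    "stops": (5, 11),
    "fricatives": (1, 4),
    "nasals": (2, 4),
    "others": (4, 4),
}


def atypical_classes(counts):
    """Return {class: count} for every class outside its typical range."""
    out = {}
    for name, (low, high) in TYPICAL_RANGES.items():
        n = counts.get(name, 0)
        if not low <= n <= high:
            out[name] = n
    return out


# Invented counts for a hypothetical conlang.
sketch = {"stops": 6, "fricatives": 7, "nasals": 3, "others": 4}
print(atypical_classes(sketch))  # {'fricatives': 7}

# Adding up the low and high ends gives the implied consonant total.
low_total = sum(low for low, _ in TYPICAL_RANGES.values())
high_total = sum(high for _, high in TYPICAL_RANGES.values())
print(low_total, high_total)  # 12 23
```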

92% of languages had voiceless stops.
67% had voiced.
29% had aspirated stops.
16% had ejective stops.
11% had implosive stops.

UPSID languages ranged from 1 to 6 manners of stop articulation,
averaging 2.
They averaged 3 or 4 places of articulation, which included (for 99%
of languages) bilabial, dental or alveolar, and velar.

63% of languages had /h/.
93% of languages had at least one fricative besides /h/.  Most had 4
or fewer.
A few had 12 or more.
The most common fricative besides /h/ was /s/ (in 83% of languages);
after that were /S/ ("esh") and /f/, and then /z/, /x/, /v/, and /Z/
("ezh") in that order.
Only a third as many languages had /z/ as /s/.

97% of languages have a nasal.  /n/ is most common, followed by /m/,
/N/, and /n^/ in that order.  Fewer than 4% of nasals are voiceless.

96% of languages have /l/ or /r/ (or some close equivalent).
72% have more than one such liquid or approximant.
83% of liquids are voiced.
87% are dental or alveolar.
97% of /r/-type consonants are voiced.
86% of /r/-like consonants are trills, taps, or flaps.

99% of glottalics are voiceless.
60% are stops.
k` is the most common one.
Languages with glottalics may have up to five of them.

97% of implosives are voiced.
b` is the most common one.


The number of vowels found in UPSID at each location is as follows
(the first number is unrounded vowels, the second is rounded):

       FRONT      CENTER    BACK
HIGH   452/29     55/10     31/417
MID    425/32     100/8     19/448
LOW    81/0       392/1     13/36

94% of front vowels are unround.
93.5% of back vowels are round.
75% of low vowels are central.
69% of central vowels are low.

High front vowels are significantly (481 vs 448) more common
than high back vowels.
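
The percentages above all fall out of the table, and it's easy to
recompute them. Here is a small Python sketch that does so; the data
layout and the helper names (total, pct) are just my own choices.

```python
# Counts from the table above as (unrounded, rounded) pairs.
VOWELS = {
    ("high", "front"): (452, 29),
    ("high", "center"): (55, 10),
    ("high", "back"): (31, 417),
    ("mid", "front"): (425, 32),
    ("mid", "center"): (100, 8),
    ("mid", "back"): (19, 448),
    ("low", "front"): (81, 0),
    ("low", "center"): (392, 1),
    ("low", "back"): (13, 36),
}


def total(height=None, backness=None, rounding=None):
    """Sum the cells matching the given height, backness, and rounding.

    rounding is "unrounded", "rounded", or None for both.
    """
    n = 0
    for (h, b), (unrounded, rounded) in VOWELS.items():
        if height not in (None, h) or backness not in (None, b):
            continue
        if rounding in (None, "unrounded"):
            n += unrounded
        if rounding in (None, "rounded"):
            n += rounded
    return n


def pct(part, whole):
    """Percentage of whole made up by part, to one decimal place."""
    return round(100 * part / whole, 1)


front_unround = pct(total(backness="front", rounding="unrounded"),
                    total(backness="front"))
back_round = pct(total(backness="back", rounding="rounded"),
                 total(backness="back"))
low_central = pct(total(height="low", backness="center"),
                  total(height="low"))
central_low = pct(total(height="low", backness="center"),
                  total(backness="center"))

print(front_unround, back_round, low_central, central_low)
# 94.0 93.5 75.1 69.4
print(total(height="high", backness="front"),
      total(height="high", backness="back"))
# 481 448
```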

Less than 6% of languages had the minimum 3 vowels.
The largest vowel inventory was !Xu~, with 24.
German and Norwegian, with 15 each, had very large vowel inventories.

Most languages had 5-7 phonemic vowels.

There were only 83 phonemic diphthongs found in the 317
language sample, and 1/4 of them were found in !Xu~, which
also has a really monstrous consonant inventory that there's
no way I'm going to type in here.

Rotokas has this tiny consonant inventory:
p, t, k, B (voiced bilabial fricative), g,
D (voiced tap).

(notes by Ed Heil, 6/2/99)