Conlang: Re: Phone frequencies (Alex Fink, Sep 7 '08, 2:17)

Re: Phone frequencies

From:	Alex Fink <000024@...>
Date:	Sunday, September 7, 2008, 2:17

On Sat, 6 Sep 2008 17:46:14 -0400, Logan Kearsley <chronosurfer@...> wrote:

>I used to have an IPA table that included the frequency of each phone
>among world languages- which phones occur in 90% of all languages,
>which phones occur in 80% of languages, which phones occur in only 5%
>of languages, etc. But I seem to have lost it, and I can't find
>anything like that on line. Anybody know where I could get a table or
>a list with frequencies for different phones among world languages?

Wouldn't you know it, I was _just_ looking for the very same thing.

UPSID (the UCLA Phonological Segment Inventory Database) does nearly exactly this, and there's an interface to it at http://web.phonetik.uni-frankfurt.de/upsid.html . Use "find certain sounds and languages that have them", option #5; below the output it gives you a table with the frequency of each phone across the phonologies in its database. It's not sorted, but you can do that yourself.

Below I excerpt from an offlist message on the Glossotechnia discussion about this.

[excerpt begin]

For consonants it's got the irritating feature that dentals and alveolars and unspecified dental/alveolars are all counted separately, though. I've corrected for that by taking the unspecified counts and multiplying those by 14/5, and discarding the other two sorts -- this is indefensibly hacky, when I could've done the summation, but it was quick. That gives the following top of the frequency list (warning, monospace table ahead):

n      .9935   g      .5610   k_h    .2284   dz)    .1240
t      .9436   N      .5255   p_h    .2239   G      .1220
m      .9424   ?      .4789   r*     .2234   c      .1197
k      .8936   tS)    .4169   v      .2106   B      .1197
l      .8445   S      .4146   x      .2084   q      .1153
j      .8381   f      .3991   4      .1613   tS)_h  .1131
s      .8381   r      .3167   ts)_h  .1551   b_<    .1086
p      .8315   J      .3126   t_>    .1490   mb)    .1064
w      .7361   t_h    .3041   K      .1490   ts)_>  .1056
b      .6364   ts)    .2794   k_>    .1397   nd)    .1056
h      .6186   z      .2669   Z      .1353
d      .5650   dZ)    .2506   k_w    .1330

and no other sounds in more than one language in ten. r* was glossed in the list as "voiced dental/alveolar r-sound", whatever we make of that.

For vowels the parallel irritation is that e.g. /e/ and /E/ and indifferent /e/~/E/ are counted separately; I've corrected (slightly less indefensibly) by dropping the indifferents and multiplying the others by 11/7, but special-cased /@/ and left it alone. This gives

i      .8714   I      .1641   a_":   .0754
a_"    .8692   U      .1463   e:     .0732
u      .8182   1      .1353   e~     .0627
E      .6481   E~     .1219
O      .5645   O~     .1116
o      .4565   o~     .0941
e      .4320   M      .0909
a_"~   .1840   i:     .0887
i~     .1818   &      .0865
@      .1685   o:     .0836
u~     .1641   u:     .0798

and no other sounds in more than one language in sixteen. My /a_"/ was just written /a/ in the UPSID but they called it unambiguously a low central vowel (this is a hole in the IPA more than anything).

[excerpt end]

Alex
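For anyone wanting to redo the correction and sorting described in the post, here is a minimal Python sketch. The function names (scale, top_phones) are made up for illustration, the raw value 0.30 passed to scale is a placeholder and not a UPSID figure, and which entries each factor applies to is only one reading of the description above. The sample dictionary reuses a few corrected consonant figures from the post.

```python
# Scaling factors quoted in the post (how they were derived is not stated there).
CONSONANT_FACTOR = 14 / 5  # applied to the "unspecified dental/alveolar" entries
VOWEL_FACTOR = 11 / 7      # applied to the specific vowels, e.g. /e/ and /E/


def scale(raw_frequency, factor):
    """Multiply a raw frequency (fraction of languages) by a correction factor."""
    return raw_frequency * factor


def top_phones(frequencies, threshold):
    """Return (phone, frequency) pairs above threshold, most frequent first."""
    kept = [(phone, freq) for phone, freq in frequencies.items() if freq > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# A few corrected consonant figures copied from the post, deliberately unsorted.
sample = {"k": 0.8936, "n": 0.9935, "nd)": 0.1056, "t": 0.9436, "m": 0.9424}

if __name__ == "__main__":
    # 0.30 is a placeholder raw value, not taken from UPSID.
    print(scale(0.30, CONSONANT_FACTOR))
    for phone, freq in top_phones(sample, 0.10):
        print(f"{phone:<6} {freq:.4f}")
```

The 0.10 threshold mirrors the "one language in ten" cutoff used for the consonant list; 1/16 would be the matching cutoff for the vowels.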

Replies

Philip Newton <philip.newton@...>
Logan Kearsley <chronosurfer@...>