Re: Word length as a function of word frequency
From: Dirk Elzinga <dirk_elzinga@...>
Date: Monday, June 2, 2003, 22:06
On Saturday, May 31, 2003, at 10:33 AM, Sally Caves wrote:
> Dear Dirk,
>
> Can you explain your beautiful chart below in layman's terms? I'm unsure
> what a "segment" is as distinguished from a syllable, or how we are to read
> the chart.
A segment is a single speech sound, or phone. The chart is meant to show
the relationship between the number of segments in a word and the number
of syllables in it. Looking at the row for 5-segment words, we see that,
of the 19,528 words in the dictionary, 431 have five segments and a
single syllable, 2343 have five segments and two syllables, and 293 have
five segments and three syllables. On the far right, in the column
labelled "total", you can see that 3067 words in the dictionary have 5
segments in all.
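For anyone who would rather check that reading than take it on faith, here is a minimal Python sketch that transcribes the counts from the chart and recomputes its margins. The names (CHART, cells, row_totals, column_totals, mean_syllables) are made up for illustration, and the column in which each row starts is an assumption, chosen so that the row and column totals come out equal to the ones printed in the chart.

```python
# Counts transcribed from the chart. Each entry maps a segment count to
# (syllable count of the first filled column, counts from that column on).
# The starting columns are an assumption: they are the placement under
# which the sums match the totals printed in the chart.
CHART = {
    17: (6, [1, 2, 1]),
    16: (6, [3, 3]),
    15: (5, [2, 7, 4]),
    14: (4, [3, 13, 20, 5]),
    13: (4, [12, 57, 40, 8]),
    12: (3, [3, 69, 172, 47]),
    11: (3, [29, 242, 256, 33]),
    10: (2, [8, 165, 621, 287, 9]),
    9: (2, [47, 532, 922, 138, 2]),
    8: (2, [271, 1248, 721, 24]),
    7: (1, [1, 936, 1525, 273]),
    6: (1, [39, 1922, 937, 40]),
    5: (1, [431, 2343, 293]),
    4: (1, [1480, 1406, 14]),
    3: (1, [1514, 127]),
    2: (1, [215]),
    1: (1, [5]),
}


def cells():
    """Yield (segments, syllables, word_count) for every filled cell."""
    for segments, (first_syllable, counts) in CHART.items():
        for offset, count in enumerate(counts):
            yield segments, first_syllable + offset, count


def row_totals():
    """Number of words for each segment count (the 'total' column)."""
    totals = {}
    for segments, _, count in cells():
        totals[segments] = totals.get(segments, 0) + count
    return totals


def column_totals():
    """Number of words for each syllable count (the 'total' row)."""
    totals = {}
    for _, syllables, count in cells():
        totals[syllables] = totals.get(syllables, 0) + count
    return totals


def mean_syllables(segments):
    """Average syllable count among words with the given segment count."""
    first_syllable, counts = CHART[segments]
    weighted = sum((first_syllable + i) * c for i, c in enumerate(counts))
    return weighted / sum(counts)
```

Calling row_totals()[5] reproduces the 3067 in the "total" column for 5-segment words, and mean_syllables(5) gives their average syllable count, which comes out a little under two.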
The chart doesn't say anything about frequency, so it doesn't properly
belong in this thread. However, some conclusions about the relationship
between frequency and size might be obtainable from the same 20,000-word
dictionary that provided the data for the chart I posted, since the
dictionary also contains frequency information.
>> I did do a little project to see how segment count and syllable count
>> are related; it was inspired by a similar graph I saw for German in an
>> old article in _Language_. Here is my graph for the 20,000 word
>> dictionary (use a monowidth font to view the graph):
>>
>> Distribution of lexical items according to syllable (x-axis) and
>> segment (y-axis) count
>>
>>           1     2     3     4     5     6     7     8  total
>>
>>    17                                   1     2     1      4
>>    16                                   3     3            6
>>    15                             2     7     4           13
>>    14                       3    13    20     5           41
>>    13                      12    57    40     8          117
>>    12                 3    69   172    47                291
>>    11                29   242   256    33                560
>>    10           8   165   621   287     9               1090
>>     9          47   532   922   138     2               1641
>>     8         271  1248   721    24                     2264
>>     7     1   936  1525   273                           2735
>>     6    39  1922   937    40                           2938
>>     5   431  2343   293                                 3067
>>     4  1480  1406    14                                 2900
>>     3  1514   127                                       1641
>>     2   215                                              215
>>     1     5                                                5
>>
>> total  3685  7060  4746  2903   949   162    22     1  19528
I also did some digging trying to find the source of the dictionary,
and came up with these two URLs:
http://www.lexicon.arizona.edu/~hammond/newdic: this is a searchable
version of the 19,528 item dictionary I used for the chart.
http://info-center.ccit.arizona.edu/~ling/newdic: this is the text file
itself.
I got "newdic" from Mike Hammond when I was a grad student in Arizona,
and he mentioned that it is a later version of a dictionary called
"phondic". I can't find the origin of "phondic".
Dirk
--
Dirk Elzinga
Dirk_Elzinga@byu.edu
"I believe that phonology is superior to music. It is more variable and
its pecuniary possibilities are far greater." - Erik Satie