Re: Word length as a function of word frequency
From: JS Bangs <jaspax@...>
Date: Friday, May 30, 2003, 3:46
Jeffrey Henning sikyal:
> I thought I had read a web page addressing word length as a function of
> word frequency before, but after a half-hour of searching Google I gave
> up and did a quick analysis of this English corpus in Excel:
>
> http://www.comp.lancs.ac.uk/ucrel/bncfreq/lists/2_3_writtenspoken.txt
>
> Length of word - Average frequency of words with this length
> 1 - 1835.5
> 2 - 1790.7
> 3 - 900.2
> 4 - 211.3
> 5 - 110.7
> 6 - 78.6
> 7 - 71.9
> 8 - 63.1
> 9 - 59.5
> 10 - 53.6
> 11 - 49.9
> 12 - 47.1
> 13 - 48.7
> 14 - 36.4
> 15 - 33.0
> 16 - 30.0
Fun!
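For anyone who wants to redo the tally outside Excel, here is a minimal
Python sketch of the same calculation: group words by letter count and
average their frequencies. The function name and the sample pairs are my
own inventions for illustration, not values from the corpus, and I'm
assuming the list has already been parsed into (word, frequency) pairs,
since I haven't checked the file's column layout.

```python
from collections import defaultdict


def mean_frequency_by_length(pairs):
    """Map word length (in letters) to the mean frequency of words that long."""
    totals = defaultdict(float)  # summed frequency per length
    counts = defaultdict(int)    # number of words per length
    for word, freq in pairs:
        n = len(word)
        totals[n] += freq
        counts[n] += 1
    return {n: totals[n] / counts[n] for n in sorted(totals)}


# Made-up sample data, NOT taken from the corpus.
sample = [("a", 10.0), ("I", 30.0), ("of", 8.0), ("the", 6.0), ("and", 2.0)]

for length, mean in mean_frequency_by_length(sample).items():
    print(f"{length} - {mean:.1f}")
```

Note that this averages over word types, so one very frequent short word
can dominate its row, which may be part of why the first few rows of the
table are so large.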
> I haven't scrubbed the corpus (and it looks like it could use it), but
> this quick and dirty analysis was all I needed for my conlanging
> activities of the moment, and proved my hypothesis correct. The more
> frequent words in my conlang should be shorter than less frequent words,
> but frequency declines more gradually than I anticipated for words of 7
> or more letters.
My immediate thought is that spelling is a poor predictor of phonetic
length in English. E.g. 'straight' has 8 letters but only 1 syllable, while
'area' has 4 letters and 3 syllables.
And I see you've addressed this in the next paragraph:
> I had toyed with converting the words to phonetic representations but
> decided it wasn't worth my time. Obviously, the number of phonemes in a
> word is a stronger function of word frequency than the length of the
> English spelling of the word, but I didn't feel like using SOUNDEX or
> Zompist.com's English spelling algorithm (56 rules! --
> http://www.zompist.com/spell.html) to come up with approximations of the
> phonetic length.
>
> Anyone inspired to do a more statistically thorough analysis?
Not at the moment ;). But do post your results if you ever get around to
this.
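
If someone does pick it up, a much cruder stand-in for phonetic length
than SOUNDEX or the Zompist rules is to count runs of vowel letters. The
sketch below is only that heuristic (the function name is made up, and
treating 'y' as a vowel is my assumption); it is not a real
spelling-to-sound conversion.

```python
import re


def vowel_groups(word):
    """Count runs of consecutive vowel letters (a, e, i, o, u, y)."""
    return len(re.findall(r"[aeiouy]+", word.lower()))


# Compare letter count with the vowel-run count for the two examples.
for word in ("straight", "area"):
    print(word, len(word), vowel_groups(word))
```

It gives 1 for 'straight', which is right, but only 2 for 'area', because
the final 'ea' is read as a single run, so treat it as a rough lower bound
at best.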
Jesse S. Bangs jaspax@u.washington.edu
http://students.washington.edu/jaspax/
http://students.washington.edu/jaspax/blog
Jesus asked them, "Who do you say that I am?"
And they answered, "You are the eschatological manifestation of the ground
of our being, the kerygma in which we find the ultimate meaning of our
interpersonal relationship."
And Jesus said, "What?"