> Cool. What program are you using to measure the frequency of
> tokens? Does it measure frequency of phrases as well?
> You can get such a script (in Perl) from my site:
>
> http://www.pobox.com/~jimhenry/conlang/frequencies.pl
>
> (I have a newer, better version than what is on my website,
> but I can't FTP-upload it from the hospital wireless network.
> I'll do that sometime after I get out. Meanwhile I could email
> it to you if you want it.)
>
> If you have something that will measure the frequency of
> wildcard phrases (e.g. how often two words occur with
> any word between them, or with any two words, or...)
> let me know.
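For the wordform-level count asked about here, a minimal sketch in Python (not the Perl script mentioned above, whose internals I haven't seen) might look like the following. The function name gapped_pair_counts and the whitespace tokenization are my own assumptions, purely for illustration.

```python
from collections import Counter

def gapped_pair_counts(tokens, gap):
    """Count (first, last) word pairs separated by exactly `gap` intervening words.

    gap=0 gives ordinary adjacent bigrams; gap=1 gives "A _ B" patterns, and so on.
    """
    step = gap + 1
    # Pair each token with the token `step` positions later; zip stops at the shorter list.
    return Counter(zip(tokens, tokens[step:]))

# Hypothetical sample text, split naively on whitespace.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
counts = gapped_pair_counts(tokens, 1)
print(counts.most_common(3))
```

Running it once per gap size (1, 2, ...) gives a separate frequency table for each wildcard width.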
Ideally you'd derive your statistics not from strings of wordforms but from
semanticosyntactic trees. Or both. E.g. you'd want to find the frequency of
"give X food" (which might warrant a compressed form meaning "feed X"),
regardless of the length of X.
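Short of real trees, a crude wordform-level stand-in for the "give X food" count is to let the gap vary up to some bound. The sketch below assumes whitespace-split tokens and a hypothetical helper called frame_counts; it knows nothing about syntax, so it will happily match across clause boundaries and miss inflected forms.

```python
from collections import Counter

def frame_counts(tokens, first, last, max_gap):
    """Count `first ... last` frames, keyed by the number of intervening words (1..max_gap)."""
    by_gap = Counter()
    for i, word in enumerate(tokens):
        if word != first:
            continue
        for gap in range(1, max_gap + 1):
            j = i + gap + 1
            if j < len(tokens) and tokens[j] == last:
                by_gap[gap] += 1
                break  # count only the nearest closing word for this opener
    return by_gap

# Hypothetical sample: X is one word in the first frame and three in the second.
sample = "give him food and give the old dog food".split()
frame = frame_counts(sample, "give", "food", 4)
print(sum(frame.values()), dict(frame))
```

Summing over the gap sizes gives the frequency of the frame regardless of how many words X spans, within the chosen bound.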
I say "ideally" because it'd mean an awful lot of work, for results that would
be very interesting yet surely still distressingly distant from perfection.
--And.