
Re: Unsupervised learning of natural languages

From: Henrik Theiling <theiling@...>
Date: Wednesday, November 2, 2005, 13:57
Hi!

Sanghyeon Seo <sanxiyn@...> writes:
> I thought people on this list may be interested in the following paper:
>
> http://www.pnas.org/cgi/content/short/102/33/11629
> http://www.cs.tau.ac.il/~ruppin/pnas_adios.pdf/
>
> Unsupervised learning of natural languages
> Zach Solan, David Horn, Eytan Ruppin, and Shimon Edelman
>
> This inducts grammar rule from raw data (unsegmented writing,
> continuous speech, etc.), and is also generative and predictive. The
> algorithm is also believed to be linear, thus computationally
> feasible.
Interesting, I will have to read that.
> Applying this to your conlang and generating few sentences may be an
> interesting experience... If someone can implement this.
Indeed! :-)

One phrase struck me as strange:

'... It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, ...'

Hmm! These two languages may not be related, but both have relatively nice syntactic structures, i.e., a tree structure. So 'diverse' is a euphemism in that sentence.

It would be much more interesting to see whether the approach works for, say, Dutch. If the algorithm tries to find context-free production rules, it will fail there. It would also be interesting to see what it does for highly inflecting languages like Kalaallisut or Ancient Greek. If it fails there, too, the whole approach would not be all that surprising, since one would naturally expect these things to fail.

But, ok, these thoughts are premature -- I haven't read the article yet.

**Henrik


Gary Shannon <fiziwig@...>