Re: Online Language Identifier

From: Andreas Johansson <andjo@...>
Date: Tuesday, August 30, 2005, 10:42

Quoting "David J. Peterson" <dedalvs@...>:

> Radiohead will be pleased to know that Xerox is at it again! For
> those of you who don't check Langmaker.com every two hours,
> a resource was just posted about an online language identifier.
> It can be found here:
>
> http://www.xrce.xerox.com/competencies/content-analysis/tools/guesser-ISO-8859-1.en.html
>
> Basically it identifies the language that you put into the text
> field (a sentence of five words or more). It was reviewed on the
> blog Tenser Said the Tensor. The author put in Klingon, Quenya
> and Sindarin. Klingon apparently was fairly consistently identified
> as Maltese.

Some of mine:

Tairezazh: Four different sentences resulted in two guesses of Hungarian, one of Latvian, and one of Turkish. Well, I do not know what I should have expected.

Meghean: Two sentences resulted in Romanian and Latin. Had sort-of expected Irish ...

Kalini Sapak: Two sentences yield Slovakian and Indonesian. Given that the language is supposed to be Arabic without the pharyngeals et sim., this was unexpected.

Altaii: One sentence returned a guess of Basque. I can see why, tho the language overall is more similar to Spanish.

Andreas