Re: Unknown Language Identifier!
From: daniel andreasson <daniel.andreasson@...>
Date: Monday, January 29, 2001, 23:29
Làtae Jörg Rhiemeier:
> Quenya (Namaarie):
>
> Swahili 0.0273
> SchwytzZurich 0.0235
> Somali 0.0228
> RomanschSursilvan 0.0217
>
> Quenya (Genesis 2):
>
> Breton 0.0974
> Welsh 0.0649
> BasqueGuipaz 0.0487
> SerboCroatian 0.0473
Very strange results, indeed. I suspected that the orthography
might have something to do with it, so I replaced y with j and
c with k and removed the dots (diaereses) over the e's. I also
tried a version with double vowels instead of acute accents.
These are the results (using the exact same sample texts):
GENESIS 2
No dots over e; y replaced with j and c with k:
Breton 0.1110
Estonian 0.0783
SerboCroatian 0.0716
Welsh 0.0572
No e-dots; used j and k; double vowels instead of accents:
Breton 0.1072
Estonian 0.0855
SerboCroatian 0.0794
Finnish 0.0645
Comments: Breton still scores very high, but Estonian now comes
second, which is closer to Tolkien's intentions, imho.
NAMÁRIË
No dots over e; y replaced with j and c with k:
SchwytzZurich 0.0329
SchwytzBern 0.0291
Polish 0.0279
German 0.0279
No e-dots; used j and k; double vowels instead of accents:
SchwytzZurich 0.0366
Polish 0.0325
German 0.0297
Swahili 0.0288
Comments: Now this is just weird. Either that, or Helge did something
terribly wrong when writing the text. ;) None of the scores is very
high either, so the program seems to have had trouble deciding. This
might be due to the short sample text; Genesis 2 is much longer.
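For anyone who wants to repeat the experiment, the two respellings
described above could be scripted roughly like this. It is only a
sketch: the function name respell and its double_long_vowels flag are
made up for illustration, and it assumes the only accented letters in
the text are ë and the acute vowels á, é, í, ó, ú. The substitutions
could just as well be done by hand.

```python
import unicodedata

# Acute-accented vowels and their plain bases (assumed to be the only
# long-vowel spellings in the sample texts).
ACUTE_TO_BASE = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"}


def respell(text, double_long_vowels=False):
    """Respell a text: drop the dots over e, write j for y and k for c.

    With double_long_vowels=True, acute-accented vowels are also
    written as doubled plain vowels (á becomes aa, and so on).
    """
    # Compose characters first so that "e" + combining diaeresis
    # is treated the same as a precomposed "ë".
    text = unicodedata.normalize("NFC", text)

    # Remove the dots over e.
    text = text.replace("ë", "e").replace("Ë", "E")

    # Replace y with j and c with k, preserving case.
    text = text.translate(str.maketrans("ycYC", "jkJK"))

    if double_long_vowels:
        for accented, base in ACUTE_TO_BASE.items():
            text = text.replace(accented, base * 2)
            # A capital long vowel becomes capital + lowercase, e.g. Á -> Aa.
            text = text.replace(accented.upper(), base.upper() + base)

    return text
```

Calling respell(text) gives the first variant and
respell(text, double_long_vowels=True) the second; each result is
then fed to the identifier as before.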
daniel
--
<> Daeselaidh goddi mis giall! <> daniel.andreasson@telia.com <>
<> Lwodadh giall! <> Daniel Andreasson <>