Re: Unknown Language Identifier!
From: Patrick Dunn <tb0pwd1@...>
Date: Monday, January 29, 2001, 17:33
On Mon, 29 Jan 2001, John Cowan wrote:
> > Also the fact that the orthography can muck things up so easily
> > is disappointing (though not surprising).
>
> An orthography is a standard part of a written language. When
> we want to identify text in English, we expect it to be written
> using English orthography, not some random orthography.
The problem is, of course, transliteration. As far as I know, there are
still two or three different ways to transliterate Hebrew into the Roman
alphabet (although maybe one's standard by now -- all the Hebrew I know I
learned from books published in the 50s).
And then you have languages like Pali, in which several orthographies are
acceptable. You can write Pali in the Sanskrit alphabet, or in the Roman
alphabet, or in the Cyrillic alphabet, or --
So until the identifier can support Unicode, I doubt it'll be more than a fun
curiosity. But a very fun curiosity. Hrondu apparently looks a lot like
Swahili. Go figger.
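
To make the several-orthographies point concrete, here is a minimal Python
sketch of a script check that could run before any language guessing. It is
not how the identifier under discussion works; the function names
scripts_used and dominant_script are made up for illustration, and taking the
first word of a character's Unicode name as its script is a rough assumption
that holds for common letters but not for every character.

```python
import unicodedata
from collections import Counter


def scripts_used(text):
    """Count letters and marks per script, using the first word of each
    character's Unicode name as a rough script label (an assumption)."""
    counts = Counter()
    for ch in text:
        # Keep letters (L*) and marks (M*); skip digits, spaces, punctuation.
        if unicodedata.category(ch)[0] not in ("L", "M"):
            continue
        name = unicodedata.name(ch, "")
        if not name:
            continue
        label = name.split()[0]
        # Generic combining accents are not tied to a single script.
        if label == "COMBINING":
            continue
        counts[label] += 1
    return counts


def dominant_script(text):
    """Return the most frequent script label, or None if no letters found."""
    counts = scripts_used(text)
    return counts.most_common(1)[0][0] if counts else None


if __name__ == "__main__":
    # The same word, "Pali", in three different scripts.
    for sample in ("pāli", "पालि", "пали"):
        print(sample, dominant_script(sample))
```

An identifier that only looks at letter statistics would treat those three
spellings as three unrelated inputs, which is the problem with orthography
described above.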
---------------------------------------------------------------------
Living your life is a task so difficult,
it has never been attempted before.