Re: Unknown Language Identifier!
From: J Matthew Pearson <pearson@...>
Date: Monday, January 29, 2001, 18:55
Patrick Dunn wrote:
> On Mon, 29 Jan 2001, John Cowan wrote:
>
> > > Also the fact that the orthography can muck things up so easily
> > > is disappointing (though not surprising).
> >
> > An orthography is a standard part of a written language. When
> > we want to identify text in English, we expect it to be written
> > using English orthography, not some random orthography.
>
> The problem is, of course, transliteration. As far as I know, there are
> still two or three different ways to transliterate Hebrew into the Roman
> alphabet (although maybe one's standard by now -- all the Hebrew I know I
> learned from books published in the 50s).
There's also the problem of languages which lack a standardised orthography
altogether, because they're rarely if ever written down.
Matt.