
MNCH (was: magic natlang corpus harvesting)

From:Emily Zilch <emily0@...>
Date:Thursday, May 27, 2004, 17:35
{ 20040527,0304 | Danny Wier }

"I got 2.78 million hits for Arabic /la:/ 'no' (a ligature) with pretty
high precision. For Hindi, there are 40,700 pages with /hai/ 'he/she/it
is', but there may be some other Devanagari-script languages involved."

The ligature LA+ALIF is used with great frequency in Farsi. In fact, I
bet it appears in every Arabic-alifba-using natlang.

Of course, there may be a qualitative difference in how the text looks,
since Farsi et al. favor a different calligraphic style, the sloping
NASTA'LIQ script, while Arabic(s) and African natlang alifba borrowers
mostly use NASKH-type styles, but this probably shows up in the
encoding simply as a font choice.
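
One encoding wrinkle that could affect hit counts for the ligature: Unicode has both the plain sequence LAM + ALEF and legacy precomposed "presentation form" ligature code points. The sketch below is only an illustration, not anything from the harvesting runs discussed here; the helper names are made up for the example, and it assumes NFKC normalization is an acceptable way to fold the two spellings together before counting.

```python
import unicodedata

# Plain letters, as normally typed and stored.
LAM = "\u0644"   # ARABIC LETTER LAM
ALEF = "\u0627"  # ARABIC LETTER ALEF

# Legacy precomposed ligatures from the Arabic Presentation Forms-B block.
LAM_ALEF_ISOLATED = "\uFEFB"
LAM_ALEF_FINAL = "\uFEFC"


def fold_lam_alef(text: str) -> str:
    """Hypothetical helper: apply NFKC so presentation-form ligatures
    decompose into the plain LAM + ALEF sequence."""
    return unicodedata.normalize("NFKC", text)


def count_la(text: str) -> int:
    """Hypothetical helper: count LAM + ALEF sequences after folding."""
    return fold_lam_alef(text).count(LAM + ALEF)


if __name__ == "__main__":
    sample = LAM + ALEF + " " + LAM_ALEF_ISOLATED + " " + LAM_ALEF_FINAL
    print(count_la(sample))
```

Whether the rendered glyph looks Naskh or Nasta'liq makes no difference to this count, which fits the guess that the style difference lives in the font rather than in the encoded text.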


