
Re: Language Recognition

From: H. S. Teoh <hsteoh@...>
Date: Friday, February 9, 2007, 1:27
On Thu, Feb 08, 2007 at 10:27:47PM +0100, Henrik Theiling wrote:
> Hi!
>
> This interesting page was posted on another list:
>
> http://en.wikipedia.org/wiki/Wikipedia:Language_recognition_chart
[...]

Y'know, this is just begging for a conlang version of that page... :-)
I'll start:

1) Ebisédian:

Roman orthography (ASCII): Uses '3' and '0' as vowel letters, and double vowels to represent long vowels, e.g., _00_, _ee_, _aa_. Plenty of apostrophes in words, indicating stress. Case-sensitive like Klingon: _K_ and _k_ are different consonants.

Roman orthography (LaTeX): Uses ø, multiple accents on vowels (macron, acute), a tear-drop accent (looks like a left open single quote over the letter), and a subscript tilde.

Native script (sanokí): Many diacritics over and under glyphs, no spacing between glyphs, and lines may break in the middle of a word (although this would be hard to apply without actually knowing the word/clause/paragraph-final glyphs---but perhaps by recognizing repeated sequences of symbols which break at different points).

In both Roman orthographies, _q_ and _x_ are not used. _r_ is relatively common.

Common tell-tale words: _Ke_, _ve_, _ke_, _je_, _re_ (always at the end of a clause); _keve_, _t0m0_, _t3m3_, _tumu_, _tama_, _timi_. Common single-word clauses ending in -i or -ii.

2) Tatari Faran:

Roman orthography: The letters c, g, l, q, v, w, x, y, z are not used. Uses an apostrophe (') for the glottal stop. Uses _ts_ as a digraph. Only lowercase Latin letters are used, even in proper names and at the beginning of a sentence. _d_ only occurs word-initially and _r_ only occurs medially. Frequent occurrence of _a_.

Native script: Written vertically, top-to-bottom, then left-to-right, with diacritical marks on the left and right of the column. Letter forms tend to be flat.

Common tell-tale words: _ka_, _kei_, _ko_, _sa_, _sei_, _so_, _na_, _nei_, _no_ (never at the beginning of a clause); _e_ (never at the end of a clause).
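For the programmers on the list, here is a rough sketch of how the ASCII characteristics above might be turned into a guesser. It is only an illustration under stated assumptions: the function names, the point values, and the decision to treat an unused letter as a hard disqualifier are all made up for this example, and it looks only at the Roman (ASCII) orthographies of these two conlangs. It ignores the clause-position hints entirely.

```python
import re

# Tell-tale words and unused letters as listed above (ASCII orthographies only).
EBISEDIAN_WORDS = {"Ke", "ve", "ke", "je", "re", "keve",
                   "t0m0", "t3m3", "tumu", "tama", "timi"}
TATARI_FARAN_WORDS = {"ka", "kei", "ko", "sa", "sei", "so",
                      "na", "nei", "no", "e"}
EBISEDIAN_UNUSED = set("qx")
TATARI_FARAN_UNUSED = set("cglqvwxyz")


def tokenize(text):
    """Split text into words made of letters, digits and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)


def ebisedian_score(text):
    """Hypothetical scoring; the weights are arbitrary assumptions."""
    # Assumption: q/x in either case rules Ebisedian out.
    if set(text.lower()) & EBISEDIAN_UNUSED:
        return 0
    words = tokenize(text)
    score = 0
    # '0' or '3' used as a vowel letter inside a word that also has letters.
    if any(re.search(r"[03]", w) and re.search(r"[A-Za-z]", w) for w in words):
        score += 2
    # Doubled vowel letters marking long vowels, e.g. 00, ee, aa.
    if re.search(r"([aeiou03])\1", text):
        score += 1
    # One point per distinct tell-tale word (case-sensitive: Ke != ke).
    score += len(EBISEDIAN_WORDS & set(words))
    return score


def tatari_faran_score(text):
    """Hypothetical scoring; the weights are arbitrary assumptions."""
    letters = {ch for ch in text if ch.isalpha()}
    # Unused letters or any uppercase letter rule Tatari Faran out.
    if letters & TATARI_FARAN_UNUSED or any(ch.isupper() for ch in text):
        return 0
    words = tokenize(text)
    for w in words:
        core = w.strip("'")  # ignore leading/trailing apostrophes
        # d only occurs word-initially; r only occurs medially.
        if "d" in core[1:] or core.startswith("r") or core.endswith("r"):
            return 0
    score = len(TATARI_FARAN_WORDS & set(words))
    if "ts" in text:  # the ts digraph
        score += 1
    return score


def guess(text):
    """Return the higher-scoring conlang name, or 'unknown' on a tie."""
    e, t = ebisedian_score(text), tatari_faran_score(text)
    if e == t:
        return "unknown"
    return "Ebisédian" if e > t else "Tatari Faran"
```

Something like `guess("t0m0 keve Ke")` picks Ebisédian, since the digits and the capital _K_ also rule out Tatari Faran. Note that the test strings here are just tell-tale words strung together, not real sentences in either language.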
Here's an idea for a conlang game: everyone submits info for their conlang like the above, and we collect it all in a central place somewhere (maybe FrathWiki?). Then each person creates some sample text in their respective conlang(s) and submits it to a central repository, where everything is shuffled and redistributed. Then each person has to guess which conlang the text is written in, based on the collected identifying characteristics.

For simplicity, maybe the first version of this game can be restricted to only those conlangs with Latin-like orthography, maybe including Cyrillic if there are enough of those to make it challenging. If it works out well, we can include exotic writing systems as well the next time round (although those will tend to be so distinctive that you'd be able to tell immediately).

T

-- 
The best compiler is between your ears. -- Michael Abrash
