Re: OT: TECH: Dumb Unicode question

From: John Cowan <cowan@...>
Date: Friday, November 21, 2003, 21:08
Mark J. Reed scripsit:

> There really is a UTF-9, eh? I was recently contemplating the creation of
> such a beast for use in a ternary system (the -9 would have
> referred to trits rather than bits).
Not officially blessed by the Unicode Consortium, no. But documented and in use among the 36-bit computing community.
> And I'd expect Chinese/Japanese/Korean Unicode applications to opt for
> SCSU (or BOCU-1, which I'm not familiar with?) in lieu of UTF-16, since
> they'd then be guaranteed to do at least somewhat better than 2 bytes
> per character, possibly much better with good use of the windows.
Korean uses ASCII space characters, so SCSU wins a little, and Japanese uses lots of katakana and hiragana, so SCSU wins a lot. I doubt if SCSU wins in Chinese.

BOCU-1 is documented at http://www.unicode.org/notes/tn6. It is MIME-legal and provides excellent compression that does not depend on the decoder, but is not ASCII-compatible.

-- 
John Cowan <jcowan@...> http://www.ccil.org/~cowan http://www.reutershealth.com
Charles li reis, nostre emperesdre magnes,
Set anz totz pleinz ad ested in Espagnes.
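
A rough way to see why the three languages differ is sketched below. It does not implement SCSU or BOCU-1. It only measures the UTF-16 baseline of 2 bytes per character and counts the characters (ASCII, hiragana, katakana) that a windowed scheme could plausibly encode in one byte. The sample strings and the function names utf16_size and one_byte_candidates are made up for this illustration, and the count is a crude estimate, not a real compressed size.

```python
# Rough size check only; this does NOT implement SCSU or BOCU-1.

def utf16_size(text: str) -> int:
    """Bytes needed for text in UTF-16 (big-endian, no byte order mark)."""
    return len(text.encode("utf-16-be"))


def one_byte_candidates(text: str) -> int:
    """Count ASCII, hiragana (U+3040..U+309F) and katakana (U+30A0..U+30FF).

    Assumption for this sketch: these are the characters a windowed
    compression scheme could plausibly store in a single byte.
    """
    return sum(
        1 for ch in text
        if ord(ch) < 0x80 or 0x3040 <= ord(ch) <= 0x30FF
    )


# Illustrative samples chosen for this sketch, not taken from the thread.
SAMPLES = {
    "Korean": "안녕하세요 세계",        # Hangul syllables plus one ASCII space
    "Japanese": "こんにちは カタカナ",  # hiragana, ASCII space, katakana
    "Chinese": "你好世界",              # Han ideographs only
}

for name, text in SAMPLES.items():
    print(
        f"{name}: {len(text)} chars, {utf16_size(text)} UTF-16 bytes, "
        f"{one_byte_candidates(text)} one-byte candidates"
    )
```

With these samples, only the space qualifies in the Korean string, every character qualifies in the Japanese string, and none qualifies in the Chinese string, which matches the ordering described above.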