Re: OT: TECH: Dumb Unicode question
From: John Cowan <cowan@...>
Date: Friday, November 21, 2003, 21:08
Mark J. Reed scripsit:
> There really is a UTF-9, eh? I was recently contemplating the creation of
> such a beast for use in a ternary system (the -9 would have
> referred to trits rather than bits).
Not officially blessed by the Unicode Consortium, no. But documented and
in use among the 36-bit computing community.
> And I'd expect Chinese/Japanese/Korean Unicode applications to opt for
> SCSU (or BOCU-1, which I'm not familiar with?) in lieu of UTF-16, since
> they'd then be guaranteed to do at least somewhat better than 2 bytes
> per character, possibly much better with good use of the windows.
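Korean uses ASCII space characters, so SCSU wins a little, and Japanese
uses lots of katakana and hiragana, so SCSU wins a lot. I doubt that SCSU
wins for Chinese, since Han characters are spread over far too many code
points to fit in SCSU's 128-character windows.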
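
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It is not an SCSU encoder: it assumes (my simplification) that any
ASCII or kana character costs one byte and everything else costs what it
does in UTF-16, and it ignores the tag bytes SCSU spends on switching
windows and modes. The function names and the sample strings are made up
for illustration.

```python
def is_window_friendly(ch):
    """True for characters assumed to fit a one-byte SCSU window.

    Assumption for this sketch: ASCII plus the hiragana and katakana
    blocks (U+3040..U+30FF). Real SCSU also has other windows.
    """
    cp = ord(ch)
    return cp < 0x80 or 0x3040 <= cp <= 0x30FF


def rough_estimate(text):
    """Return (utf16_bytes, estimated_scsu_bytes) for text.

    The second number is a crude estimate that ignores the window and
    mode switching overhead of a real SCSU encoder.
    """
    utf16 = len(text.encode("utf-16-be"))
    estimate = sum(
        1 if is_window_friendly(ch) else len(ch.encode("utf-16-be"))
        for ch in text
    )
    return utf16, estimate


if __name__ == "__main__":
    samples = {
        "Japanese (kana only)": "これはテストです",
        "Korean (Hangul plus spaces)": "한국어 문장 입니다",
        "Chinese (Han only)": "这是一个测试",
    }
    for label, text in samples.items():
        utf16, estimate = rough_estimate(text)
        print(f"{label}: UTF-16 {utf16} bytes, rough SCSU {estimate} bytes")
```

On these samples the estimate halves the all-kana Japanese string, saves
only the bytes for the spaces in Korean, and saves nothing for Chinese.
Real Japanese text mixes in kanji, so the actual gain is smaller than the
kana-only case suggests.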
BOCU-1 is documented at http://www.unicode.org/notes/tn6. It is MIME-legal
and provides excellent compression that does not depend on the encoder
(the same input always gives the same output), but is not ASCII-compatible.
--
John Cowan <jcowan@...>
http://www.ccil.org/~cowan http://www.reutershealth.com
Charles li reis, nostre emperesdre magnes,
Set anz totz pleinz ad ested in Espagnes.