
Re: OT: TECH: Dumb Unicode question

From:Mark J. Reed <markjreed@...>
Date:Friday, November 21, 2003, 18:48
On Fri, Nov 21, 2003 at 01:13:04PM -0500, John Cowan wrote:
> /me takes a deep breath, inserts bit between teeth, and begins...
Thanks for the clarification!
> that allowed 1024^2 = 2^16 = 1,048,576 additional characters to be
> represented using two consecutive Plane 0 codepoints, one from each block.
> These 2^16 characters were mapped onto ISO 10646 planes 1 through 17.
You mean 2^20, not 2^16.
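For anyone who wants to check the arithmetic, here is a minimal sketch of the surrogate-pair mapping in Python. The block start values 0xD800 and 0xDC00 are the ones UTF-16 defines; the function names `to_surrogates` and `from_surrogates` are made up for illustration only.

```python
# Sketch of UTF-16 surrogate-pair arithmetic.
# to_surrogates / from_surrogates are illustrative names, not a library API.
HIGH_START = 0xD800   # first high (leading) surrogate
LOW_START = 0xDC00    # first low (trailing) surrogate
BLOCK_SIZE = 0x400    # 1024 code points in each surrogate block

# One code point from each block: 1024 * 1024 combinations.
PAIR_COUNT = BLOCK_SIZE ** 2   # equals 2**20, i.e. 16 planes of 2**16 each


def to_surrogates(cp):
    """Split a code point in U+10000..U+10FFFF into a (high, low) pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000                      # a 20-bit value
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)
```

Each surrogate carries 10 bits of the offset, which is where the 20 in 2^20 comes from.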
> one to three bytes for each Plane 0 character and four bytes for each Astral
> Plane [not an official term] character;
Heh. I realize that modern computers deal most efficiently with 16- and 32-bit quantities, but it still seems like there ought to be a UTF-24, for external storage if nothing else. That extra byte per character in UTF-32 is never ever needed for anything.

But I guess the idea is to always use UTF-8 or SCSU for external storage, and UTF-16 or UTF-32 for in-memory processing.

-Mark
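To make the size comparison concrete, here is a rough Python sketch. It uses the standard `utf-8`, `utf-16-be`, and `utf-32-be` codecs to count bytes per character, and adds a purely hypothetical "UTF-24" packer (`pack_utf24` / `unpack_utf24`, names invented here) that stores each code point in three big-endian bytes. No such encoding form is defined by Unicode; this only illustrates that the largest code point, U+10FFFF, fits in 21 bits.

```python
MAX_CODE_POINT = 0x10FFFF   # highest Unicode code point; needs 21 bits


def encoded_sizes(ch):
    """Return the byte length of one character in the standard encoding forms."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),   # -be variant: no BOM added
        "utf-32": len(ch.encode("utf-32-be")),
    }


def pack_utf24(text):
    """Hypothetical 'UTF-24': three big-endian bytes per code point."""
    return b"".join(ord(ch).to_bytes(3, "big") for ch in text)


def unpack_utf24(data):
    """Inverse of pack_utf24; expects a length that is a multiple of 3."""
    if len(data) % 3:
        raise ValueError("length is not a multiple of 3")
    return "".join(
        chr(int.from_bytes(data[i:i + 3], "big"))
        for i in range(0, len(data), 3)
    )
```

For example, `encoded_sizes("A")` gives 1, 2, and 4 bytes for UTF-8, UTF-16, and UTF-32, while a character outside Plane 0 such as U+1D11E takes 4 bytes in all three.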
