Re: OT: TECH: Dumb Unicode question
From: Mark J. Reed <markjreed@...>
Date: Friday, November 21, 2003, 18:48
On Fri, Nov 21, 2003 at 01:13:04PM -0500, John Cowan wrote:
> /me takes a deep breath, inserts bit between teeth, and begins...
Thanks for the clarification!
> that allowed 1024^2 = 2^16 = 1,048,576 additional characters to be
> represented using two consecutive Plane 0 codepoints, one from each block.
> These 2^16 characters were mapped onto ISO 10646 planes 1 through 17.
You mean 2^20, not 2^16.
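For anyone who wants to check the arithmetic, here is a small Python sketch. The surrogate block boundaries are the standard ones; the names (pair_count, to_surrogates) are made up for illustration.

```python
# The two surrogate blocks in Plane 0, 1024 code points each.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF

high_count = HIGH_END - HIGH_START + 1
low_count = LOW_END - LOW_START + 1

# One code point from each block: 1024 * 1024 = 2^20 combinations.
pair_count = high_count * low_count


def to_surrogates(cp):
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = cp - 0x10000          # 20-bit value
    high = HIGH_START + (offset >> 10)    # top 10 bits
    low = LOW_START + (offset & 0x3FF)    # bottom 10 bits
    return high, low
```

So pair_count comes out to 2**20 = 1,048,576, and to_surrogates(0x1D11E) gives (0xD834, 0xDD1E).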
> one to three bytes for each Plane 0 character and four bytes for each Astral
> Plane [not an official term] character;
Heh.
I realize that modern computers deal most efficiently with 16-
and 32-bit quantities, but it still seems like there ought to be a
UTF-24, for external storage if nothing else. That extra byte per
character in UTF-32 is never needed for anything, since code points
stop at U+10FFFF and fit in 21 bits.
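To make the idea concrete, here is a rough Python sketch of what a three-byte fixed-width form might look like. "UTF-24" is not a defined encoding form; the function names, the big-endian byte order, and the lack of a byte-order mark are all my assumptions.

```python
# Hypothetical "UTF-24": each code point stored as 3 big-endian bytes.
# This is an illustration only, not a standardized encoding.
MAX_CODE_POINT = 0x10FFFF  # needs 21 bits, so 24 bits is enough


def utf24_encode(text):
    """Pack each code point of text into three big-endian bytes."""
    out = bytearray()
    for ch in text:
        out += ord(ch).to_bytes(3, "big")
    return bytes(out)


def utf24_decode(data):
    """Unpack three-byte groups back into a string."""
    if len(data) % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    chars = []
    for i in range(0, len(data), 3):
        cp = int.from_bytes(data[i:i + 3], "big")
        if cp > MAX_CODE_POINT:
            raise ValueError("value beyond U+10FFFF")
        chars.append(chr(cp))
    return "".join(chars)
```

The result is always exactly three bytes per character, one byte less than UTF-32, at the cost of units that do not line up with 16- or 32-bit word boundaries.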
But I guess the idea is to always use UTF-8 or SCSU for external
storage, and UTF-16 or UTF-32 for in-memory processing.
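A quick way to see the storage trade-off is to encode the same text in each form and count bytes. This sketch uses Python's built-in codecs; SCSU is left out because the standard library has no codec for it, and the sample string is just one I picked to cover one-, two-, three-, and four-byte UTF-8 cases.

```python
# Sample: U+0041, U+00E9, U+20AC (all Plane 0) and U+1D11E (Plane 1).
sample = "A\u00e9\u20ac\U0001D11E"

# Byte counts per encoding form (big-endian variants, so no BOM is added).
sizes = {
    name: len(sample.encode(name))
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

How UTF-8 compares with UTF-16 depends heavily on the text: mostly-ASCII data favors UTF-8, while text dominated by three-byte Plane 0 characters favors UTF-16.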
-Mark