Re: Tech: Unicode (was...)
From: John Cowan <cowan@...>
Date: Saturday, May 8, 2004, 13:32
Philippe Caquant scripsit:
> Looks like I missed something again. I had just
> understood that Unicode used two bytes for encoding a
> single character, which gives us 65,536 possibilities.
That's a simplified view of the true situation.
> So how can there be 90,000 Unicode characters ? Do you
> mean that the same code can be equivalent to different
> glyphs ? Or that some complementary system is used (a
> 3rd byte ? a 4th byte ?) Or that some glyphs are just
> different styles for the same character ? I'm very
> confused again.
Note: all numbers in this posting are hexadecimal.
Unicode assigns characters to integers in the range 0 to 10FFFF.
The characters for all modern scripts are in the 16-bit range, 0 to FFFF.
The range 10000 to 1FFFF is used for obsolete scripts, and the range 20000
to 3FFFF for obsolete, obscure, and non-standard Chinese characters.
F0000-10FFFF is for private use only, and the rest is reserved and
unassigned.
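
To make the arithmetic concrete, here is a minimal Python sketch of the ranges described above. The hexadecimal numbers are the ones given in this posting (written with Python's 0x prefix). The function names are invented for illustration, and "plane" is the standard Unicode term for each block of 10000 (hex) code points, which the posting itself does not use.

```python
MAX_CODE_POINT = 0x10FFFF  # highest integer Unicode assigns a character to
BMP_LIMIT = 0xFFFF         # top of the 16-bit range

# Total number of integers available: 0x110000, far more than 16 bits allow.
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1


def plane(cp):
    """Return which block of 0x10000 code points cp falls in (0 to 0x10)."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError("not a Unicode code point: %X" % cp)
    return cp >> 16


def fits_in_16_bits(cp):
    """True for the range 0 to FFFF, where the modern scripts live."""
    return 0 <= cp <= BMP_LIMIT


def is_high_private_use(cp):
    """True for F0000-10FFFF, the range set aside for private use."""
    return 0xF0000 <= cp <= MAX_CODE_POINT


if __name__ == "__main__":
    print(TOTAL_CODE_POINTS)       # decimal count of available integers
    print(plane(0x20000))          # first code point of the Chinese extension range
    print(fits_in_16_bits(0x41))   # LATIN CAPITAL LETTER A
```

So two bytes give 65,536 values, but the full code space is 1,114,112 integers, which is how the character count can exceed what 16 bits hold.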
There are several ways to encode Unicode. The most straightforward is
just to use 32 bits per character (UTF-32), but this is expensive, so it
is rarely done. The UTF-16 representation represents the characters from
0 to FFFF with a single 16-bit word, and every other character with a
pair of 16-bit words: the first, drawn from the reserved range D800-DBFF,
encodes the high-order bits, and the second, drawn from the reserved
range DC00-DFFF, encodes the low-order bits. The UTF-8 representation
uses 1 byte for ASCII (0 to 7F), 2 bytes for 80 to 7FF, 3 bytes for 800
to FFFF, and 4 bytes for everything else. The encoding is very cleverly
done, but I omit details here.
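
As a rough illustration of the two encodings just described, the following Python sketch computes the UTF-16 words for a code point and the number of bytes UTF-8 needs for it. The function names are made up for this example. The step of subtracting 10000 (hex) before splitting the bits is part of the standard UTF-16 scheme but is not spelled out in the text above.

```python
def utf16_units(cp):
    """Return the list of 16-bit words that UTF-16 uses for code point cp."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %X" % cp)
    if 0xD800 <= cp <= 0xDFFF:
        # These values are reserved for the pairs below and are not characters.
        raise ValueError("reserved surrogate value: %X" % cp)
    if cp <= 0xFFFF:
        return [cp]                   # a single 16-bit word
    v = cp - 0x10000                  # leaves a 20-bit value
    high = 0xD800 + (v >> 10)         # top 10 bits, in D800-DBFF
    low = 0xDC00 + (v & 0x3FF)        # bottom 10 bits, in DC00-DFFF
    return [high, low]


def utf8_length(cp):
    """Return how many bytes UTF-8 uses for code point cp."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %X" % cp)
    if cp < 0x80:
        return 1                      # ASCII
    if cp < 0x800:
        return 2                      # 80 to 7FF
    if cp < 0x10000:
        return 3                      # 800 to FFFF
    return 4                          # everything else


if __name__ == "__main__":
    # 1D11E is MUSICAL SYMBOL G CLEF, outside the 16-bit range.
    print([hex(u) for u in utf16_units(0x1D11E)])
    print(utf8_length(0x1D11E))
    # Cross-check against Python's built-in codecs.
    print(chr(0x1D11E).encode("utf-16-be").hex())
    print(len(chr(0x1D11E).encode("utf-8")))
```

The built-in encoders are used only as a cross-check; in real code one would call `str.encode` directly rather than reimplementing either scheme.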
> Too bad. I'll do better next time. I wonder how these
> virtual keyboard looked like ? Why can't we find them
> ? Only technical and cost problems, or other reasons ?
Macs support them today. A virtual keyboard looks like a drawing
of a physical keyboard, and it's just a window. Clicking on the key
drawings is equivalent to depressing and releasing the actual keys.
As the keyboard mapping changes, so do the glyphs.
You can probably get such an application for Windows, too.
Of course the actual glyphs on the physical keyboard don't change: the
mechanism is fragile enough without putting little screens in each one,
not to mention the ruinous cost. However, virtual ink technologies
(google for "virtual ink" for details) should make even that practical
eventually.
--
"And it was said that ever after, if any man looked in that Stone,
unless he had a great strength of will to turn it to other purpose,
he saw only two aged hands withering in flame."
        --"The Pyre of Denethor"
John Cowan  jcowan@reutershealth.com
www.ccil.org/~cowan  www.reutershealth.com