
Re: Tech: Unicode (was...)

From: John Cowan <cowan@...>
Date: Saturday, May 8, 2004, 13:32
Philippe Caquant scripsit:

> Looks like I missed something again. I had just understood that Unicode used two bytes for encoding a single character, which gives us 65,536 possibilities.
That's a simplified view of the true situation.
> So how can there be 90,000 Unicode characters? Do you mean that the same code can be equivalent to different glyphs? Or that some complementary system is used (a 3rd byte? a 4th byte?) Or that some glyphs are just different styles for the same character? I'm very confused again.
Note: all numbers in this posting are hexadecimal.

Unicode assigns characters to integers in the range 0 to 10FFFF. The characters for all modern scripts are in the 16-bit range, 0 to FFFF. The range 10000 to 1FFFF is used for obsolete scripts, and the range 20000 to 3FFFF for obsolete, obscure, and non-standard Chinese characters. F0000-10FFFF is for private use only, and the rest is reserved and unassigned.

There are several ways to encode Unicode. The most straightforward is just to use 32 bits per character, but this is expensive so it is rarely done.

The UTF-16 representation represents the characters from 0 to FFFF with a single 16-bit word, and all other characters with two 16-bit words drawn from the reserved ranges D800-DBFF to encode the high-order bits and DC00-DFFF to encode the low-order bits.

The UTF-8 representation uses 1 byte for ASCII, 2 bytes for 80 to 7FF, 3 bytes for 800 to FFFF, and 4 bytes for everything else. The encoding is very cleverly done, but I omit details here.
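
For anyone who wants to see the arithmetic described above spelled out, here is a minimal Python sketch. The function names utf16_units and utf8_length are made up for this illustration and are not part of any library. One detail the description leaves implicit is that 10000 is subtracted from the code point before it is split into two 10-bit halves for the surrogate pair.

```python
def _check(cp):
    # Valid code points are 0..10FFFF, excluding the surrogate range
    # D800..DFFF, which is reserved for UTF-16 and never encodes a character.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point: %X" % cp)


def utf16_units(cp):
    """Return the UTF-16 code units (16-bit words) for code point cp."""
    _check(cp)
    if cp <= 0xFFFF:
        return [cp]                      # one 16-bit word
    v = cp - 0x10000                     # 20 bits are left to encode
    high = 0xD800 | (v >> 10)            # high-order 10 bits, D800..DBFF
    low = 0xDC00 | (v & 0x3FF)           # low-order 10 bits, DC00..DFFF
    return [high, low]


def utf8_length(cp):
    """Return how many bytes UTF-8 uses for code point cp."""
    _check(cp)
    if cp <= 0x7F:
        return 1                         # ASCII
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3
    return 4                             # everything above FFFF
```

For example, the code point 1D11E comes out as the word pair D834 DD1E in UTF-16 and takes 4 bytes in UTF-8, which agrees with what Python's own chr(0x1D11E).encode("utf-16-be") and .encode("utf-8") produce.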
> Too bad. I'll do better next time. I wonder how these virtual keyboard looked like? Why can't we find them? Only technical and cost problems, or other reasons?
Macs support them today. A virtual keyboard looks like a drawing of a physical keyboard, and it's just a window. Clicking on the key drawings is equivalent to depressing and releasing the actual keys. As the keyboard mapping changes, so do the glyphs. You can probably get such an application for Windows, too. Of course the actual glyphs on the physical keyboard don't change: the mechanism is fragile enough without putting little screens in each one, not to mention the ruinous cost. However, virtual ink technologies (google for "virtual ink" for details) should make even that practical eventually.

-- 
John Cowan  jcowan@reutershealth.com  www.ccil.org/~cowan  www.reutershealth.com
"And it was said that ever after, if any man looked in that Stone, unless he had a great strength of will to turn it to other purpose, he saw only two aged hands withering in flame." --"The Pyre of Denethor"
