From: Mark J. Reed <markjreed@...>
Date: Monday, April 2, 2007, 14:43
I think you are overcomplicating what is really a simple concept.

Forget Unicode. When English speakers say the alphabet has 26 letters, what
are they counting? Not glyphs. Certainly not bit patterns. Those are just
ways of representing the letters, not the letters themselves.
You are, I think, interpreting "glyph" too broadly. In Unicode terms, a
glyph is A SINGLE PARTICULAR GRAPHICAL REPRESENTATION OF A CHARACTER. There
are zillions of glyphs for Latin small letter a, but it is still one
character. If you put a macron over it, that's still one character, even
though there are different Unicode code points you could use to construct it.
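A quick sketch of that last point in Python: "ā" can be stored either as one
precomposed code point or as a base letter plus a combining mark, and NFC
normalization shows both spellings name the same character.

```python
import unicodedata

# One precomposed code point: U+0101 LATIN SMALL LETTER A WITH MACRON
precomposed = "\u0101"
# Two code points: 'a' followed by U+0304 COMBINING MACRON
combining = "a\u0304"

print(len(precomposed))  # 1 code point
print(len(combining))    # 2 code points

# NFC normalization composes the pair into the single code point.
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
```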
At the bottom are the bits - that's the encoding.

Decode the bits and you get a sequence of Unicode code points (or "scalar
values").

Parse the combining characters, surrogate characters, etc. and you get a
sequence of "absters".
If you're rendering the text visually, you next pick a font, rendering
algorithm, etc. and apply it to the absters, and you get glyphs.
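The layers up through "absters" can be walked through in a few lines of
Python. This is only a sketch: the grouping below attaches each combining
mark to the preceding base character, which is a simplification of the full
grapheme-cluster rules in UAX #29, and "abster" is the coinage from this
thread, not standard terminology.

```python
import unicodedata

raw = b"pa\xcc\x84l"          # the bits: UTF-8 encoding of "pāl"
text = raw.decode("utf-8")    # code points: 'p', 'a', U+0304, 'l'

# Group each base character with its trailing combining marks.
absters = []
for ch in text:
    if absters and unicodedata.combining(ch):
        absters[-1] += ch     # attach mark to previous base character
    else:
        absters.append(ch)

print(len(raw))      # 5 bytes
print(len(text))     # 4 code points
print(len(absters))  # 3 abstract characters: 'p', 'ā', 'l'
```

Rendering would then map those three abstract characters onto glyphs from
whatever font you picked.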