Re: Phonetics
From: John Vertical <johnvertical@...>
Date: Wednesday, March 28, 2007, 20:03
>Unicode specifically defines abstract characters, not glyphs; it's
>data, not presentation. Roman, Cyrillic, and Greek uppercase A may all
>look alike, but they are in fact different pieces of data. Their
>actual appearance is technically irrelevant (although of course the
>appearance of glyphs was involved at some level in the decisions made
>about whether and how to include characters).
Agreed, agreed...
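
(To make the point concrete, here is a minimal sketch using Python's standard unicodedata module. The variable names are only illustrative. It shows that the three look-alike capitals are separate code points with separate character names, whatever a font makes of them.)

```python
import unicodedata

# Three capitals that most fonts draw almost identically:
# Latin A, Greek Alpha, Cyrillic A.
lookalikes = ["\u0041", "\u0391", "\u0410"]

# Look up the official character name of each code point.
names = [unicodedata.name(ch) for ch in lookalikes]
for ch, name in zip(lookalikes, names):
    print(f"U+{ord(ch):04X} {name}")

# As data, no two of them compare equal.
all_distinct = len(set(lookalikes)) == 3
```
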
>A character is not the same as a code point, however. Even though the
>sequence U+0061 LATIN SMALL LETTER A followed by U+0304 COMBINING
>MACRON is distinct codewise from the above, compliant Unicode software
>is required to treat them as representing the same "character".
And they _are_ supposed to do so, aren't they?
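
(A rough sketch of what that equivalence looks like in practice, again assuming Python's unicodedata module; the variable names are mine. The two spellings differ as raw code point sequences but compare equal once both are put through the same normalization form.)

```python
import unicodedata

precomposed = "\u0101"    # LATIN SMALL LETTER A WITH MACRON
decomposed = "a\u0304"    # LATIN SMALL LETTER A + COMBINING MACRON

# As raw code point sequences the two strings are different.
raw_equal = precomposed == decomposed

# After normalizing both to the same form they are identical.
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed))
nfd_equal = (unicodedata.normalize("NFD", precomposed)
             == unicodedata.normalize("NFD", decomposed))

# NFC composes to one code point, NFD decomposes to two.
nfc_len = len(unicodedata.normalize("NFC", decomposed))
nfd_len = len(unicodedata.normalize("NFD", precomposed))
```

Note that the plain string comparison does not normalize by itself here; the equivalence only shows up after an explicit normalization step.
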
>But that semantics is defined in the standard -
Hold on - what semantics exactly?
(Then again, semantics isn't something a rigorous standard can really pin
down... at least not with non-technical vocabulary.)
>U+0101 only exists for round-trip compatibility with other
>character sets where that sequence exists as a single character.
>It's not based on appearance. If anything, the cause/effect
>relationship works the other way: the appearance should be the
>same because the underlying abstract character is the same
>(although many implementations fail to handle combining
>characters appropriately, so the appearance is not the same).
This is the part where I disagree. If your abstract characters are not based
on their encoding, nor on their actual appearance, then what are they based
on? I mean, I don't see a reason to postulate a metaphysical intermediate
level here.
> > It completely depends on how do you define "alloglyph". "Same Unicode
> > entity" would be circularish logic, and dependant on the font anyway.
>
>No, it's not. Unicode has nothing to do with fonts!
Exactly which glyph a given piece of code comes out as definitely depends on
the font used, so either you misunderstood my point or I have misunderstood
yours...
>So I think it's pretty clear that UNICODE most definitely
>distinguishes between cedilla and comma below.
That it does.
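
(For what it's worth, the encoding-level distinction is easy to check. The sketch below assumes Python's unicodedata module and uses s with cedilla and s with comma below as sample letters; the variable names are illustrative. The two letters decompose to different combining marks, and normalization never merges them.)

```python
import unicodedata

s_cedilla = "\u015F"   # LATIN SMALL LETTER S WITH CEDILLA
s_comma = "\u0219"     # LATIN SMALL LETTER S WITH COMMA BELOW

# Canonical decomposition: base letter followed by a combining mark.
decomp_cedilla = unicodedata.normalize("NFD", s_cedilla)
decomp_comma = unicodedata.normalize("NFD", s_comma)

# Names of the combining marks each letter decomposes to.
marks = [unicodedata.name(decomp_cedilla[1]),
         unicodedata.name(decomp_comma[1])]

# Even the compatibility form NFKC keeps the two letters apart.
still_distinct = (unicodedata.normalize("NFKC", s_cedilla)
                  != unicodedata.normalize("NFKC", s_comma))
```

This only demonstrates that the standard keeps the two apart; it says nothing either way about whether the distinction is historically justified.
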
>I thought the question before us concerned the historical
>validity of making that distinction, not the fact that it is made
>(enforced politically, even).
>Mark J. Reed
And I got the impression that Philip was suggesting that cedilla and comma
below are "the same diacritic" in some manner independent of their encoding,
appearance or history...
John Vertical