
Re: Phonetics

From: John Vertical <johnvertical@...>
Date: Wednesday, March 28, 2007, 20:03
>Unicode specifically defines abstract characters, not glyphs; it's data, not presentation. Roman, Cyrillic, and Greek uppercase A may all look alike, but they are in fact different pieces of data. Their actual appearance is technically irrelevant (although of course the appearance of glyphs was involved at some level in the decisions made about whether and how to include characters).
Agreed, agreed...
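The look-alike capitals mentioned in the quote can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch, assuming Python 3; the variable names are illustrative and not from the thread.

```python
import unicodedata

# Three capitals that usually render alike but are separate characters.
lookalikes = {
    "Latin": "\u0041",
    "Greek": "\u0391",
    "Cyrillic": "\u0410",
}

# Map each script label to the character's formal Unicode name.
names = {script: unicodedata.name(ch) for script, ch in lookalikes.items()}

for script, ch in lookalikes.items():
    print(f"{script}: U+{ord(ch):04X} {names[script]}")

# No two of them compare equal, even after compatibility normalization.
normalized = {unicodedata.normalize("NFKC", ch) for ch in lookalikes.values()}
all_distinct = len(normalized) == 3
```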
>A character is not the same as a code point, however. Even though the sequence U+0061 LATIN SMALL LETTER A followed by U+0304 COMBINING MACRON is distinct codewise from the above, compliant Unicode software is required to treat them as representing the same "character".
And they _are_ supposed to do so, aren't they?
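The equivalence between U+0101 and the sequence U+0061 U+0304 can be checked with canonical normalization. The following is a minimal sketch using Python's standard `unicodedata` module, assuming Python 3; the variable names are illustrative.

```python
import unicodedata

precomposed = "\u0101"   # LATIN SMALL LETTER A WITH MACRON
decomposed = "a\u0304"   # LATIN SMALL LETTER A + COMBINING MACRON

# Compared as raw code point sequences, the two strings differ.
same_code_points = precomposed == decomposed

# After canonical normalization (NFC composes, NFD decomposes) they match.
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(same_code_points, same_nfc, same_nfd)
```

A plain string comparison does not normalize for you, so code that wants to treat the two forms as the same text has to normalize both sides first.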
>But that semantics is defined in the standard -
Hold on - what semantics exactly? (Nevertheless, semantics isn't something that rigorous standards can be applied to... at least with non-technical vocabulary.)
>U+0101 only exists for round-trip compatibility with other character sets where that sequence exists as a single character. It's not based on appearance. If anything, the cause/effect relationship works the other way: the appearance should be the same because the underlying abstract character is the same (although many implementations fail to handle combining characters appropriately, so the appearance is not the same).
This is the part where I disagree. If your abstract characters are not based on their encoding, nor on the actual appearance, then what are they based on? I mean, I don't see a reason to postulate a metaphysical intermediate level here.
> > It completely depends on how you define "alloglyph". "Same Unicode entity" would be circularish logic, and dependent on the font anyway.
>No, it's not. Unicode has nothing to do with fonts!
Exactly what sort of glyph a given piece of code comes out as definitely depends on the font used, so either you misunderstood my point or I have misunderstood yours...
>So I think it's pretty clear that UNICODE most definitely distinguishes between cedilla and comma below.
That it does.
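The distinction shows up directly in the character data. This is a minimal sketch with Python's standard `unicodedata` module, assuming Python 3; the choice of the letter s as the base is only for illustration.

```python
import unicodedata

cedilla = "\u0327"
comma_below = "\u0326"

# The two combining marks carry different formal names.
cedilla_name = unicodedata.name(cedilla)
comma_name = unicodedata.name(comma_below)
print(cedilla_name, "/", comma_name)

# Composed with the same base letter they yield different characters.
s_cedilla = unicodedata.normalize("NFC", "s" + cedilla)
s_comma = unicodedata.normalize("NFC", "s" + comma_below)
print(f"U+{ord(s_cedilla):04X} {unicodedata.name(s_cedilla)}")
print(f"U+{ord(s_comma):04X} {unicodedata.name(s_comma)}")

# Normalization never maps one onto the other.
never_merged = s_cedilla != s_comma
```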
>I thought the question before us concerned the historical validity of making that distinction, not the fact that it is made (enforced politically, even).
>Mark J. Reed
And I got the impression that Philip was suggesting that cedilla and comma below are "the same diacritic" in some manner independent of their encoding, appearance or history...

John Vertical

Reply

Mark J. Reed <markjreed@...>