Conlang: Re: Phonetics (John Vertical, Mar 28 '07, 20:03)

From:	John Vertical <johnvertical@...>
Date:	Wednesday, March 28, 2007, 20:03

>Unicode specifically defines abstract characters, not glyphs; it's data, not presentation. Roman, Cyrillic, and Greek uppercase A may all look alike, but they are in fact different pieces of data. Their actual appearance is technically irrelevant (although of course the appearance of glyphs was involved at some level in the decisions made about whether and how to include characters).

Agreed, agreed...
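
A minimal sketch of the point about look-alike capitals, assuming Python 3 and its standard unicodedata module. The names LOOKALIKES and describe are made up for illustration and are not from the thread:

```python
import unicodedata

# Three capital letters that many fonts draw with the same shape.
LOOKALIKES = ["\u0041", "\u0410", "\u0391"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in LOOKALIKES:
        # Each line shows a different code point and a different name,
        # whatever the glyphs happen to look like in a given font.
        print(describe(ch))
```

The three strings compare unequal in code even where a font renders them the same, which is the "data, not presentation" distinction in practice.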

>A character is not the same as a code point, however. Even though the sequence U+0061 LATIN SMALL LETTER A followed by U+0304 COMBINING MACRON is distinct codewise from the above, compliant Unicode software is required to treat them as representing the same "character".

And they _are_ supposed to do so, aren't they?
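
As a rough illustration of that equivalence, here is a sketch assuming Python 3's standard unicodedata module. The constant names and the canonically_equal helper are hypothetical, and comparing NFD forms is just one way to test canonical equivalence:

```python
import unicodedata

PRECOMPOSED = "\u0101"  # LATIN SMALL LETTER A WITH MACRON
COMBINING = "a\u0304"   # LATIN SMALL LETTER A + COMBINING MACRON


def canonically_equal(s: str, t: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


if __name__ == "__main__":
    # The raw code point sequences differ...
    print(PRECOMPOSED == COMBINING)
    # ...but they are the same text once normalized.
    print(canonically_equal(PRECOMPOSED, COMBINING))
```

Normalizing the two-code-point sequence with NFC gives back the single precomposed code point, and NFD goes the other way.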

>But that semantics is defined in the standard -

Hold on - what semantics exactly? (In any case, semantics isn't something a rigorous standard can pin down... at least not for non-technical vocabulary.)

>U+0101 only exists for round-trip compatibility with other character sets where that sequence exists as a single character. It's not based on appearance. If anything, the cause/effect relationship works the other way: the appearance should be the same because the underlying abstract character is the same (although many implementations fail to handle combining characters appropriately, so the appearance is not the same).

This is the part where I disagree. If your abstract characters are based neither on their encoding nor on their actual appearance, then what are they based on? I mean, I don't see a reason to postulate a metaphysical intermediate level here.

> > It completely depends on how do you define "alloglyph". "Same Unicode entity" would be circularish logic, and dependant on the font anyway.
>
>No, it's not. Unicode has nothing to do with fonts!

Which exact glyph a given piece of code comes out as definitely depends on the font used, so either you misunderstood my point or I have misunderstood yours...

>So I think it's pretty clear that UNICODE most definitely distinguishes between cedilla and comma below.

That it does.
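
A short sketch of how that distinction shows up in the character data, again assuming Python 3's unicodedata module. The letter s is picked only as an example base, and the combining_marks helper is hypothetical:

```python
import unicodedata

S_CEDILLA = "\u015f"  # LATIN SMALL LETTER S WITH CEDILLA
S_COMMA = "\u0219"    # LATIN SMALL LETTER S WITH COMMA BELOW


def combining_marks(ch: str) -> list[str]:
    """Names of the combining marks in the canonical decomposition of ch."""
    decomposed = unicodedata.normalize("NFD", ch)
    return [unicodedata.name(c) for c in decomposed if unicodedata.combining(c)]


if __name__ == "__main__":
    print(combining_marks(S_CEDILLA))
    print(combining_marks(S_COMMA))
```

The two letters decompose to the same base plus different combining marks (U+0327 versus U+0326), so no normalization form makes them compare equal.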

>I thought the question before us concerned the historical validity of making that distinction, not the fact that it is made (enforced politically, even).

>Mark J. Reed

And I got the impression that Philip was suggesting that cedilla and comma below are "the same diacritic" in some manner independent of their encoding, appearance or history...

John Vertical
