From: Philip Newton <philip.newton@...>
Date: Tuesday, March 27, 2007, 7:38
On 3/23/07, Benct Philip Jonsson <conlang@...> wrote:
> * Ḑḑ U+1E10 U+1E11 LATIN ... LETTER D WITH CEDILLA
> * Ģģ U+0122 U+0123 LATIN ... LETTER G WITH CEDILLA
> * Ķķ U+0136 U+0137 LATIN ... LETTER K WITH CEDILLA
> * Ļļ U+013B U+013C LATIN ... LETTER L WITH CEDILLA
> * Ņņ U+0145 U+0146 LATIN ... LETTER N WITH CEDILLA
> * Șș U+0218 U+0219 LATIN ... LETTER S WITH COMMA BELOW
> * Țț U+021A U+021B LATIN ... LETTER T WITH COMMA BELOW
> Unicode doesn't distinguish very clearly between cedilla and
> comma below. The canonical shape used in Latvian and
> Romanian is comma, while the Turkish is cedilla. The
> confusing names are a holdover from a time when one thought
> the Turkish and Romanian forms could be considered variants
> of one another.
Why shouldn't they be? Are they not in complementary distribution (one
in Turkish, the other in Latvian and Romanian), for starters?
And besides, the actual shape of cedilla can vary... I've seen an
Albanian write the ç in her name so that it looked like a lower-case c
with an inverted hacek (or a circumflex) below, for example.
At any rate, I've always taken the variants with comma below and with
cedilla below to be a glyph issue: alloglyphs of the same diacritic...
a bit like the apostrophe-after vs. caron-above issue with letters Dd
and Tt (cf. Ďď, Ťť).
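To make the comparison concrete, here is a minimal Python sketch of how current Unicode data encodes the two cases. It uses only the standard unicodedata module; the helper name describe is my own, purely for illustration. It suggests that the comma-below and cedilla letters decompose to different combining marks, whereas d and t with caron decompose to COMBINING CARON regardless of the apostrophe-like glyph:

```python
import unicodedata

def describe(ch):
    """Return (character name, names of the combining marks in its NFD form)."""
    nfd = unicodedata.normalize("NFD", ch)
    marks = [unicodedata.name(c) for c in nfd if unicodedata.combining(c)]
    return unicodedata.name(ch), marks

# g-cedilla, s-cedilla, s-comma-below, t-cedilla, t-comma-below, d-caron, t-caron
for ch in "\u0123\u015F\u0219\u0163\u021B\u010F\u0165":
    name, marks = describe(ch)
    print(f"U+{ord(ch):04X} {name}: {', '.join(marks)}")
```

Because U+0219 and U+015F end in different combining marks (U+0326 vs. U+0327), normalization keeps them apart, which is not the case for the two renderings of the caron.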
Can you say why you think they cannot be considered glyph variants of
the same abstract diacritic?
I've yet to see a convincing argument on the issue; the most I've seen
is "that looks wrong" or "the shape is unacceptable" (which sounds to
me as if they have the wrong font, but not like an argument for
separating the two).
Compare also ó, where the accent "should" have a different slope
depending on whether you're writing Spanish or Polish (see
http://www.twardoch.com/download/polishhowto/kreska.html , for
example, where "acute" and "kreska" are contrasted).
Is this case not similar? The same abstract character with two
different, language-specific, glyph realisations?
Or would you disunify Polish kreska from acute the way you (appear to
want to) disunify s-with-comma-below from s-with-cedilla?
Or are the two cases not parallel?
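For what it's worth, the ó case can be checked the same way. The following is only a sketch with two sample words of my own choosing and a hypothetical helper decompose; it shows that Polish and Spanish text share the single code point U+00F3, so any difference in slope has to come from the font or from language tagging outside the plain text:

```python
import unicodedata

def decompose(ch):
    """List the code points of ch after canonical decomposition (NFD)."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}"
            for c in unicodedata.normalize("NFD", ch)]

polish = "m\u00F3wi\u0107"   # Polish "mówić" (sample word, my assumption)
spanish = "habl\u00F3"       # Spanish "habló" (sample word, my assumption)

# The only character the two words have in common is o-acute.
shared = sorted(set(polish) & set(spanish))
print([f"U+{ord(c):04X}" for c in shared])
print(decompose("\u00F3"))
```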
I don't want to sound as if I'm attacking you; I think I'm honestly
trying to understand why it is not acceptable to treat "comma below"
and "cedilla" as glyph variants, the way Unicode used to treat them.
Philip Newton <philip.newton@...>