Re: Vallian (was: How to minimize "words")
From: Henrik Theiling <theiling@...>
Date: Monday, February 26, 2007, 12:02
Hi!
T. A. McLeay writes:
> On 26/02/07, Philip Newton <philip.newton@...> wrote:
> ...
> > *nods* and Unicode also caters for that -- as best I remember, it has
> > different "weights" for various kinds of diacritics specifying which
> > ones should be closer to the base character, so diacritics should
> > stack in the correct order regardless of which order you encode them
> > in (e.g. a + combining diaeresis + combining tilde should "do the
> > right thing", at least according to Unicode, though I don't remember
> > which order they think is the correct one in this case -- combinations
> > of letter + diacritic above + diacritic below or vice versa should
> > definitely work fine, though, if your rendering engine's up to it).
>
> No:
> a + combining diaeresis + combining tilde has the tilde atop the diaeresis
> a + tilde + diaeresis has the diaeresis atop the tilde.
> Who is Unicode to say you can't have a fronted nasalised vowel as
> easily as you can have a nasalised fronted one?
>...
Or perhaps Philip was thinking of the canonical ordering algorithm,
which assigns each diacritic a value (its canonical combining class)
by which the diacritics must be sorted to make a Unicode sequence
canonical.
However, these numbers only distinguish diacritic 'attachment points'
that do *not* interfere with each other. Diacritics that attach at the
same point have the same sort value, so their order is significant
(for stacking) and must not be changed by the sorting algorithm.
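To make the same-attachment-point case concrete, here is a minimal
sketch in Python using the standard unicodedata module. The variable
names are mine, and I am using NFD normalization as a stand-in for
"apply canonical ordering"; it only illustrates the point and is not
a rendering test.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS
TILDE = "\u0303"      # COMBINING TILDE

# Both marks attach above the base letter, so they share one sort value.
classes = (unicodedata.combining(DIAERESIS), unicodedata.combining(TILDE))

diaeresis_first = "a" + DIAERESIS + TILDE
tilde_first = "a" + TILDE + DIAERESIS

# Canonical ordering must not swap marks with equal sort values, so the
# two sequences remain different strings after normalization.
same_after_nfd = (
    unicodedata.normalize("NFD", diaeresis_first)
    == unicodedata.normalize("NFD", tilde_first)
)

print(classes, same_after_nfd)
```

So the two orders from the quoted example stay distinct, which is what
allows them to stack differently.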
E.g. for an 'a' with an acute above + dot below, the canonical
encoding order is well-defined. Logically it does not matter whether
you add the acute above or the dot below first, as the two diacritics
do not interfere. The algorithm simply defines how such a character
should be encoded so that e.g. comparison is easier.
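The acute + dot below case can be checked the same way. Again a
minimal sketch in Python with the standard unicodedata module, with
variable names of my own choosing and NFD assumed as the way to obtain
the canonical order:

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, attaches above
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, attaches below

acute_first = "a" + ACUTE + DOT_BELOW
dot_first = "a" + DOT_BELOW + ACUTE

# The marks have different sort values, so canonical ordering is free to
# rearrange them into one fixed order.
nfd_acute_first = unicodedata.normalize("NFD", acute_first)
nfd_dot_first = unicodedata.normalize("NFD", dot_first)

# Raw comparison fails, comparison after normalization succeeds.
print(acute_first == dot_first, nfd_acute_first == nfd_dot_first)
```

Both inputs end up as the same sequence, with the dot below sorted
before the acute because its sort value is lower.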
**Henrik