
Re: Saprutum Script

From: John Cowan <cowan@...>
Date: Friday, May 11, 2001, 3:31

kam@CARROT.CLARA.NET scripsit:

> I'd be interested in staking out a bit of Unicode territory for these
> characters, but just what is and isn't a character??
>
> e.g. the combination "LI" (see p.3) could be treated as a "ligature" to
> be coded separately, or as "L" plus a diacritic "I", or as two
> separate characters with very close kerning. I assume there are
> guidelines for this sort of thing.

In general, ligatures should not be encoded separately UNLESS both of
the following apply:

    both the ligatured and unligatured forms are fairly common;

    there is a semantic distinction between ligatured and unligatured
    forms.

If either the ligatured form, or the unligatured form, is fairly rare,
then the Unicode character ZWJ can be inserted to create a ligature,
or ZWNJ to prevent one.

If the difference between the ligatured and unligatured forms is purely
a matter of typographical style, and does not affect mere legibility,
then the distinction can and should be left to markup rather than
Unicode, which is a plain-text standard. An example of this is the
oe-ligature in French, which can always be written as plain "oe" instead.

Finally, if the rules for ligaturing are fully automatic, there is no
need to represent the ligature; it can be left to smart rendering
software. This is the case of Indic ligaturing.

Unicode itself does not always follow these rules, due to the need for
backward compatibility with existing character sets such as MacRoman.

-- 
John Cowan
One art/there is/no less/no more/All things/to do/with sparks/galore
	--Douglas Hofstadter
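
For anyone who wants to try the ZWJ/ZWNJ approach described above, here is
a minimal sketch in Python. It only shows what the plain text looks like;
whether a ligature is actually drawn depends entirely on the font and the
rendering software. The helper names (request_ligature, prevent_ligature,
strip_joiners) are made up for illustration, and "L" and "I" stand in for
the script's own letters, which have no code points. The last lines show
one of the compatibility characters mentioned above, U+FB01, which
Unicode's own NFKC normalization folds back to plain "fi".

```python
import unicodedata

ZWJ = "\u200D"   # ZERO WIDTH JOINER: asks the renderer for a ligature
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: asks the renderer not to ligate


def request_ligature(a: str, b: str) -> str:
    """Hypothetical helper: mark the pair a+b as 'please ligate'."""
    return a + ZWJ + b


def prevent_ligature(a: str, b: str) -> str:
    """Hypothetical helper: mark the pair a+b as 'do not ligate'."""
    return a + ZWNJ + b


def strip_joiners(text: str) -> str:
    """Remove both joiners, e.g. before a naive search or comparison."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


# Placeholder letters: the real script would use its own characters.
joined = request_ligature("L", "I")
split = prevent_ligature("L", "I")

# The joiners are ordinary (invisible) characters in the text stream.
joined_names = [unicodedata.name(c) for c in joined]
split_names = [unicodedata.name(c) for c in split]

# A ligature encoded for backward compatibility: U+FB01 folds to "fi".
fi_ligature = "\uFB01"
folded = unicodedata.normalize("NFKC", fi_ligature)

if __name__ == "__main__":
    print(joined_names)
    print(split_names)
    print(unicodedata.name(fi_ligature), "->", folded)
```

The point of keeping the joiners in the text rather than coding "LI" as
its own character is that searching and sorting can simply ignore them,
as strip_joiners does, while the underlying letters stay the same.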

