Complex script editor wish list

From: Herman Miller <hmiller@...>
Date: Friday, September 19, 2003, 0:39
I've been thinking that a configurable text editor with the ability to use
complex scripts would be a good thing to have. Pretty much all the scripts
I've come up with since I started doing my conlanging on the computer have
been fairly simple, but in the pencil and paper days, I had some fairly
complex scripts like the one used for Neesklaaz (as I spelled it in those
days; I've referred to it as Niskloz or Nieskloz more recently). Vowel
marks in the Neesklaaz script are written over the following consonant, and
more than one vowel mark can be written over the same consonant. Adjacent
consonants can combine to form ligatures. Vowel marks are often attached to
a specific point on a consonant, and multiple vowel marks on the same
consonant may need to be specially positioned or replaced with a ligature.

All those things are easy with pencil and paper, but I was limited to
transliteration when I put the Neesklaaz vocabulary on the computer. I did
have a crude printer font that I used with software I wrote to print a page
in graphics mode, but I rarely used it. And although there are now editors
that can handle specific complex scripts for specific languages (like
Hindi or Arabic), I don't know of any that can be configured to support
new scripts or languages.

So I've been thinking of writing a primitive text editor that can be
configured to handle complex scripts. (I mean *really* primitive, on the
level of Notepad. Nothing you'd use for elaborate documentation, but good
enough for playing around with scripts.) I'm trying to get an idea of what
sorts of features would be nice to have in such an application.

Clearly, it has to be able to handle ligatures, diacritic placement, and
the kinds of contextual substitution that come up in complex scripts. The
Kazvarad script has a couple of letters with alternate forms that are used
when a long ascender would otherwise run into a nearby letter. Script
direction is also important; the Twing script historically used for
Nimoryikh is written right-to-left, and the Kazat ?Akkorou and Yortry
scripts are written vertically. It would be really nice to support mixed
script direction in the same document, but that brings up a whole new set
of problems, especially with vertical scripts.
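
A minimal sketch of the contextual-substitution idea, in Python. The glyph
names (tall_k, short_k, hook_r, loop_m) are made up for illustration and are
not actual Kazvarad letters, and "nearby" is simplified here to mean "the
immediately following letter".

```python
# Hypothetical glyph names; the actual Kazvarad letters are not given here.
ALTERNATES = {"tall_k": "short_k"}               # letter -> alternate form
COLLIDES_WITH = {"tall_k": {"hook_r", "loop_m"}}  # neighbours that would clash


def apply_alternates(glyphs):
    """Swap in an alternate form when the next glyph would collide."""
    out = []
    for i, glyph in enumerate(glyphs):
        nxt = glyphs[i + 1] if i + 1 < len(glyphs) else None
        if glyph in ALTERNATES and nxt in COLLIDES_WITH.get(glyph, ()):
            out.append(ALTERNATES[glyph])
        else:
            out.append(glyph)
    return out
```

Keeping the rules in plain tables like these is one way an editor could let
the user configure a new script without changing the code.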

Real-life scripts like Devanagari have complex reordering rules, where a
short i might need to be moved to the left side of a syllable at the same
time that an initial r- is moved to the end and replaced with a combining
mark above the final consonant. Then you've got scripts like Oriya and
Cambodian, where a single vowel character might need to be split up into
three parts before, above, and at the end of the syllable. So the text
display system at least needs to be able to find the boundaries of a
syllable and move characters around relative to those boundaries. (Probably
one reason the Thai script is encoded differently is that it's not always
possible to find the syllable boundaries without dictionary lookups.)
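
The Devanagari case above could be sketched roughly as follows, in Python.
This is a deliberately simplified model and not a full implementation of the
script's rendering rules: a syllable is assumed to be a consonant cluster
(U+0915..U+0939 joined by virama) plus at most one vowel sign, and "<reph>"
is a placeholder token for the combining mark that replaces an initial r-.

```python
import re

RA, VIRAMA, SIGN_I = "\u0930", "\u094D", "\u093F"
CONSONANT = "[\u0915-\u0939]"
# Simplified syllable: (consonant + virama)* consonant [vowel sign]
SYLLABLE = re.compile(f"(?:{CONSONANT}{VIRAMA})*{CONSONANT}[\u093E-\u094C]?")


def visual_order(syllable):
    """Reorder one syllable from logical order to display order."""
    chars = list(syllable)
    # An initial RA + virama before another consonant becomes a mark at the end.
    reph = len(chars) >= 3 and chars[0] == RA and chars[1] == VIRAMA
    if reph:
        chars = chars[2:]
    pre = []
    if chars[-1] == SIGN_I:       # short i is drawn to the left of the cluster
        pre = [chars.pop()]
    return pre + chars + (["<reph>"] if reph else [])


def display_glyphs(text):
    """Find syllable boundaries and reorder within each syllable."""
    out, pos = [], 0
    for m in SYLLABLE.finditer(text):
        out.extend(text[pos:m.start()])   # pass other characters through
        out.extend(visual_order(m.group()))
        pos = m.end()
    out.extend(text[pos:])
    return out
```

For the logical sequence RA, virama, KA, short i, display_glyphs returns the
short i first, then KA, then the "<reph>" placeholder.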

There should also be support for multiple languages using the same script.
For instance, the sequence U+E3CF U+E3CB U+E3E7 should be displayed as a
ligature in Olaetian, but left as separate characters in Azzian. Even in
the Latin script, there are characters that should be displayed differently
depending on language (like {ó}, which has an acute accent in languages
like Spanish, but a slightly different mark in Polish).
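
One way to picture per-language behaviour is a separate ligature table for
each language, sketched here in Python. The table layout, the language keys,
and the "<lig:...>" token are assumptions for illustration; only the code
point sequence comes from the example above.

```python
# Per-language ligature tables (hypothetical layout).
LIGATURES = {
    "olaetian": {"\uE3CF\uE3CB\uE3E7": "<lig:E3CF+E3CB+E3E7>"},
    "azzian": {},   # same script, but no ligature for this sequence
}


def shape(text, language):
    """Greedy longest-match ligature substitution for one language."""
    table = LIGATURES.get(language, {})
    lengths = sorted({len(seq) for seq in table}, reverse=True)
    out, i = [], 0
    while i < len(text):
        for n in lengths:
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # No ligature starts here; emit the character unchanged.
            out.append(text[i])
            i += 1
    return out
```

With these tables, shape("\uE3CF\uE3CB\uE3E7", "olaetian") yields a single
ligature token, while the same text shaped as "azzian" stays three separate
characters.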

Anything else? I don't know if I'll ever get around to implementing *all* of
these features, but it'd be nice to have a general idea of what script-
related features would be useful to have in an editor.
