Re: Complex script editor wish list

From: Jamie Heikkila <ownthenight@...>
Date: Saturday, September 20, 2003, 22:03
I think Unicode would work for such a system... I don't know how
difficult it would be to implement such a system.

In case it isn't possible with Unicode:

Ideographs
Non-left-to-right scripts

-----Original Message-----
From: Herman Miller [mailto:hmiller@IO.COM]
Sent: September 18, 2003 8:42 PM
Subject: Complex script editor wish list

I've been thinking that a configurable text editor with the ability to
use complex scripts would be a good thing to have. Pretty much all the
scripts I've come up with since I started doing my conlanging on the
computer have been fairly simple, but in the pencil and paper days, I
had some fairly complex scripts like the one used for Neesklaaz (as I
spelled it in those days; I've referred to it as Niskloz or Nieskloz
more recently). Vowel marks in the Neesklaaz script are written over the
following consonant, and more than one vowel mark can be written over
the same consonant. Adjacent consonants can combine to form ligatures.
Vowel marks are often attached to a specific point on a consonant, and
multiple vowel marks on the same consonant may need to be specially
positioned or replaced with a ligature.

All those things are easy with pencil and paper, but I was limited to
transliteration when I put the Neesklaaz vocabulary on the computer. I
did have a crude printer font that I used with software I wrote to print
a page in graphics mode, but I rarely used it. And although there are
now editors that can handle specific complex scripts with specific
languages (like Hindi or Arabic), I don't know of any that's
configurable to add new scripts or languages.

So I've been thinking of writing a primitive text editor that can be
configured to handle complex scripts. (I mean réally primitive, on the
level of Notepad. Nothing you'd use for elaborate documentation, but
good enough for playing around with scripts.) I'm trying to get an idea
of what sorts of features would be nice to have in such an application.

Clearly, it has to be able to handle ligatures, diacritic placement, and
the kinds of contextual substitution that come up in complex scripts.
The Kazvarad script has a couple of letters with alternate forms that
are used when a long ascender would otherwise run into a nearby letter.
Script direction is also important; the Twing script historically used
for Nimoryikh is written right-to-left, and the Kazat ?Akkorou and
Yortry scripts are written vertically. It would be really nice to
support mixed script direction in the same document, but that brings up
a whole new set of problems, especially with vertical scripts.
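
As a rough illustration of the ligature and contextual-substitution
requirement above, here is a minimal Python sketch. Everything in it is
an assumption made for the example: the rule tables, the Latin stand-in
letters, and the glyph names such as "k_s" and "d.alt" are hypothetical
and are not taken from Kazvarad or any other script mentioned here. It
uses greedy longest-match substitution for ligatures, followed by a
simple "swap in the alternate form when the next glyph would collide"
pass.

```python
# Hypothetical ligature rules: a sequence of input characters maps to
# one glyph name. Real rules would come from a per-script config file.
LIGATURES = {
    ("k", "s"): "k_s",
    ("s", "t"): "s_t",
    ("k", "s", "t"): "k_s_t",
}

# Hypothetical contextual alternates: a letter with a long ascender gets
# an alternate form when the next glyph is one it would run into.
ALTERNATES = {"d": "d.alt"}
COLLIDING_NEIGHBOURS = {"l", "t"}


def apply_ligatures(text, ligatures=LIGATURES):
    """Greedy longest-match substitution; returns a list of glyph names."""
    max_len = max((len(key) for key in ligatures), default=1)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, down to length 2.
        for n in range(min(max_len, len(text) - i), 1, -1):
            key = tuple(text[i:i + n])
            if key in ligatures:
                glyphs.append(ligatures[key])
                i += n
                break
        else:
            # No ligature starts here: emit the character unchanged.
            glyphs.append(text[i])
            i += 1
    return glyphs


def apply_alternates(glyphs):
    """Replace a glyph with its alternate form when the next glyph collides."""
    out = []
    for i, glyph in enumerate(glyphs):
        nxt = glyphs[i + 1] if i + 1 < len(glyphs) else None
        if glyph in ALTERNATES and nxt in COLLIDING_NEIGHBOURS:
            out.append(ALTERNATES[glyph])
        else:
            out.append(glyph)
    return out
```

With these tables, apply_ligatures("akst") gives ["a", "k_s_t"], and
apply_alternates(["d", "l", "a"]) gives ["d.alt", "l", "a"]. A real
editor would also need positioning data for diacritics, which this
sketch leaves out.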

Real-life scripts like Devanagari have complex reordering rules, where a
short i might need to be moved to the left side of a syllable at the
same time that an initial r- is moved to the end and replaced with a
combining mark above the final consonant. Then you've got scripts like
Oriya and Cambodian, where a single vowel character might need to be
split up into three parts before, above, and at the end of the syllable.
So the text display system at least needs to be able to find the
boundaries of a syllable and move characters around relative to those
boundaries. (Probably one reason the Thai script is encoded differently
is that it's not always possible to find the syllable boundaries without
dictionary lookups.)
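
The reordering described above can be sketched in Python roughly as
follows. This is a simplified illustration, not a full shaping
algorithm: it assumes the syllable boundaries have already been found,
it treats the reph as a placeholder glyph name ("<reph>"), and the
three-part vowel "V3" with its parts is a hypothetical stand-in rather
than an actual Oriya or Khmer character. The Devanagari code points
(U+0915 KA, U+0930 RA, U+093F VOWEL SIGN I, U+094D VIRAMA) are real.

```python
KA = "\u0915"      # DEVANAGARI LETTER KA
RA = "\u0930"      # DEVANAGARI LETTER RA
SIGN_I = "\u093F"  # DEVANAGARI VOWEL SIGN I (drawn left of the consonant)
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA

REPH = "<reph>"    # placeholder name for the mark replacing initial r-

# Marks that are stored after the consonant but drawn before it.
PRE_BASE = {SIGN_I}

# Hypothetical split vowel: one stored character, three displayed parts
# (before, above, and at the end of the syllable).
SPLIT = {"V3": ("V3.pre", "V3.above", "V3.post")}


def reorder_syllable(tokens):
    """Turn one syllable in logical (typed) order into display order."""
    tokens = list(tokens)

    # Initial RA + VIRAMA before another consonant becomes a reph that
    # is moved to the end of the syllable.
    has_reph = len(tokens) > 2 and tokens[0] == RA and tokens[1] == VIRAMA
    if has_reph:
        tokens = tokens[2:]

    pre, core, post = [], [], []
    for token in tokens:
        if token in PRE_BASE:
            pre.append(token)
        elif token in SPLIT:
            before, above, after = SPLIT[token]
            pre.append(before)
            core.append(above)
            post.append(after)
        else:
            core.append(token)

    result = pre + core + post
    if has_reph:
        result.append(REPH)
    return result
```

For example, reorder_syllable([RA, VIRAMA, KA, SIGN_I]) returns
[SIGN_I, KA, REPH]: the short i moves to the left and the initial r-
ends up as a mark at the end. A configurable editor would presumably
load PRE_BASE, SPLIT, and the reph rule from a per-script description
instead of hard-coding them.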

There should also be support for multiple languages using the same
script. For instance, the sequence U+E3CF U+E3CB U+E3E7 should be
displayed as a ligature in Olaetian, but left as separate characters in
Azzian. Even in the Latin script, there are characters that should be
displayed differently depending on language (like {ó}, which has an
acute accent in languages like Spanish, but a slightly different mark in
Polish).
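
One way to picture the per-language behaviour is a separate rule table
for each language that shares a script. The Python sketch below is only
an assumption about how that could look: the table layout, the function
name shape_for_language, and the glyph name "lig_E3CF_E3CB_E3E7" are
made up for the example. Only the three Private Use Area code points and
the two language names come from the paragraph above.

```python
# The three Private Use Area code points named in the text.
SEQUENCE = ("\uE3CF", "\uE3CB", "\uE3E7")

# Hypothetical per-language ligature tables; the glyph name is invented.
LIGATURES_BY_LANGUAGE = {
    "Olaetian": {SEQUENCE: "lig_E3CF_E3CB_E3E7"},
    "Azzian": {},  # same script, but no ligature for this sequence
}


def shape_for_language(text, language):
    """Return glyph names for text, using the rules of one language."""
    rules = LIGATURES_BY_LANGUAGE.get(language, {})
    glyphs = []
    i = 0
    while i < len(text):
        # First matching rule wins; a fuller version would prefer the
        # longest match when several rules overlap.
        for seq, name in rules.items():
            if tuple(text[i:i + len(seq)]) == seq:
                glyphs.append(name)
                i += len(seq)
                break
        else:
            glyphs.append(text[i])
            i += 1
    return glyphs
```

Calling shape_for_language("\uE3CF\uE3CB\uE3E7", "Olaetian") yields the
single ligature glyph, while the same call with "Azzian" yields the
three characters unchanged. This implies the editor has to know the
language of each run of text, not just its script.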

Anything else? I don't know if I'll ever get around to implementing áll
of these features, but it'd be nice to have a general idea of what
script-related features would be useful to have in an editor.