
Re: Interlinears

From:Tristan McLeay <conlang@...>
Date:Thursday, January 5, 2006, 5:14

Paul Bennett wrote:
> On Wed, 04 Jan 2006 21:01:19 -0500, Tristan McLeay
> <conlang@...> wrote:
>> I don't know if it does better than anything else, though it claims
>> to "support both simple and complex ruby markup"...
>>
>> http://piro.sakura.ne.jp/xul/_rubysupport.html.en
>
> We're still talking about a browser-specific solution, which isn't ideal.
Well, if you're happy with a solution in Word or LaTeX, you could just as easily print off a well-rendered version of the webpage in PDF form, and say "here it is again in PDF if you can't see the formatting properly". This may in fact be your best bet, and if this is all you'd otherwise use LaTeX for I'd probably even recommend it.
> Where's all the outrage from SEA about this?
SEA?
>>> I know LaTeX can do it, but that's one heck of a learning curve
>>> just for one feature, and AFAICT Unicode support is pretty spotty,
>>> as is non-Unicode IPA support.
>>
>> LaTeX can actually do a much better job of Unicode stuff than people
>> give it credit for. If you use the UTF-8 and inputenc packages (along
>> with the right font packages), you can get it to do simple Unicode
>> input (i.e. basic multilingual plane without combining characters);
>> you can even motivate it to do combining characters which I've never
>> tried.
>
> Huh. Well, combining characters are kind of a big deal, for me.
Well, if your input is TeX-with-Unicode, then you can do things like \v{ɪ} to get a háček on top of an IPA small capital I. As I said, you can motivate it to do it even the proper Unicode way---I think the cost is speed & memory use, because TeX wants to know the accent first, then the character (to get a good idea of where to put it), whereas Unicode follows handwriting in putting the character first, then the accent, so TeX always has to read a few characters ahead of where it's typesetting.

(There's also the fact that TeX-with-Unicode is still fundamentally using an 8-bit character set, so that when it sees the UTF-8 for ɪ̌, it sees four (or more) 8-bit characters, and it has to look a fair way ahead to know what it's about to do. This also means you can't do \'ɪ, you have to do \'{ɪ}, which is a bit more typing, but TeX's output of arbitrary accent-letter combinations is **so** much better than anything I've seen in normal environments. It can also do things like having letters as diacritics on top of other letters, which is a bit like ruby, but still different.)
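To make the ordering and byte-count points concrete, here is a rough Python sketch (not anything TeX or inputenc actually does internally). It counts the UTF-8 bytes in ɪ̌ and rewrites Unicode's "character first, then accent" order into TeX's "accent first" order. The names `TEX_ACCENTS`, `utf8_length` and `to_tex_accents` are made up for illustration, and the accent table only covers a handful of standard LaTeX accent commands.

```python
import unicodedata

# Hypothetical lookup table: Unicode combining mark -> LaTeX accent command.
# Only a few common accents are listed; extend as needed.
TEX_ACCENTS = {
    "\u0300": r"\`",  # combining grave
    "\u0301": r"\'",  # combining acute
    "\u0302": r"\^",  # combining circumflex
    "\u0308": r'\"',  # combining diaeresis
    "\u030C": r"\v",  # combining caron (hacek)
}


def utf8_length(text):
    """Number of 8-bit units an 8-bit TeX would see for this text."""
    return len(text.encode("utf-8"))


def to_tex_accents(text):
    """Rewrite base+combining-mark sequences as accent-first TeX commands."""
    # Decompose precomposed letters (e.g. U+00E9) into base + combining mark.
    text = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(text):
        piece = text[i]
        i += 1
        # Look ahead for combining marks that follow the base character.
        while i < len(text) and unicodedata.combining(text[i]):
            command = TEX_ACCENTS.get(text[i])
            if command is None:
                piece += text[i]  # unknown mark: leave it untouched
            else:
                piece = command + "{" + piece + "}"
            i += 1
        out.append(piece)
    return "".join(out)


print(utf8_length("\u026A\u030C"))     # U+026A followed by U+030C
print(to_tex_accents("\u026A\u030C"))
```

The look-ahead loop is the part that corresponds to "reading a few characters in advance": the accent command cannot be emitted until the marks after the base character have been seen.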
>> In any case, having the output in Unicode is actually not completely
>> possible because it uses characters that aren't encoded in Unicode
>> for formatting stuff (things like the ffi ligature, Tengwar fonts,
>> obviously exactly what depends on the set-up).
>
> I thought ffilig was on the Prime Material Plane. Isn't it? I'd swear
> I've seen it, right next to ffllig, in the Alphabetic Presentation
> Forms. I *think* there's a place for Tengwar on Plane 1, too, as I
> recall. There's quite a vocal elf lobby out there, using words like
> "serious linguistic and metalinguistic research" and so on.
Hm, well you seem to be right about the ligatures (except it's in the BMP ;). I thought the Alphabetic Presentation Forms only had the fi, fl ligatures. Still, it doesn't have a ct ligature there. (Maybe in the Prime Material Plane, about which I know nothing.) My character map knows nothing about Tengwar, but that mightn't mean anything.
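For anyone who wants to check this without a character map, a small Python sketch along these lines lists the Latin ligatures at the start of the Alphabetic Presentation Forms block. The range U+FB00 to U+FB06 and the helper name `describe_ligatures` are my own choices for illustration; the names come from Python's bundled `unicodedata` tables.

```python
import unicodedata


def describe_ligatures(start=0xFB00, end=0xFB06):
    """Return (code point label, Unicode name, is-in-BMP) for each character."""
    rows = []
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp))
        in_bmp = cp <= 0xFFFF  # the BMP is plane 0: U+0000..U+FFFF
        rows.append((f"U+{cp:04X}", name, in_bmp))
    return rows


for label, name, in_bmp in describe_ligatures():
    print(label, name, "BMP" if in_bmp else "outside BMP")
```

The list includes ff, fi, fl, ffi and ffl (plus long-s-t and st), all in the BMP, and no ct ligature.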
>> Using something like LyX (a "what you see is what you mean" editor)
>> can also drastically reduce the learning curve---but you'll probably
>> still need to learn something for the interlinear packages, because I
>> doubt LyX would know how to format them. (The final formatting's done
>> by LaTeX so it'll always come out the way it's meant to; it's the
>> as-you-go formatting that I'm talking about here.)
>
> If I get desperate enough, I'll try LyX again. ISTR trying it, and
> failing to install it properly or something, and running away full of
> fear and loathing. The plan, should it manifest itself, will be to
> produce WYSIWYG docs, and read the "raw" document source to try and
> help me understand some of the tutorials online.
Maybe LyX is only comprehensible to people who already know LaTeX, or who have someone who does at their disposal... :) Wouldn't surprise me, but I have started a few people with it!

--
Tristan.
