Re: Symbols in HTML, was: Boreanesian in the Web

From: Jörg Rhiemeier <joerg.rhiemeier@...>
Date: Thursday, October 25, 2001, 21:45
I wrote:

> > Has anybody experimented with graphic images for letters used inside the
> > usual text?
>
> It works, though it doesn't look good because you can't predict the
> fonts people use in their browsers. In most cases, the special letters
> come out in the wrong font and wrong size (sometimes even the wrong
> colour).
> But it is the only way to display non-standard special characters that
> works with just about any browser.

H. S. Teoh replied:

> Well, I still think the best way is to typeset each line of conlang text
> (or some combination thereof) into a graphic file and use that instead.
> Perhaps typeset individual words into graphic files if you have a way to
> automate this (don't try this by hand unless you really have nothing
> better to do) -- then you can let the browser line-wrap the text for you,
> and it also reduces (hopefully) the amount of graphics that must be
> downloaded by the browser.
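
The per-word approach could be automated roughly as follows. This is only a sketch of the HTML side: it assumes the word images already exist, and the function name, the `glyphs` directory and the file-naming scheme (hex code points joined by hyphens) are made up for illustration.

```python
import html


def words_to_img_tags(text, image_dir="glyphs"):
    """Turn each whitespace-separated word into an <img> tag.

    Assumption: one PNG per word already exists in image_dir, named after
    the hex code points of the word's characters (a hypothetical scheme).
    """
    tags = []
    for word in text.split():
        # Build a file name that is safe for any character, e.g. "ab" -> "0061-0062"
        name = "-".join(f"{ord(ch):04x}" for ch in word)
        # The alt text keeps the page readable when images are not loaded
        alt = html.escape(word, quote=True)
        tags.append(f'<img src="{image_dir}/{name}.png" alt="{alt}">')
    # Separate the tags with spaces so the browser can still wrap lines
    return " ".join(tags)
```

Because every word is its own image, a word that occurs many times is fetched only once, which is where the hoped-for saving in download volume would come from.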

This gives good results (and you even have control over the font),
though graphics always increase the amount of data to be transferred
from the server, and can be slow to load.

I wrote further:

> The best way to present a conlang on the Web still is a transcription
> scheme that confines itself to the ISO Latin-1 characters. If
> everything else fails, use a convention to prefix or suffix diacritics
> HTML cannot handle (e.g. ,s for s-cedille or ^g for
> g-accent-circonflexe). Doesn't look very good, but works.
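
As an illustration of the prefix convention quoted above, here is a minimal sketch that rewrites precomposed letters outside Latin-1 as a prefix mark plus base letter. The mapping table covers only the two examples from the quote (`,` for cedilla, `^` for circumflex); the function name and the choice to raise an error for anything else are assumptions of this sketch, not an established scheme.

```python
import unicodedata

# Combining mark -> ASCII prefix; only the two marks from the examples above
PREFIX = {
    "\u0327": ",",  # combining cedilla
    "\u0302": "^",  # combining circumflex
}


def to_latin1_transcription(text):
    """Rewrite letters outside ISO Latin-1 as prefix + base letter."""
    out = []
    # Compose first, so that "g" + combining circumflex is handled like U+011D
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) < 256:
            # Already representable in Latin-1 (this includes c-cedilla)
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        if ord(base) < 256 and all(m in PREFIX for m in marks):
            out.append("".join(PREFIX[m] for m in marks) + base)
        else:
            raise ValueError(f"no transcription defined for U+{ord(ch):04X}")
    return "".join(out)
```

With this table, s-cedilla (U+015F) becomes `,s` and g-circumflex (U+011D) becomes `^g`, while letters that Latin-1 already provides are passed through unchanged.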

Andreas Johansson replied:

> As long as you don't need more than one diacritic'd (diacriticked?) version
> of every character, you can simply realize any diacritic as an underline. To
> underline a character, place it within the <u>-tag, so a-underline is
> <u>a</u> (should your email programme interpret this as HTML, it's
> "less-than u more-than"). This solution is pretty widespread on the 'net,
> found for instance on, and, while ugly, much less so than
> commas and circumflexes floating around, IMHO (personally, I find
> word-medial commas hugely distracting).
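
The underline trick is easy to apply mechanically. The sketch below assumes a hypothetical source convention in which an underscore before a letter marks it as "diacriticked"; it escapes the text for HTML and then wraps each marked letter in `<u>...</u>`. The underscore convention and the function name are inventions for this example.

```python
import html
import re


def underline_marked(text):
    """Convert "_x" (hypothetical plain-text marking) into "<u>x</u>"."""
    # Escape first so that <, > and & in the source text cannot break the page
    escaped = html.escape(text)
    # Wrap the single word character that follows each underscore
    return re.sub(r"_(\w)", r"<u>\1</u>", escaped)
```

For instance, `underline_marked("w_lkos")` gives `w<u>l</u>kos`, which is one way to mark a syllabic consonant.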

Very true; word-medial commas and the like are quite icky and may cause
ambiguities with actual punctuation, and it is not obvious whether
such free-floating diacritics belong to the preceding or the following
letter. Underlining is indeed a very convenient trick (I used it on
a page about Proto-Indo-European for syllabic consonants, for example).
There is also the <s> (strike-out) tag which places a horizontal line
*through* the enclosed characters, which might prove useful as well.
(I am not sure, though, whether it is supported by every browser.)

Another option is of course to use digits and other non-alphabetic
characters as letters (as most IPA-ASCII schemes do), but while
characters such as `3' or `$' have at least SOME kind of letter-like
quality, they lack the distinction between capital and small letters
and look somewhat alien in the middle of a word, and things such as
SAMPA's infamous `{' just look sick.

To summarize, there is no optimal solution, but several
less-than-perfect hacks that all have their pros and cons.

Jörg.