
Re: Um...help with unicode?

From: John Cowan <jcowan@...>
Date: Monday, November 4, 2002, 3:51
Mat McVeagh scripsit:

> Encoding systems are originally based around character sets (i.e. new
> encoding systems were devised to cover character sets that previous ones
> didn't). But with Unicode the idea is to have one single encoding system
> that covers all character sets - laudable.
If by "character set" you mean "set of characters", then yes. But "character set" has a different meaning in the jargon, almost synonymous with what you call an encoding system. The usual term for "set of characters" is "character repertoire". I point this out not out of pedantry, but to help avoid confusion.
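To make the distinction concrete, here is a minimal Python sketch (an illustration added here, not part of the original exchange). The three-character repertoire and the helper name `encode_all` are made up for the example; the codec names are standard Python ones. It shows that a single repertoire can be written out as different bytes under different character sets, and that some character sets cannot represent parts of it at all.

```python
# A tiny character repertoire, given as Unicode code points:
# U+0065 (e), U+00E9 (e with acute), U+03B1 (Greek small alpha).
repertoire = ["\u0065", "\u00e9", "\u03b1"]

# Character sets (in the jargon sense) to try, by Python codec name.
CHARSETS = ("latin-1", "iso8859-7", "utf-8", "utf-16-be")


def encode_all(ch):
    """Return {codec name: bytes} for ch, with None where ch is unencodable."""
    result = {}
    for name in CHARSETS:
        try:
            result[name] = ch.encode(name)
        except UnicodeEncodeError:
            # This character set's repertoire does not include ch.
            result[name] = None
    return result


for ch in repertoire:
    print(f"U+{ord(ch):04X}", encode_all(ch))
```

Under these codecs, the acute-accented e has a byte in Latin-1 but none in ISO 8859-7, and the alpha is the other way round; the two Unicode encodings cover all three.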
> Secondly, suppose you plump for Unicode as I now am doing. (I am planning to
> be writing in languages with lots of different accents, IPA, and it would be
> nice to do e.g. Greek. I don't want to have to switch between encoding
> systems or character sets. I don't really know how to.) That doesn't mean
> you can just type or read everything. Oh no. You have to have special fonts
> installed. All the old fonts are useless.
Not at all. TrueType, OpenType, and AAT fonts all work fine; they have tables mapping Unicode code points to glyphs. Bitmapped fonts won't work.
> And, seemingly, there are not fonts yet for all areas of Unicode.
There are no single fonts (because of size limitations) that cover all of Unicode. There are several fonts that cover large stretches of it. There are many smaller fonts that can be used cooperatively (e.g. by Mozilla or Internet Exploder) as if they were a single font.
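The cooperative use of several fonts can be sketched roughly as follows. This is only an illustration of the idea of font fallback, not how Mozilla or any other browser actually implements it; the font names and their coverage ranges are hypothetical.

```python
# Hypothetical fonts and the code point ranges each one covers.
# These are illustrative values, not the coverage of any real font.
FONTS = [
    ("LatinFont", [(0x0020, 0x024F)]),
    ("IPAFont", [(0x0250, 0x02AF)]),
    ("GreekFont", [(0x0370, 0x03FF)]),
]


def pick_font(ch):
    """Return the first font whose coverage includes ch, or None."""
    cp = ord(ch)
    for name, ranges in FONTS:
        if any(lo <= cp <= hi for lo, hi in ranges):
            return name
    return None


def font_runs(text):
    """Split text into (font, substring) runs, one run per change of font."""
    runs = []
    for ch in text:
        font = pick_font(ch)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Latin "a", IPA schwa (U+0259), Greek alpha (U+03B1): three runs.
print(font_runs("a\u0259\u03b1"))
```

A character that no installed font covers comes back with `None` as its font, which is the case where a reader sees a box or question mark.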
> Next... you get Unicode up, you've got the fonts installed... now how do you
> type the characters? You need a special 'keyboard'. I.e. a protocol for
> interpreting keystrokes on what physical keyboard you have as characters.
This is totally operating-system dependent. In general, you can get Microsoft keyboard maps from MS and Apple ones from Apple. More complex writing systems require "input methods", which are specialized programs for doing input, because they have too many characters to map onto a tolerable physical keyboard.
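As a rough picture of what a keyboard map does, here is a toy dead-key layout in Python. It is a sketch under stated assumptions: the `DEAD_KEYS` table and the function `type_keys` are invented for this example and do not correspond to any real Microsoft or Apple keyboard map. The only library call is `unicodedata.normalize`, which is used to combine a base letter with an accent when a precomposed character exists.

```python
import unicodedata

# Hypothetical dead keys: the key typed, mapped to a combining accent.
DEAD_KEYS = {
    "'": "\u0301",  # combining acute
    "`": "\u0300",  # combining grave
    '"': "\u0308",  # combining diaeresis
    "~": "\u0303",  # combining tilde
    "^": "\u0302",  # combining circumflex
}


def type_keys(keys):
    """Turn a sequence of keystrokes into text, applying dead keys."""
    out = []
    pending = None  # the dead key waiting for its base letter, if any
    for key in keys:
        if pending is not None:
            composed = unicodedata.normalize("NFC", key + DEAD_KEYS[pending])
            if len(composed) == 1:
                out.append(composed)       # accent and letter combined
            else:
                out.append(pending + key)  # no such letter: keep both keys
            pending = None
        elif key in DEAD_KEYS:
            pending = key
        else:
            out.append(key)
    if pending is not None:
        out.append(pending)                # dead key typed last: emit as-is
    return "".join(out)


print(type_keys("caf'e"))
print(type_keys("ma~nana"))
```

An input method for a larger writing system does the same job in principle, but with candidate lists and multi-keystroke lookups in place of a small fixed table.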
> OK. You have Unicode, relevant fonts to display your chosen character sets
> with, relevant keyboards to type the characters with nice and easy. Now...
> where do you type them? Any old where? NO! You cannot do Unicode at all with
> Notepad.
Notepad for NT 4.0, Win2K, and WinXP definitely does operate on Unicode plaintext files.
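For anyone who wants to check what such a file contains, the following Python sketch may help. It assumes that the file starts with a byte order mark (BOM), as files saved with Notepad's "Unicode" option on those systems do (UTF-16, little-endian, with a BOM). The helper name `sniff_bom` is made up for this example, and it deliberately ignores UTF-32.

```python
import codecs


def sniff_bom(data):
    """Guess a Python codec name from a leading BOM, or None if there is none.

    UTF-32 is ignored here; its little-endian BOM begins with the same two
    bytes as the UTF-16 one, so this sketch would misreport such a file.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM itself and picks the byte order.
        return "utf-16"
    return None


# Simulate a file saved as little-endian UTF-16 with a BOM: "αβ".
sample = codecs.BOM_UTF16_LE + "\u03b1\u03b2".encode("utf-16-le")

encoding = sniff_bom(sample)
text = sample.decode(encoding)
print(encoding, [f"U+{ord(c):04X}" for c in text])
```

Decoding with the codec that `sniff_bom` returns strips the BOM, so the resulting string holds only the two Greek letters.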
> Let's suppose you have found a way to compose neatly in Unicode, and can do
> textfiles, word-processed documents, webpages, typing in different fonts and
> character sets in the same piece, and hence can mix ordinary text with your
> accented conlang and phonetic transcriptions. Now... will your browser show
> it properly? Will it handle Unicode properly? Or all the relevant fonts?
Version 6 and up browsers definitely will.
> And will your readers, to whom you have sent your masterpiece, or who are
> browsing your site?
It depends on what fonts they have.
> Can anyone help???
Not to blow my own horn or anything, but I recommend my presentation at
http://www.ccil.org/~cowan/uamb.ppt.

--
Even a refrigerator can conform to the XML Infoset, as long as it has a
door sticker saying "No information items inside". --Eve Maler
John Cowan  jcowan@reutershealth.com
http://www.reutershealth.com  http://www.ccil.org/~cowan
