Re: OT: Question: Unicode
From: Carlos Thompson <chlewey@...>
Date: Sunday, May 18, 2003, 6:42
Roger Mills wrote:
> I've created a web page using MS Word, and Lucida Sans Unicode. In the
> header, MS says "charset-MS 1252" or somesuch. Should this be changed to
> UTF8?
Well, you should say UTF-8 if the text file really is in UTF-8 format, that
is, if characters above ASCII are stored as variable-length byte sequences
(those that look like Ã¡ for an á when read as Latin-1). You should use MS
1252, or better, ISO-8859-1, if you plan to use Latin-1 codes (as in this
e-mail) and HTML numeric entities (those codes that look like "&#8221;") for
Unicode values over 255. You might declare either, or plain ASCII, if you
want to give HTML numeric entities for every non-ASCII character.
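To make the three options concrete, here is a small Python sketch (Python is only my choice for illustration) showing how one a-acute ends up on disk in each case, and why a UTF-8 file shows an A-tilde when read as Latin-1:

```python
# One character, three ways of storing it in an HTML file.
text = "\u00e1"  # a-acute

utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA1
latin1_bytes = text.encode("latin-1")  # one byte: 0xE1
# Plain ASCII: anything non-ASCII becomes a decimal numeric entity.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")

# Reading the UTF-8 bytes as if they were Latin-1 gives the familiar
# "A-tilde plus something else" pair.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes, latin1_bytes, ascii_bytes, misread)
```

Whichever form the file actually contains is the one the charset declaration in the header should name.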
To check, simply type an a-acute (á), save, and open your file with Notepad
or any other text editor of that kind. Look at how that a-acute appears in
the text editor:
If it is an a-acute, then the file is in Latin-1: ISO-8859-1 or MS 1252.
If it is an A-tilde followed by something else, the file is in UTF-8.
If it says &aacute; or &#225; or &#xE1;, then your editor is writing a plain
ASCII file.
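The same check can be done on the raw bytes instead of by eye. This is only a rough sketch: the function name guess_a_acute_encoding is made up for this example, and it assumes the file contains an a-acute you typed as a probe. It is not a general encoding detector.

```python
def guess_a_acute_encoding(data: bytes) -> str:
    """Guess how a file stores an a-acute, given the file's raw bytes."""
    # UTF-8 stores a-acute as the two bytes 0xC3 0xA1.
    if b"\xc3\xa1" in data:
        return "UTF-8"
    # ISO-8859-1 and MS 1252 both store it as the single byte 0xE1.
    if b"\xe1" in data:
        return "Latin-1 (ISO-8859-1 or MS 1252)"
    # Named, decimal, or hexadecimal HTML entities in a pure-ASCII file.
    entities = (b"&aacute;", b"&#225;", b"&#xe1;")
    if any(entity in data.lower() for entity in entities):
        return "plain ASCII with entities"
    return "unknown"


print(guess_a_acute_encoding("<p>\u00e1</p>".encode("utf-8")))
print(guess_a_acute_encoding("<p>\u00e1</p>".encode("latin-1")))
print(guess_a_acute_encoding(b"<p>&#225;</p>"))
```

In practice you would pass it the result of open(path, "rb").read() for the page you saved.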
Now, if your HTML editor can convert between formats, then:
ASCII files with HTML numeric entities are more portable and less browser
dependent. Anyhow, most probably those browsers that accept an HTML numeric
entity for an IPA extension in Lucida Sans Unicode and show it correctly
will also support a different encoding.
Latin-1 (ISO-8859-1) is the ideal if you are writing mostly in Western
European languages. You can still use HTML numeric entities for non-Latin-1
characters.
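As a sketch of that mixed approach (again in Python, purely as an illustration), the standard xmlcharrefreplace error handler keeps Latin-1 characters as single bytes and turns everything else into numeric entities:

```python
# "café" plus one IPA letter (U+025B, open e) that Latin-1 cannot hold.
text = "caf\u00e9 \u025b"

# Characters Latin-1 can represent stay as single bytes; the rest become
# decimal HTML numeric entities.
encoded = text.encode("latin-1", errors="xmlcharrefreplace")

print(encoded)
```

The result is a file that can honestly be declared as ISO-8859-1 and still carries the IPA character.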
Microsoft Latin-1 (MS codepage 1252)... well, it includes some nice
characters in the 128-159 section that ISO Latin-1 leaves without printable
characters, like opening and closing quotation marks and the Euro sign,
but... they are still available in Unicode above 255.
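A quick way to see the difference, sketched in Python with its standard cp1252 and latin-1 codecs: the same three bytes from the 128-159 range decode to printable characters under MS 1252, but only to control codes under ISO-8859-1.

```python
# Three bytes from the 128-159 range.
raw = bytes([0x80, 0x93, 0x94])

as_cp1252 = raw.decode("cp1252")   # Euro sign, opening and closing quotes
as_latin1 = raw.decode("latin-1")  # unprintable control codes U+0080..U+009F

# The Unicode values of the MS 1252 characters are all above 255, so they
# can also be written as HTML numeric entities in a Latin-1 or ASCII file.
code_points = [ord(ch) for ch in as_cp1252]
entities = "".join(f"&#{cp};" for cp in code_points)

print(code_points, entities)
```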
UTF-8 makes shorter files than numeric entities do if you are using lots of
codes not available in Latin-1 or any other ISO-8859 code page. UTF-8 files
are difficult to edit in common text editors (vi, pico, Notepad, WordPad,
etc.), but if you will never touch the HTML file with a text editor, you
should not worry.
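To put a rough number on the size claim, here is a small Python comparison for three IPA letters. The sample string is my own choice for illustration; the exact saving depends on which characters you use.

```python
# Three IPA letters from the U+0250..U+02AF block.
sample = "\u025b\u026a\u028a"

as_utf8 = sample.encode("utf-8")
as_entities = sample.encode("ascii", errors="xmlcharrefreplace")

# Each of these letters takes 2 bytes in UTF-8 but 6 bytes as a decimal
# numeric entity such as &#603;
print(len(as_utf8), len(as_entities))
```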
-- Carlos Th