
Character sets

From: David G. Durand <dgd@...>
Date: Tuesday, January 22, 2002, 18:52
I saw a recent request for "official" commentary on character set
problems. I have never tried to censor people on this list, or to
mandate (as opposed to request) any form of discussion -- I see
myself as a facilitator, not a ruler. So there is no "policy" on
character sets.

However, the following facts are relevant:

Email started out as a 7-bit US-ASCII medium. Later on, the
standards were amended to add headers that declare character
encodings other than US-ASCII. Many email readers ignore these
headers, leading to display failures. Some email programs set the
headers wrong, leading to display failures on the computers of people
whose mail reading programs are configured correctly. Sometimes, but
not always, senders and readers who both have buggy mail software
will have no problems, because their character sets happen to match
and the incorrect software at both ends doesn't get in the way.
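
To make the four cases above concrete, here is a rough sketch in
Python. The display() function is a made-up stand-in for a mail
reader, not the behavior of any particular program, and the sample
word and charset names are only assumptions chosen for illustration.

```python
# Hypothetical model of a mail reader; not taken from any real mail program.
def display(body: bytes, declared: str, honors_header: bool, assumed: str) -> str:
    """Decode a message body the way a reader might show it."""
    charset = declared if honors_header else assumed
    # errors="replace" mimics the placeholder characters some readers show.
    return body.decode(charset, errors="replace")

text = "café"
body = text.encode("iso-8859-1")  # 8-bit body: b"caf\xe9"

# Correct sender, correct reader: the header is set and honored.
ok = display(body, declared="iso-8859-1", honors_header=True, assumed="us-ascii")

# Correct sender, but the reader ignores the header and assumes US-ASCII.
ignored = display(body, declared="iso-8859-1", honors_header=False, assumed="us-ascii")

# Sender labels the Latin-1 body as US-ASCII; a correct reader trusts the label.
mislabeled = display(body, declared="us-ascii", honors_header=True, assumed="us-ascii")

# Both ends buggy, but the reader happens to assume the sender's real charset.
lucky = display(body, declared="us-ascii", honors_header=False, assumed="iso-8859-1")

print(ok, ignored, mislabeled, lucky)
```

Only the first and last cases show the accented letter intact; the
last one works by luck rather than by correctness.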

Finally, even for people with compatible mailers, there may be other
problems, because conlang creates digests. These digests do not have
individual character encoding specifications for each message, so
even correctly sent messages may look corrupt on a correct client,
because of the loss of original message headers in digest creation.
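
The digest problem can be sketched the same way. This is only an
illustration under assumed conditions: two made-up messages, one sent
as Latin-1 and one as KOI8-R, joined into a single body that a reader
then decodes with one charset. It does not reproduce how the list
software actually builds digests.

```python
# Two hypothetical postings, each correct under its own declared charset.
messages = [
    ("iso-8859-1", "café"),
    ("koi8-r", "привет"),
]

# A digest keeps the raw bytes but drops the per-message charset headers.
digest = b"\n".join(text.encode(charset) for charset, text in messages)

# The reader has to pick a single charset for the whole digest.
shown = digest.decode("iso-8859-1").split("\n")

for (charset, original), seen in zip(messages, shown):
    status = "intact" if seen == original else "corrupted"
    print(f"{charset}: {status}")
```

Whichever charset the reader picks, at most one of the two messages
comes out right, even though both senders did everything correctly.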

In the end, postings in 8-bit character codes will certainly fail for
some readers, even if you stick to Latin-1.

This problem isn't going to be solved by everyone changing their
software. People don't like to change their mail programs, and
suggestions that they should do so just irritate them. There are
many tradeoffs of computer platform, interface, and character encoding
in such a choice, and one person's best solution may be another's
worst solution.

   -- David


Christophe Grandsire <christophe.grandsire@...>