
Re: Unicode vs The Rest Of The World (Again) (was Re: Re: Le tilde a-t-il été utilisé en français?)

From: Mark J. Reed <markjreed@...>
Date: Friday, April 30, 2004, 20:28
On Fri, Apr 30, 2004 at 04:46:48PM -0400, Paul Bennett wrote:
> More worrying than the mere adoption rate is that the List Server itself
> is **severely** broken when it comes to UTF-8 (and presumably any other
> full 8-bit encoding). It takes byte values (inside message bodies, I don't
> know about inside attachments) 128 thru 149 and subtracts 128 from them,
> leaving you with multi-byte UTF sequences that at best point to the wrong
> character and at worst form a broken character that is unprintable.

Which would be bad enough if it were just the more typical "8-bit characters get munged; you must use 7-bit encoding methods" problem. But that's not the case. The mail server understands the various MIME 8-to-7-bit encoding techniques, reverses them, and *then* does the replacement anyway just as if the message arrived in 8-bit mode.

However, it doesn't appear to include UTF-7 in its repertoire, so if your mail client can send UTF-7-encoded Unicode, others with UTF-7-capable Unicode mail readers can read your message. Unfortunately, UTF-7 is not an official UTF; it's not really supported by anyone or required by any standard, so finding products that understand it is haphazard at best.

-Mark
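
For anyone who wants to see the effect for themselves, here is a rough Python sketch of the behaviour as Paul describes it. The `munge` function is a made-up stand-in, not the list server's actual code; it only assumes that bytes 128 thru 149 have 128 subtracted from them. It shows a UTF-8 sequence getting broken while the UTF-7 form of the same text, being pure 7-bit, passes through untouched.

```python
def munge(data: bytes) -> bytes:
    """Hypothetical model of the list server: subtract 128 from bytes 128..149."""
    return bytes(b - 128 if 128 <= b <= 149 else b for b in data)


text = "É"  # U+00C9, encoded in UTF-8 as the two bytes C3 89

# UTF-8: the continuation byte 0x89 (137) falls in the affected range.
utf8 = text.encode("utf-8")
broken = munge(utf8)
print("UTF-8 before:", utf8.hex(" "))
print("UTF-8 after: ", broken.hex(" "))
# The lead byte C3 is no longer followed by a valid continuation byte,
# so a strict decoder rejects the sequence; "replace" shows the damage.
print("decoded:     ", repr(broken.decode("utf-8", errors="replace")))

# UTF-7: every byte is below 128, so munge() has nothing to change.
utf7 = text.encode("utf-7")
survived = munge(utf7)
print("UTF-7 bytes: ", utf7)
print("round trip:  ", survived.decode("utf-7"))
```

Note that a character such as "é" (C3 A9 in UTF-8) would not be touched under this model, since neither byte is in the 128 thru 149 range, which may explain why some accented text appears to get through while other text does not.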

Replies

Paul Bennett <paul-bennett@...>
Garth Wallace <gwalla@...>