Re: Unicode vs The Rest Of The World (Again) (was Re: Re: Le tilde a-t-il été utilisé en français?)
From: Mark J. Reed <markjreed@...>
Date: Friday, April 30, 2004, 20:28
On Fri, Apr 30, 2004 at 04:46:48PM -0400, Paul Bennett wrote:
> More worrying than the mere adoption rate is that the List Server itself
> is **severely** broken when it comes to UTF-8 (and presumably any other
> full 8-bit encoding). It takes byte values (inside message bodies, I don't
> know about inside attachments) 128 thru 149 and subtracts 128 from them,
> leaving you with multi-byte UTF sequences that at best point to the wrong
> character and at worst form a broken character that is unprintable.
That would be bad enough if it were just the more typical "8-bit
characters get munged, so you must use a 7-bit encoding method" problem.
But that's not the case. The mail server understands the various MIME
8-to-7-bit transfer encodings, decodes them, and *then* does the
replacement anyway, just as if the message had arrived in 8-bit mode.
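To make the failure concrete, here is a minimal Python sketch of the
behavior described above. It assumes the replacement is exactly
"subtract 128 from bytes 128 through 149", as Paul reports, and it uses
quoted-printable as the example transfer encoding. The names `munge`
and `deliver` are hypothetical; they are not taken from the actual list
software.

```python
import quopri


def munge(data: bytes) -> bytes:
    """Subtract 128 from every byte in 128..149 (the reported behavior)."""
    return bytes(b - 128 if 128 <= b <= 149 else b for b in data)


def deliver(qp_body: bytes) -> bytes:
    """Hypothetical model of the server: undo quoted-printable, then munge."""
    return munge(quopri.decodestring(qp_body))


# "É" is C3 89 in UTF-8, sent here as the quoted-printable text "=C3=89".
received = deliver(b"=C3=89")

try:
    received.decode("utf-8")
    readable = True
except UnicodeDecodeError:
    # 0x89 became 0x09, so the lead byte 0xC3 has lost its continuation byte.
    readable = False
```

Under these assumptions `received` is no longer valid UTF-8, even
though the message traveled as pure 7-bit text.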
However, it doesn't appear to include UTF-7 in its repertoire, so
UTF-7 text passes through undecoded and untouched. If your mail client
can send UTF-7-encoded Unicode, others with UTF-7-capable mail readers
can read your message.
Unfortunately, UTF-7 is not one of the official Unicode encoding
forms; it is specified only in an informational RFC (RFC 2152) and no
standard requires support for it, so finding products that understand
it is hit-or-miss at best.
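As a rough illustration of why UTF-7 slips through, the Python sketch
below encodes a character with Python's built-in "utf-7" codec and
applies the same assumed replacement rule (subtract 128 from bytes 128
through 149). The `munge` function is a hypothetical stand-in for the
server's behavior, not its real code.

```python
def munge(data: bytes) -> bytes:
    """Subtract 128 from every byte in 128..149 (the reported behavior)."""
    return bytes(b - 128 if 128 <= b <= 149 else b for b in data)


text = "É"

# UTF-7 represents non-ASCII characters with ASCII bytes only.
utf7 = text.encode("utf-7")
all_seven_bit = all(b < 128 for b in utf7)

# Nothing falls in the 128..149 range, so the assumed rule changes nothing.
after_server = munge(utf7)
round_trip = after_server.decode("utf-7")
```

Since every byte of the UTF-7 form is below 128, the replacement has
nothing to act on and `round_trip` equals the original text. That only
helps, of course, if the reader's mail client can decode UTF-7.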
-Mark