Re: The mysterious substitution of question marks (was Re: Elves and Ill Bethisad)

From: Mark J. Reed <markjreed@...>
Date: Tuesday, October 21, 2003, 13:53

On Mon, Oct 20, 2003 at 11:20:15PM -0400, Tristan McLeay wrote:
> Are headers allowed to be in non-ASCII?

No.

> If they are, how can you tell what charset they're in before you've read them?

Exactly the reason they're not allowed to be non-ASCII. (And you can't go by the Content-Type header, because the spec allows headers to appear in any order, so you may not have seen the Content-Type header by the time you're processing one of the others.)

However, there is a special form that lets you encode non-ASCII text in ASCII for use in headers. It looks like this:

    =?charset?encoding?text?=

where "encoding" is an abbreviated form of the name of one of the two common Content-Transfer-Encoding values: Q for quoted-printable, B for base64. And "text" is not allowed to have spaces, so you get the whole =?...?= bit for every word.

For instance, when I sent out a message with the subject of "¿Puedes oír los tambores, Fernando?", what actually got transmitted was this:

    Subject: =?iso-8859-1?Q?=BFPuedes?= =?iso-8859-1?Q?o=EDr?= los tambores, Fernando?

But if all the software on both ends is functioning properly, humans never see that. :)

-Mark
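
For anyone who wants to try the =?charset?encoding?text?= form by hand, here is a minimal sketch using Python's standard email.header module. The helper names decode_subject and encode_subject are made up for this illustration, and the sketch assumes iso-8859-1 as the charset, as in the example above; it is not meant to show what any particular mail program does internally.

```python
from email.header import Header, decode_header, make_header


def decode_subject(raw: str) -> str:
    """Decode any =?charset?encoding?text?= words in a raw header value."""
    # decode_header splits the value into (bytes, charset) chunks;
    # make_header reassembles them into a Header that str() renders as text.
    return str(make_header(decode_header(raw)))


def encode_subject(text: str, charset: str = "iso-8859-1") -> str:
    """Encode text into an ASCII-only header value (hypothetical helper)."""
    return Header(text, charset).encode()


# The two encoded words from the example subject, decoded one at a time.
first_word = decode_subject("=?iso-8859-1?Q?=BFPuedes?=")
second_word = decode_subject("=?iso-8859-1?Q?o=EDr?=")

# Encoding non-ASCII text and decoding it again should give back the original.
encoded = encode_subject("¿Puedes oír")
round_trip = decode_subject(encoded)

print(first_word, second_word)
print(encoded)
print(round_trip)
```

One detail worth knowing when experimenting: RFC 2047 tells decoders to ignore whitespace that separates two adjacent encoded words, so encoders commonly carry the space inside the encoded text (written as "_" in Q encoding) rather than between two =?...?= words.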