
Re: New listserv, better unicode? (was Re: META five posts a day limit)

From: Tim May <butsuri@...>
Date: Wednesday, June 29, 2005, 21:53
Stephen Mulraney wrote at 2005-06-29 21:26:29 (+0100)
 > On 6/29/05, taliesin the storyteller <taliesin-conlang@...>
 > wrote:
 > > * Stephen Mulraney said on 2005-06-29 21:22:06 +0200
 > > >
 > > > Incidentally, I notice that the listserv page has a new interface,
 > > > and the interface seems to be a core part of the listserv
 > > > software itself (judging by how the name of the interface is given
 > > > as just 'Listserv 14.4'). So presumably the listserv software
 > > > has been given an upgrade - and my first thought is 'is the
 > > > eating of certain unicode characters fixed?'. Can anyone recall
 > > > which ones were eaten? Ones encoded with 0xa0, something like
 > > > that?
 > >
 > > Certainly worth testing, see attached file.
 >
 >
 > Well, it works for me, at least in thunderbird. Viewing the
 > attachment from within gmail in firefox, for me anyway, doesn't
 > work, though.
 >
 > But the question is: are any of those nice symbols ones that used
 > to fail? IIRC, many symbols came through alright (on a correctly
 > configured system), but a number did not.
 >
 > So, some further tests:
 > First of all, your test repeated, this time in the message body (I
 > don't know if that might change anything...). If it doesn't come
 > through, it'll show (my, anyway) gmail isn't passing it properly
 > onto the listserv.
 >
 > Macrons:	ā ē ī ō ū
 > Hachek:      š ž ǧ ǰ ǎ
 > Various IPA: θ ð ɥ ʏ ɪ ɛ ʉ ɫ kʲ tʰ t͡b a˨ e˧ i˦ ↓ | ‖ ɓ ǃ ǀ ǂ ɑ
 > Other:	москжа さくら にっぽん ελλενικι    æøå þð ŋ む
 >
 >
 > Now.. Quoting Mark J. Reed from 7th of May '04:
 >
 > > The listserv software used on listserv.brown.edu, for whatever
 > > reason, strips the high bit off bytes in the decimal range
 > > 128-160, EVEN IF THEY ARE ENCODED AS QUOTED-PRINTABLE.  Or
 > > base64.  You send a message with, say, Cyrillic yeru, U+044B.  It
 > > is UTF-8 encoded and then QP-encoded, the result being =D1=8B.
 > > The listserv software turns it into =D1=0B, which is an illegal
 > > UTF-8 sequence, so the list recipients get gobbledygook.
 >
 > So, let's try some Cyrillic with a yeru: Язык  [ja-z-y-k].
 > Yitzik also mentioned somewhere that Georgian was mangled, so let's try that:
 > ხ	4334	10EE	GEORGIAN LETTER XAN
 > ჯ	4335	10EF	GEORGIAN LETTER JHAN
 > ჰ	4336	10F0	GEORGIAN LETTER HAE
 > Now, let's see...
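
For anyone who wants to check which characters the old bug would have hit, the behaviour Mark describes in the quote above can be sketched in a few lines of Python. This is only a model of his description, not the listserv's actual code: the function names are made up for illustration, and it assumes that "128-160" is inclusive at both ends.

```python
def strip_high_bit(data: bytes) -> bytes:
    # Model of the reported bug: clear bit 7 of every byte in 128..160.
    # Treating the range as inclusive is an assumption.
    return bytes(b & 0x7F if 128 <= b <= 160 else b for b in data)


def as_qp(data: bytes) -> str:
    # Show every byte in =XX form, the way the quoted example writes it.
    return "".join(f"={b:02X}" for b in data)


def is_affected(text: str) -> bool:
    # True if the modelled bug would change the UTF-8 bytes of `text`.
    encoded = text.encode("utf-8")
    return strip_high_bit(encoded) != encoded


yeru = "\u044b"                      # Cyrillic yeru
original = yeru.encode("utf-8")
mangled = strip_high_bit(original)
print(as_qp(original))               # =D1=8B
print(as_qp(mangled))                # =D1=0B

try:
    mangled.decode("utf-8")
except UnicodeDecodeError as err:
    print("no longer valid UTF-8:", err)

# Ya, yeru, and the three Georgian letters from the test above.
for ch in "\u042f\u044b\u10ee\u10ef\u10f0":
    status = "affected" if is_affected(ch) else "ok"
    print(f"U+{ord(ch):04X}", as_qp(ch.encode("utf-8")), status)
```

By this model the yeru and all three Georgian letters above (which share the byte 0x83) would have been damaged, while Я (D0 AF) would have passed through untouched.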

Both Georgian and Cyrillic above come through fine for me.  Let's see
if I can send... (If not, it's most probably an error at my end -
specifying an encoding involves a somewhat laborious workaround)

ქართული

ესე ამბავი სპარსული, ქართულად
ნათარგმანები,
ვით მარგალიტი ობოლი, ხელის-ხელ
საგოგმანები,
ვპოვე და ლექსად გარდავთქვი, საქმე
ვქმენ საჭოჭმანები,
ჩემგან ხელქმნელმან დამმართოს,
ლაღმან და ლამაზმა ნები.
