
Re: Tech: Unicode (was...)

From:Philippe Caquant <herodote92@...>
Date:Monday, May 3, 2004, 13:37
As far as I know, my mailer understands nothing at
all, and from earlier experience I think it can't
even send a French "e acute" that others could read
correctly; that's why I type without accents. Let's
try it:
This is an "e acute": é
and this, an "e grave": è

Well, I can read it in my window, but I bet half of
the conlangers, if not all of them, won't be able to.
What does Yahoo say? "Your outgoing messages are
currently encoded with a US-ASCII character set. We
hope to add foreign character support in the near
future." Oh, thank you so much, Mr Yahoo.

My idea was that if you send special letters by way of
Ctrl-C-whatever, the code might be retranslated a
dozen times between the source and the destination,
and you will never know what happened in the meantime.
For example, for what you sent (+BBAEEQQSBBM), I read
+BBAEEQQSBBM, and I haven't the faintest idea what it
could mean, nor where a character begins and where it
ends, nor how many letters there are. But if you send
the codes in this way:
1040,1041,1042,1043, or (in hex):
0410,0411,0412,0413
or even:
410,411,412,413
then:
1/ you're sure you won't have any trouble with the
emailer or any other tools, because all of them
understand numbers from 0 to 9 (or to F), plus the
comma (or the blank)
2/ you will understand very easily that Hex-0410 is a
code in itself, and you will easily find the
correspondence if you have a table at hand (and there
is one in Word for XP, that's why I can tell you what
these characters look like)
3/ with a little practice, you might know a good many
of the usual codes by heart
4/ and all of this, without bothering about any UTF-7
or UTF-8 or UTF-7+ADs (???) in the world.
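
The numeric scheme in points 1/ to 4/ can be sketched in a few lines of Python. This is only an illustration and not part of the original exchange: the names to_codes and from_codes are made up here, and the format (comma- or blank-separated numbers, four-digit hex by default) is an assumption taken from the examples above.

```python
def to_codes(text, base=16):
    """Spell out each character as its Unicode code point number."""
    if base == 16:
        return ",".join(f"{ord(ch):04X}" for ch in text)
    return ",".join(str(ord(ch)) for ch in text)


def from_codes(codes, base=16):
    """Rebuild the text from comma- or blank-separated code point numbers."""
    parts = codes.replace(",", " ").split()
    return "".join(chr(int(p, base)) for p in parts)


cyrillic = "\u0410\u0411\u0412\u0413"   # the four letters discussed above
print(to_codes(cyrillic))               # 0410,0411,0412,0413
print(to_codes(cyrillic, 10))           # 1040,1041,1042,1043
# Hex without leading zeros and decimal give the same text back.
print(from_codes("410 411 412 413") == from_codes("1040,1041,1042,1043", 10))  # True
```

The receiver still has to be told whether the numbers are decimal or hex, since "1040" is a valid number in both.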

This is my usual way when I want to communicate about
an ASCII code: I don't care what it looks like on my
screen or on paper or in my emailer, I just tell my
correspondent "it's ASCII Decimal 10" and he
understands it perfectly well, even if he happens to
work in octal on a 19th-century Uzbek keyboard (which
is seldom the case, I must confess).

If there is a rule saying that in Unicode, Hex-413 is
the capital form of the fourth letter of the Cyrillic
alphabet (Ge), then just tell me it's Hex-413 and I'll
be able to read it and redraw it (although it would be
better if a macro could do it for me). If there is no
such rule, then I can't see the point of Unicode at
all.
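
Such a rule does exist: every Unicode code point has a fixed number and an official name. As a rough stand-in for the "macro", here is a minimal Python sketch using the standard unicodedata module; the variable names are only illustrative.

```python
import unicodedata

code = int("413", 16)      # "Hex-413" as written in the message
letter = chr(code)         # the character that number stands for
print(f"U+{code:04X}", unicodedata.name(letter))
# U+0413 CYRILLIC CAPITAL LETTER GHE
```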

--- "Mark J. Reed" <markjreed@...> wrote:
>
> If I want to type Cyrillic, I hit control-C and type the Roman
> transliteration: +ACI-Kuda idyot Ivan Ivanovich+ACI
> +AD0APg +AKsEGgRDBDQEMA +BDgENARRBEI +BBgEMgQwBD0
> +BBgEMgQwBD0EPgQyBDgERwC7. Having to go through
> Character Map or look-up/type in the hex codes for every
> letter would drive me banana nuts.
>
> +AD4 just now), you could send:
> +AD4 1040,1041,1042,1043
>
> I can't even begin to turn those into characters without
> converting into hex first (see above about decimal code
> points). But yes, those are U+-0410, U+-0411, U+-0412, and
> U+-0413, the uppercase versions of the first four letters of
> the Cyrillic alphabet:
>
> +BBAEEQQSBBM
>
> I'm sending this message in UTF-7+ADs if your mailer
> understand UTF-7, then you'll see all the Cyrillic stuff
> automatically. What it actually looks like underneath, and
> what those whose mailers don't understand UTF-7 see, is ugly
> but relatively compact. If your mailer groks UTF-7, the
> following is what the above four-letter sequence looks to
> those whose mailer doesn't. If your mailer doesn't understand
> UTF-7, it should look almost the same as the above except for
> an extra minus sign, which makes the difference between a
> literal plus sign and the start of an encoded sequence.
>
> +-BBAEEQQSBBM
>
> -Mark
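
For anyone whose mailer shows the raw form, the run +BBAEEQQSBBM can be unpacked by hand. In UTF-7 a "+" opens a run of base64 letters (without the usual "=" padding) that encode UTF-16 big-endian, so each character takes 16 bits while each base64 letter carries only 6; that is why character boundaries do not line up with the letters. The Python sketch below is a minimal illustration under those assumptions: decode_utf7_run is a made-up helper that handles one run at a time, not a full UTF-7 decoder.

```python
import base64


def decode_utf7_run(run):
    """Decode a single UTF-7 shifted run such as '+BBAEEQQSBBM'."""
    if not run.startswith("+"):
        raise ValueError("a shifted run starts with '+'")
    payload = run[1:].rstrip("-")      # a '-' may close the run
    if not payload:
        return "+"                     # '+-' stands for a literal plus sign
    # UTF-7 drops the base64 '=' padding, so put it back before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    # The decoded bytes are UTF-16 big-endian: 16 bits per code unit.
    return base64.b64decode(padded).decode("utf-16-be")


text = decode_utf7_run("+BBAEEQQSBBM")
print([f"U+{ord(ch):04X}" for ch in text])
# ['U+0410', 'U+0411', 'U+0412', 'U+0413']

# Cross-check against Python's built-in codec (run closed with '-').
print(text == b"+BBAEEQQSBBM-".decode("utf-7"))          # True
print(decode_utf7_run("+ADs"), decode_utf7_run("+-"))    # ; +
```

The last line shows why "UTF-7+ADs" appears in the raw message: +ADs is simply an encoded semicolon.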

=====
Philippe Caquant

"High thoughts must have high language." (Aristophanes, Frogs)
