TECH: RFC 1345 (was Re: TECH: Testing again, no new on-topic content (was Re: "Language Creation" in your conlang))
From: Paul Bennett <paul-bennett@...>
Date: Monday, November 17, 2003, 22:48
On Mon, 17 Nov 2003 16:41:22 -0500, Paul Bennett <paul-bennett@...>
wrote:
In that page, reference is made to RFC 1345, which defines one-or-more-byte
sequences -- using bytes 33-126 only -- to cover a wide portion of Unicode
(Latin, Greek, Cyrillic, Hebrew, Arabic, Japanese, Chinese, plus some other
symbols). I.e., it provides a lookup table between printable ASCII
multibyte sequences and Unicode code points. Also, it shows a format for
defining 8-bit character set mappings to those multibyte sets. Notably, it
includes a mechanism for dealing with sequences containing 8-bit combining
characters (letter plus diacritic sequences).
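To make the lookup-table idea concrete, here is a rough Python sketch. It assumes a tiny hand-picked sample of two-character mnemonics (the full list lives in the RFC itself and should be checked there), and the function names are my own invention, not anything the RFC defines.

```python
# A small, hand-picked sample of two-character mnemonics. A real table
# would be loaded from the RFC's full list, not typed in by hand.
MNEMONIC_TO_CODEPOINT = {
    "a:": 0x00E4,  # LATIN SMALL LETTER A WITH DIAERESIS
    "e'": 0x00E9,  # LATIN SMALL LETTER E WITH ACUTE
    "n?": 0x00F1,  # LATIN SMALL LETTER N WITH TILDE
    "a*": 0x03B1,  # GREEK SMALL LETTER ALPHA
    "A=": 0x0410,  # CYRILLIC CAPITAL LETTER A
}

# Reverse direction: Unicode code point -> mnemonic.
CODEPOINT_TO_MNEMONIC = {cp: m for m, cp in MNEMONIC_TO_CODEPOINT.items()}


def mnemonic_to_char(mnemonic: str) -> str:
    """Return the Unicode character named by a mnemonic (hypothetical helper)."""
    return chr(MNEMONIC_TO_CODEPOINT[mnemonic])


def char_to_mnemonic(char: str) -> str:
    """Return the printable-ASCII mnemonic for a character (hypothetical helper)."""
    return CODEPOINT_TO_MNEMONIC[ord(char)]


def is_printable_ascii(mnemonic: str) -> bool:
    """Check that every character of the mnemonic lies in bytes 33-126."""
    return all(33 <= ord(c) <= 126 for c in mnemonic)
```

Because the table is just a dictionary, inverting it gives the other direction of the mapping for free, as long as no two mnemonics name the same code point.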
Thus, it creates a two-way mapping between Unicode and the 8-bit format of
your choice, via configuration tables made up of printable ASCII
characters.
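As a sketch of how such a two-way mapping might work in practice, the Python below builds a codec from two tables: mnemonic to code point, and byte value to mnemonic. Everything here is an assumption for illustration: the charset name CONLANG8, its byte assignments, and the helper names are made up, and the dictionary layout is not the RFC's own charset-definition syntax.

```python
# Sample mnemonic table (a tiny subset, typed by hand for illustration).
MNEMONICS = {
    "a:": 0x00E4,  # LATIN SMALL LETTER A WITH DIAERESIS
    "e'": 0x00E9,  # LATIN SMALL LETTER E WITH ACUTE
    "a*": 0x03B1,  # GREEK SMALL LETTER ALPHA
}

# Hypothetical 8-bit charset: high byte value -> mnemonic.
# Bytes below 0x80 are assumed to be plain ASCII.
CONLANG8 = {0x80: "a:", 0x81: "e'", 0x82: "a*"}


def build_tables(charset, mnemonics):
    """Derive byte->char and char->byte tables from the two mappings."""
    decode_table = {byte: chr(mnemonics[m]) for byte, m in charset.items()}
    encode_table = {char: byte for byte, char in decode_table.items()}
    return decode_table, encode_table


def decode(data: bytes, decode_table) -> str:
    """8-bit bytes -> Unicode string; raises KeyError on unmapped high bytes."""
    return "".join(chr(b) if b < 0x80 else decode_table[b] for b in data)


def encode(text: str, encode_table) -> bytes:
    """Unicode string -> 8-bit bytes; raises KeyError on unmapped characters."""
    return bytes(ord(c) if ord(c) < 0x80 else encode_table[c] for c in text)


DECODE_TABLE, ENCODE_TABLE = build_tables(CONLANG8, MNEMONICS)
```

With tables like these, `encode("café", ENCODE_TABLE)` and `decode(...)` round-trip through the made-up encoding; swapping in a different byte-to-mnemonic table gives a different 8-bit encoding without touching the code.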
My newest wish is that applications would openly and easily import and
export character sets in RFC 1345 format, so that I could create my own
8-bit encoding (or we could create a pan-Conlang-L 8-bit encoding?) that
could be imported into the email clients and browsers of interested parties.
John? Anyone else? Do you know of a way this can be done?
Paul