TECH: RFC 1345 (was Re: TECH: Testing again, no new on-topic content (was Re: "Language Creation" in your conlang))

From:Paul Bennett <paul-bennett@...>
Date:Monday, November 17, 2003, 22:48
On Mon, 17 Nov 2003 16:41:22 -0500, Paul Bennett <paul-bennett@...> wrote:

> Latin-2 seems ( to be close to
> ideal for my new forthcoming language
On that page, reference is made to RFC 1345, which defines one-or-more-byte sequences (using bytes 33-126 only) to cover a wide portion of Unicode (Latin, Greek, Cyrillic, Hebrew, Arabic, Japanese, Chinese, plus some other symbols). I.e., it provides a lookup table between printable ASCII multibyte sequences and Unicode code points.

It also shows a format for defining mappings from 8-bit character sets to those multibyte sequences. Notably, it includes a mechanism for dealing with sequences containing 8-bit combining characters (letter plus diacritic sequences). Thus, it creates a two-way mapping between Unicode and the 8-bit format of your choice, via configuration tables made up of printable ASCII characters.

My newest wish is that applications would openly and easily import and export character sets in RFC 1345 format, so that I could create my own 8-bit encoding (or we could create a pan-Conlang-L 8-bit encoding?) that could be imported into the email clients and browsers of interested parties.

John? Anyone else? Do you know of a way this can be done?

Paul
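
To make the two-way mapping idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the five mnemonics are a tiny subset recalled from RFC 1345 and should be checked against the RFC itself, the byte assignments in CONLANG_CHARSET are invented for the example, and the table layout is a simplification, not the RFC's actual charset definition syntax. All function and variable names are hypothetical.

```python
# Small subset of mnemonic -> Unicode code point pairs.
# ASSUMPTION: recalled from RFC 1345; verify against the RFC before relying on them.
MNEMONICS = {
    "a:": 0x00E4,  # LATIN SMALL LETTER A WITH DIAERESIS
    "e'": 0x00E9,  # LATIN SMALL LETTER E WITH ACUTE
    "o/": 0x00F8,  # LATIN SMALL LETTER O WITH STROKE
    "ss": 0x00DF,  # LATIN SMALL LETTER SHARP S
    "a*": 0x03B1,  # GREEK SMALL LETTER ALPHA
}

# HYPOTHETICAL 8-bit encoding: each high byte is assigned a mnemonic.
# The byte values are made up for this example.
CONLANG_CHARSET = {0x80: "a:", 0x81: "e'", 0x82: "o/", 0x83: "ss", 0x84: "a*"}


def build_tables(charset, mnemonics):
    """Build byte -> character and character -> byte tables.

    Bytes 0-127 are treated as plain ASCII; the remaining bytes are
    resolved through the mnemonic table.
    """
    byte_to_char = {b: chr(b) for b in range(128)}
    for byte, mnemonic in charset.items():
        byte_to_char[byte] = chr(mnemonics[mnemonic])
    char_to_byte = {c: b for b, c in byte_to_char.items()}
    return byte_to_char, char_to_byte


def decode(data, byte_to_char):
    """Convert bytes in the 8-bit encoding to a Unicode string."""
    return "".join(byte_to_char[b] for b in data)


def encode(text, char_to_byte):
    """Convert a Unicode string to bytes in the 8-bit encoding."""
    return bytes(char_to_byte[c] for c in text)


BYTE_TO_CHAR, CHAR_TO_BYTE = build_tables(CONLANG_CHARSET, MNEMONICS)
```

With these tables, decode(b"caf\x81", BYTE_TO_CHAR) gives "café" and encode("straße", CHAR_TO_BYTE) gives b"stra\x83e". Bytes or characters outside the tables raise KeyError in this sketch; a real importer would need a policy for those, and for the combining sequences mentioned above.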


