Re: OT: CXS chart and machine-readable Unicode->CXS mappings

From: Henrik Theiling <theiling@...>
Date: Tuesday, March 9, 2004, 17:48

Hi!

Mark wrote:
> I've wrapped a module around the Perl so that you can simply do
>
> use CXS;
>...

Oh, nice! I incorporated the code into my conversion script and renamed the resulting file to CXS.pm. This way, the page will always contain the newest bug-fixed or otherwise updated version as a Perl module. I also made the C code more usable by wrapping a 'module' around it (or what C thinks a module is).

Your script has one problem, though, but that is actually due to the data: the hash table cannot easily be reversed because some Unicode characters are mapped to the same CXS. This is mainly due to my inclusion of the modifier letters *and* the combining versions of the accents. The combining versions should be preferred, unless the script sees a diacritic without something to attach to, in which case the isolated form should be returned. This is tricky. An easier way would be to ignore the modifier letters if there is a combining version.

I will try to fix that by providing the module with more information about which entries should be considered primary.

Bye,
Henrik
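
The "primary entry" idea described above can be sketched roughly as follows. This is only an illustration, written in Python rather than the module's Perl. The three table entries, the `is_primary` flag, and the name `build_reverse` are assumptions made for the example and are not taken from the actual CXS data or module.

```python
# Illustrative entries only: (Unicode code point, CXS string, is_primary).
# The real table may contain different mappings; is_primary is a
# hypothetical flag marking which entry should win when reversing.
TABLE = [
    (0x02DC, "~", False),  # SMALL TILDE (spacing form)
    (0x0303, "~", True),   # COMBINING TILDE (preferred when reversing)
    (0x0251, "A", True),   # LATIN SMALL LETTER ALPHA
]


def build_reverse(table):
    """Build a CXS -> Unicode mapping that prefers primary entries."""
    reverse = {}
    for codepoint, cxs, is_primary in table:
        # A primary entry always overwrites; a non-primary entry is
        # only used when nothing else has claimed this CXS string yet.
        if is_primary or cxs not in reverse:
            reverse[cxs] = chr(codepoint)
    return reverse


REVERSE = build_reverse(TABLE)
```

With such a flag in the data, the reverse table no longer depends on the order of the entries. It does not cover the harder case of a diacritic with nothing to attach to, where the spacing form would have to be chosen by looking at the surrounding text.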