
Re: TECH: UTF-8 and schcompile

From: Henrik Theiling <theiling@...>
Date: Monday, April 24, 2006, 1:07
Hi!

Paul Bennett <paul-bennett@...> writes:
> Apparently, it ain't as simple as all that.
>
> I tried it with a UTF-8 sch file.
>
> The error was:
> Error: The first line should read something like '#!...sch...'
>
> so I erased the UTF-8 BOM at the start of the file.

I suspect I could make it skip the BOM, but the idea is that #! comes at the very beginning of the file, of course, just like in scripts under Unix.
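
For illustration, skipping a leading BOM before the #! check could look roughly like the sketch below. It is written in Python rather than in the Perl of schcompile, and it is not the actual schcompile code: the function names, the sample path, and the exact rule for what counts as a valid first line are assumptions based only on the error message quoted above.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def looks_like_sch_header(data: bytes) -> bool:
    """Assumed check: the first line starts with '#!' and mentions 'sch'.

    This only mirrors the wording of the error message
    ('#!...sch...'); the real test in schcompile may differ.
    """
    first_line = strip_utf8_bom(data).split(b"\n", 1)[0]
    return first_line.startswith(b"#!") and b"sch" in first_line


# Hypothetical file contents: a BOM followed by a made-up interpreter path.
with_bom = codecs.BOM_UTF8 + b"#!/usr/local/bin/schcompile\n"
without_bom = b"#!/usr/local/bin/schcompile\n"

print(looks_like_sch_header(with_bom))
print(looks_like_sch_header(without_bom))
```

Note that a kernel executing the file directly would still see the BOM before the #!, so this only helps when the file is passed to the compiler as an argument.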

> I also edited the "open" line to:
> open (F, '<:utf8', "$file") or error "While trying to read '$file': $!";

My idea was that it should work in most cases when read in normal, UTF-8-unaware 8-bit mode, provided both the .sch file and the input files are read this way. UTF-8 is unambiguous in matching, so any multichar phoneme you define should just match a multichar sequence in the input string. The only problem would be if you used a single . to match one character: this would not work, since it would match only part of a UTF-8 sequence.

Maybe I'll try to quick-fix the BOM problem tomorrow if I find the time.

**Henrik
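
The byte-mode behaviour described above can be checked with a small sketch. Again this is Python for illustration, not schcompile itself; the helper names are made up, and the sample character (thorn, U+00FE, encoded in UTF-8 as the two bytes C3 BE) is just an arbitrary non-ASCII letter.

```python
import re


def literal_match_in_byte_mode(pattern_text: str, input_text: str) -> bool:
    """Search for a literal phoneme with both sides treated as raw UTF-8 bytes."""
    pattern = re.escape(pattern_text.encode("utf-8"))
    return re.search(pattern, input_text.encode("utf-8")) is not None


def dot_in_byte_mode(input_text: str) -> bytes:
    """Return what a single '.' matches at the start when matching on bytes."""
    return re.match(b".", input_text.encode("utf-8")).group()


def dot_in_char_mode(input_text: str) -> str:
    """Return what a single '.' matches at the start when matching on characters."""
    return re.match(".", input_text).group()


word = "\u00fea"  # thorn followed by 'a'

# A literal multi-byte phoneme still matches when everything is bytes.
print(literal_match_in_byte_mode("\u00fe", word))

# In byte mode '.' consumes only the first byte (C3) of the two-byte sequence.
print(dot_in_byte_mode(word))

# In character mode '.' consumes the whole character.
print(dot_in_char_mode(word))
```

So literal phoneme definitions survive 8-bit reading, while a rule relying on . to mean "one character" would need the files decoded as UTF-8 on both sides.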
