Re: TECH: UTF-8 and schcompile

From:	Henrik Theiling <theiling@...>
Date:	Monday, April 24, 2006, 1:07

Hi!

Paul Bennett <paul-bennett@...> writes:
> Apparently, it ain't as simple as all that.
>
> I tried it with a UTF-8 sch file.
>
> The error was:
> Error: The first line should read something like '#!...sch...'
>
> so I erased the UTF-8 BOM at the start of the file.
I suspect I could skip the BOM, but the idea is that #! is at the beginning
of the file, of course, just like scripts under Unix.
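
A minimal sketch of what skipping the BOM could look like, assuming the
first line has been read in plain 8-bit mode. The helper name strip_bom and
the sample '#!' path are made up for illustration; this is not schcompile's
actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of schcompile): remove a leading UTF-8 BOM,
# i.e. the three bytes EF BB BF, from a line read in 8-bit mode.
sub strip_bom {
    my ($line) = @_;
    $line =~ s/\A\xEF\xBB\xBF//;
    return $line;
}

# Made-up first line of a .sch file saved with a BOM.
my $first = "\xEF\xBB\xBF#!/usr/bin/sch\n";
my $clean = strip_bom($first);

# With the BOM gone, '#!' is at the very start of the line again.
print $clean =~ /\A#!.*sch/ ? "ok\n" : "not ok\n";
```

A line without a BOM passes through unchanged, so the check could be applied
unconditionally to the first line.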

> I also edited the "open" line to:
> open (F, '<:utf8', "$file") or error "While trying to read '$file': $!";
My idea was that it should work in most cases when read in normal
UTF8-unaware 8-bit mode, provided both the .sch and the input files are
read this way.  UTF8 is unambiguous in matching, so any multichar phoneme
you define should just match a multichar sequence in the input string.
The only problem would be if you used a single . to match one
character -- this would not work, since it would match only part of a
UTF8 sequence.
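
The difference can be sketched in a few lines of Perl. This is only an
illustration with made-up sample data (the letter U+00FE written out as its
two UTF8 bytes), not anything taken from schcompile: a multi-byte phoneme
matches fine as a byte sequence, but a single . only consumes one byte
unless the string is decoded first.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $thorn = "\xC3\xBE";          # the two UTF-8 bytes of U+00FE
my $input = "a" . $thorn . "a";  # 4 bytes, as read in 8-bit mode

# A multi-byte phoneme matches as a literal byte sequence.
my $hit = $input =~ /\Q$thorn\E/ ? 1 : 0;

# In 8-bit mode "." consumes a single byte, i.e. only half of U+00FE.
my ($dot) = $input =~ /a(.)/;
my $dot_len = length $dot;

# So "a, any one character, a" fails on the raw bytes ...
my $bytes_match = $input =~ /\Aa.a\z/ ? 1 : 0;

# ... but succeeds once the bytes are decoded into characters.
my $chars = decode('UTF-8', $input);
my $chars_match = $chars =~ /\Aa.a\z/ ? 1 : 0;

print "$hit $dot_len $bytes_match $chars_match\n";   # prints "1 1 0 1"
```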

Maybe I'll try to quick-fix the BOM problem tomorrow if I find the time.

**Henrik
