
Re: TECH: UTF-8 and schcompile

From: Benct Philip Jonsson <bpj@...>
Date: Monday, April 24, 2006, 13:41
Benct Philip Jonsson wrote:
> Paul Bennett wrote:
>> NOTE to Windows users: Notepad, Wordpad, and every standard Windows
>> tool I have tried all fail to show the BOM or any sign of its
>> existence. While this is technically correct behavior, it's not very
>> helpful in this case. I used Cygwin VIM to "repair" the file, but
>> there's a learning curve associated with it, and you have to carry
>> out the repair steps at every iteration.
>
> Can't you use a Perl s/// statement to remove the BOM? I would imagine
> it is just a matter of knowing what it looks like in the encoding you
> are using, and remove it if it appears at the beginning of the file?
> See <http://www.unicode.org/faq/utf_bom.html#25>
>
>> How have y'all managed to produce and use UTF-8 sch files in Windows?

This Perl script does the trick:

    open(IN,'<:utf8', "$ARGV[0]");
    $/ = undef;
    $thetext = <IN>;
    $/ = "\n";
    close(IN);
    # You can read from and write to the same file,
    # if you are daring.
    $thetext=~s/^\x{feff}//; # Removes the BOM
    open(OUT, ">:utf8", "$ARGV[1]");
    print OUT $thetext;
    close(OUT);

--
/BP 8^)>
--
Benct Philip Jonsson -- melroch at melroch dot se

   "Maybe" is a strange word. When mum or dad says it
   it means "yes", but when my big brothers say it it
   means "no"!
                           (Philip Jonsson jr, age 7)
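
The same repair can also be done on raw bytes, without decoding the file at all. In UTF-8 the BOM is the three bytes EF BB BF, so "knowing what it looks like in the encoding you are using" comes down to matching those bytes at the very start of the data. What follows is only a sketch under that assumption: the helper name strip_utf8_bom and the lexical filehandles are illustrative and not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): remove a leading
# UTF-8 BOM, i.e. the bytes EF BB BF, from a string of raw bytes.
# Anything else, including a BOM later in the data, is left alone.
sub strip_utf8_bom {
    my ($bytes) = @_;
    $bytes =~ s/\A\xEF\xBB\xBF//;
    return $bytes;
}

# Usage: perl stripbom.pl infile outfile
# (the script name is an assumption; with no arguments nothing happens)
my ($in, $out) = @ARGV;
if (defined $in and defined $out) {
    open(my $ifh, '<:raw', $in) or die "Cannot read $in: $!";
    my $data = do { local $/; <$ifh> };   # slurp the whole file as bytes
    close($ifh);

    open(my $ofh, '>:raw', $out) or die "Cannot write $out: $!";
    print {$ofh} strip_utf8_bom($data);
    close($ofh) or die "Cannot finish writing $out: $!";
}
```

Because the file is read and written with the :raw layer, the rest of the content passes through byte for byte, which may be preferable if the file is not guaranteed to be valid UTF-8.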