File converter (was Re: IE wordlist?)
From: Paul Bennett <paul.bennett@...>
Date: Thursday, November 11, 1999, 11:51
Amanda Babcock>>>>>>
On Wed, 3 Nov 1999, Paul Bennett wrote:
Does anyone have this list in real ASCII? It says it's in ascii, but it's
^M's only, and so has no line feeds in unix.
<<<<<<
Slightly off-topic:
It's trivial to write a program that takes any file that's "nearly PC text" and
turns it into something more readable. I made one specifically to deal with this
file (in all of 5 minutes), but it'll work on many word-processor formats and
semi-binarified digests, as long as you don't mind hunting manually to separate
the signal from all the noise in such files. I can't promise it'll work
with accented text (in fact I can almost guarantee it won't), due to the variety
of codepages (and so forth) out there. I can post it separately to any
interested parties. It's a PC EXE that should work on everything from Granny's
8088 to the latest Quad-Athlon megabeast, as long as it's running an MS-DOS-like
operating system (MS-DOS, PC-DOS, DR-DOS, Win9x, WinNT).
Of course, if we were all using a "real" OS (Unix) it'd be a one-liner...
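For anyone who wants the general idea without the EXE, here is a rough sketch in
Python. It is not the QuickBasic program mentioned above, whose source isn't
shown; the function name to_readable_text and the choice to replace unprintable
bytes with spaces are assumptions made for illustration. It turns CR-only (^M)
and CR LF line endings into plain line feeds and blanks out anything that isn't
7-bit printable ASCII, so accented text will be lost just as warned above.

```python
import re


def to_readable_text(data: bytes) -> str:
    """Convert "nearly text" bytes to plain ASCII with LF line endings.

    Hypothetical helper: a sketch of the idea, not the original program.
    """
    # Treat CR LF, a lone CR (shown as ^M on Unix), and a lone LF alike.
    data = re.sub(rb"\r\n|\r|\n", b"\n", data)
    # Keep tab, newline, and printable 7-bit ASCII; replace every other
    # byte with a space (an assumption: the noise could also be dropped).
    data = re.sub(rb"[^\t\n\x20-\x7e]", b" ", data)
    return data.decode("ascii")
```

To use it, read the input file in binary mode, pass the bytes through
to_readable_text, and write the result out. The signal still has to be picked
out of the remaining noise by hand.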
This is *not* an Open Source project, more from shame over shoddy-looking code
(and the fact that it's in QuickBasic) than any sense of having done anything
unique <GGG>
Requests for changes will be considered, without any promise of including them.
While I'm on the subject, would anyone appreciate a program to "Romanise" a
Unicode UTF-8 file into 7-Bit PC-Ascii? I've been thinking about writing one,
but I need a stronger reason to start a project of this size than just my own
idle curiosity.
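To give a feel for the easy part of such a "Romaniser", here is a minimal
sketch in Python, assuming that stripping accents from Latin letters is all
that's wanted. The names romanise and romanise_utf8 are made up for this
example. It relies on Unicode NFKD decomposition, and it simply drops any
character that has no ASCII decomposition (Greek, Cyrillic, even "ß"), which is
exactly where the real work of a proper transliteration table would begin.

```python
import unicodedata


def romanise(text: str) -> str:
    """Strip accents via NFKD; drop whatever still isn't 7-bit ASCII.

    Hypothetical helper: a sketch only, not a full transliterator.
    """
    # NFKD splits an accented letter into its base letter plus
    # separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" discards the combining marks, and
    # also any character that has no ASCII decomposition at all.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def romanise_utf8(data: bytes) -> bytes:
    """Same as romanise(), but from UTF-8 bytes to ASCII bytes."""
    return romanise(data.decode("utf-8")).encode("ascii")
```

For example, romanise("café naïve") gives "cafe naive", while romanise("straße")
gives "strae" because "ß" has no decomposition to fall back on.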