Automated translation

From: Herman Miller <hmiller@...>
Date: Saturday, March 13, 1999, 2:20
Does anyone here have any familiarity with writing software to do automated
translation of languages? I recently bought the so-called "Universal
Translator" program, and I've been thinking that it can't be too hard to do
better than its minimal-quality translation. It doesn't even make genders
of adjectives agree with nouns, or even attempt to do any kind of
morphology. (Esperanto translations come out as bare roots without an
ending!) It appears to be doing simple dictionary lookup, with some
pre-processing to determine the parts of speech of the words, and it
doesn't even have very good dictionaries. I don't expect Systran quality
from such a low-priced product, but I expected better than what I got.

My first impression is "I could do better than that", and I started
thinking about writing a translator program that combines dictionary lookup
with some minimal morphology and syntax. Of course, I'd use my own
languages to test it with, since I'm more familiar with the correct usages
than with other languages. I'm not looking for correct parsing of natural
languages, but something a little more sophisticated than just looking up
words in a dictionary; it should at least be able to recognize "saw" could
be the past tense of "see", not just a present-tense verb that means "cut
with a saw", but it doesn't need to figure out from context whether "I saw
the wood" means "I cut the piece of wood with a saw" or "I did see the
forest". It should at least be able to generate Esperanto sentences with
the correct endings on the words.
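
To make the "saw" example concrete, here is a rough Python sketch of what I
have in mind. It is only an illustration under assumptions: the tiny tables
ANALYSES and ROOTS and the function names are made up for this example, not
taken from any existing program. The lookup returns every reading of a word
instead of picking one, and the generator attaches the regular Esperanto
endings (-o for nouns, -as/-is/-os for present/past/future verbs).

```python
# Hypothetical toy lexicon: each English surface form maps to every
# (lemma, part of speech, tense) reading it could have.
ANALYSES = {
    "saw": [
        ("see", "verb", "past"),      # "I saw the forest"
        ("saw", "verb", "present"),   # "I saw the plank in half"
        ("saw", "noun", None),        # "hand me the saw"
    ],
    "see": [("see", "verb", "present")],
}

# Hypothetical bilingual table: (English lemma, part of speech) -> Esperanto root.
ROOTS = {
    ("see", "verb"): "vid",     # vidi, "to see"
    ("saw", "verb"): "seg",     # segi, "to saw"
    ("saw", "noun"): "segil",   # segilo, "a saw"
}

# Regular Esperanto verb endings by tense.
VERB_ENDINGS = {"present": "as", "past": "is", "future": "os"}


def analyze(word):
    """Return all known readings of a word; no attempt to disambiguate."""
    return ANALYSES.get(word.lower(), [])


def generate(lemma, pos, tense=None):
    """Build an inflected Esperanto word from one reading."""
    root = ROOTS[(lemma, pos)]
    if pos == "noun":
        return root + "o"
    return root + VERB_ENDINGS[tense]


def translate_word(word):
    """Translate a single word, keeping one output per possible reading."""
    return [generate(*reading) for reading in analyze(word)]


if __name__ == "__main__":
    print(translate_word("saw"))
```

Choosing among the candidates is deliberately left out; that is the
context-dependent part I said the program doesn't need to solve.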

My intuition is that producing a grammatically correct sentence from an
internal representation of the meaning is probably going to be easier than
parsing the input sentence to figure out the original meaning. Is this a
reasonable assumption? The next thing I need to figure out is how far I
want to go beyond a crude dictionary lookup program. Clearly, I don't have
the resources to do anything really sophisticated, but at least I should be
able to handle agreement of gender and number and a few basic things like
that. Maybe even conjugation of regular verbs. Has anyone done any
experimentation in this area, either with their conlangs or natlangs?
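
As a sketch of the generation side, the Python below renders a clause from a
small internal representation and makes adjectives agree with their nouns.
Everything here is an assumption made for illustration: the NounPhrase class,
the function names, and the fixed subject-verb-object order are mine. It
uses Esperanto, where adjectives agree in number (-j) and case (-n) rather
than gender; a language with gender would presumably carry one more feature
copied from noun to adjective in the same way.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    """Hypothetical internal representation of a noun phrase."""
    noun: str                 # Esperanto noun root, e.g. "arb" (tree)
    adjectives: tuple = ()    # adjective roots, e.g. ("grand",)
    plural: bool = False


# Regular Esperanto verb endings by tense.
TENSE_ENDINGS = {"present": "as", "past": "is", "future": "os"}


def inflect(root, vowel, plural, accusative):
    """Attach the part-of-speech vowel, then plural -j, then accusative -n."""
    return root + vowel + ("j" if plural else "") + ("n" if accusative else "")


def render_np(np, accusative=False):
    """Render a noun phrase; each adjective copies the noun's number and case."""
    words = [inflect(adj, "a", np.plural, accusative) for adj in np.adjectives]
    words.append(inflect(np.noun, "o", np.plural, accusative))
    return " ".join(words)


def render_clause(subject, verb_root, tense, obj):
    """Render a simple subject-verb-object clause with a regular verb."""
    verb = verb_root + TENSE_ENDINGS[tense]
    return " ".join([render_np(subject), verb, render_np(obj, accusative=True)])


if __name__ == "__main__":
    boy = NounPhrase("knab", ("jun",))
    trees = NounPhrase("arb", ("grand",), plural=True)
    print(render_clause(boy, "vid", "past", trees))
```

Because the endings are computed from features rather than looked up per
word, adding a new root to the dictionary costs nothing on the morphology
side, at least for a regular language like Esperanto.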

--
languages of Kolagia---> +---<http://www.io.com/~hmiller/languages.html>---
      Thryomanes        /"If all Printers were determin'd not to print any
   (Herman Miller)     / thing till they were sure it would offend no body,
   moc.oi @ rellimh <-/  there would be very little printed." -Ben Franklin