Re: About making a translator
From: Ray Brown <ray.brown@...>
Date: Wednesday, October 27, 2004, 19:15
On Wednesday, October 27, 2004, at 12:34 , H. S. Teoh wrote:
> On Wed, Oct 27, 2004 at 02:30:28AM +0400, Alexander Savenkov wrote:
>> 2004-10-26T16:06:04+03:00 Ray Brown <ray.brown@...> wrote:
>>> But, as Richard has written & I have discovered from experience, it
>>> is a highly non-trivial task.
>> According to what I've read, this is an impossible task for now.
>> Machine translation will be possible with the invention of AI.
> Impossible to be 100% correct, yes. But may be possible to do an
Yes, especially if the translation is of a text in some well-defined
knowledge domain such as nuclear physics, pop music, the Harry Potter
stories, or whatever. But a general-purpose translator is not possible at
> The essence of the problem is that natural language is inherently
> ambiguous, and requires (usually implicit) context to interpret
In fact it requires 'real world' knowledge - in other words a truly
immense knowledge base.
> Take for example the following quote, which I got from
> somebody on this list:
> Time flies like an arrow.
> Fruit flies like a banana.
It's a couplet I often quote. I forget the details, but "Time flies like
an arrow" was deliberately devised to test an early natural language
parser at, I believe, MIT (almost certainly written in LISP). From what I
remember, some rather liked the machine's suggestion of a species of fly
known as "time flies" all going crazy over an arrow, hence the second
> The second sentence is particularly pathological, in that it has two
> possible parses, both of which have sensible semantics:
> 1) Fruit-NP flies-V like-ADV (a banana)-NP
> 2) (Fruit flies)-NP like-V (a banana)-NP
The first sentence is actually even more ambiguous if one relies just on
a syntactic analysis, depending on whether we take "time", "flies" or "like"
as the main verb, thus:
Taking "time" as the imperative of the verb "to time" and "flies" as a
plural noun, being the direct object of the verb "time", we have:
1. Time flies in a similar way to that in which you would time an arrow.
2. Time flies in a similar way to that in which an arrow would time flies.
OK - the second meaning is particularly stupid and no human would interpret it
that way. But it requires _semantic_ knowledge to realize that. On syntax
alone the meaning is possible. Both those parsings assume "like" is a
conjunction meaning "in a similar way that" (i.e. it means "as"). "like"
could be an adjective, defining those darn "flies", thus:
3. Time only those flies which have a similar shape to an arrow.
If we take "time" as a noun, being the subject of the verb "flies" (3rd
pers. singular of the present tense of "to fly") we get:
4. Time flies in an analogous way to the way an arrow flies.
So far we have been taking "like" either as a conjunction meaning "in a
similar way that" or "in an analogous way to" or as an adjective meaning
"similar (to)" ; but we could take "like" as a verb, being the third
person plural with subject "flies". "Time" is an epithet noun, that is we
have our species of fly known as "time flies", thus:
5. Time-flies are just crazy about eating an arrow.
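
For anyone who wants to see a machine stumble over the same readings, here
is a minimal sketch of a purely syntactic parser. It is not the original
MIT program, of which I remember no details. The lexicon, the rule set and
the names LEXICON, GRAMMAR and parses are all invented here for
illustration, and the grammar is a deliberately tiny toy.

```python
from functools import lru_cache

# Toy lexicon (an assumption for this sketch): each word maps to the
# parts of speech it may take.
LEXICON = {
    "time": {"N", "V"},
    "fruit": {"N"},
    "flies": {"N", "V"},
    "like": {"P", "V"},
    "a": {"Det"},
    "an": {"Det"},
    "arrow": {"N"},
    "banana": {"N"},
}

# Toy phrase-structure rules (also an assumption). "S -> VP" allows an
# imperative; "NP -> N N" allows a compound such as "time flies".
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("N",), ("Det", "N"), ("N", "N"), ("NP", "PP")],
    "VP": [("V",), ("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}


def parses(sentence):
    """Return every bracketed parse tree the toy grammar allows."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # All trees for `symbol` covering words[i:j].
        found = []
        if symbol not in GRAMMAR:
            # Part-of-speech tag: must cover exactly one word.
            if j == i + 1 and symbol in LEXICON.get(words[i], ()):
                found.append(f"{words[i]}/{symbol}")
            return tuple(found)
        for rhs in GRAMMAR[symbol]:
            if len(rhs) == 1:
                found += [f"[{symbol} {t}]" for t in trees(rhs[0], i, j)]
            else:
                left, right = rhs
                # Try every split point between the two children.
                for k in range(i + 1, j):
                    for lt in trees(left, i, k):
                        for rt in trees(right, k, j):
                            found.append(f"[{symbol} {lt} {rt}]")
        return tuple(found)

    return list(trees("S", 0, len(words)))


if __name__ == "__main__":
    for s in ("Time flies like an arrow", "Fruit flies like a banana"):
        print(s)
        for tree in parses(s):
            print("   ", tree)
```

Under these toy rules the first sentence gets four trees and the second
gets two. Readings 1 and 2 above share a single tree, since they differ
only in how the elided clause after "like" is filled in, which this
grammar does not model. Nothing in the program can say which tree is the
sensible one; that is exactly the semantic knowledge it lacks.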
> The problem with this kind of ambiguity is that it is an inherent
> ambiguity in English grammar. (And it's not just English alone; I
> believe most, if not all, natlangs are inherently ambiguous.)
This type of ambiguity is typical of languages with little morphology and
great reliance on syntax, like English or modern Chinese. But ambiguity is
inherent in all natural languages, if only because on the semantic level
words often have wide ranges of meaning.
[snip - but all very true]
> applied to make guesses that are right 90% of the time. Approximate
> being the keyword here, however, because currently existing
> translators fall woefully short of the quality needed for general use.
> As someone once said, "Heuristics are buggy by definition, because if
> they weren't buggy, they'd be algorithms."
> Perhaps AI might help in
> improving this, but with the current state of AI, I'm not holding my
Nor I - my lungs ain't big enough :)
"Anything is possible in the fabulous Celtic twilight,
which is not so much a twilight of the gods
as of the reason." [JRRT, "English and Welsh" ]