THEORY: Parsing for meaning.
From: Gary Shannon <fiziwig@...>
Date: Sunday, June 25, 2006, 22:48
Looking back over an old conlang project called
SOALOA (http://fiziwig.com/soaloa/soaloa.html), it
occurred to me that the biggest obstacle to proper
machine translation is extracting the real meaning of
the sentence to be translated. As far as I know, machine
translation programs don't try to deal with "meaning",
only with structure and dictionary replacements. If
the meaning of a source sentence could be properly
extracted and encoded, then writing a decent sentence
generator for any given language, based on basic
standardized sentence patterns, would be relatively
easy.
But how to encode the information conveyed by a
sentence? Taking a hint from SOALOA I tried to reduce
any sentence, regardless of complexity, to a sequence
of simple SVO sentences, each optionally beginning
with a "linking word", which taken together encode the
complete literal meaning (if not the literary nuances)
of a sentence. Combining that idea with another of my
old projects, an attempt to build an automated parser
(http://www.fiziwig.com/parser/parse1.html), I thought
it might be possible to iteratively deconstruct a
sentence into a paraphrase in the form of a sequence
of [L]SVO sentences (the [L] being the optional linking
word) by simple pattern matching.
At each step a portion of the sentence is matched,
the corresponding [L]SVO sentence is output, and the
matched portion is then removed from (or reduced in)
the original sentence, leaving a simpler sentence to
be further decomposed by the next iteration.
Thus: We are watching the antics of this funny little
monkey.
Is paraphrased:
We watch this: (SVO)
That monkey performs antics. (LSVO)
Same monkey is funny. (LSVO)
Same monkey is little. (LSVO)
These four sentences capture and encode in a standard
format the complete meaning of the sentence.
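One way to hold such a paraphrase in a program is as a list of small records, one per [L]SVO sentence. The following is only a sketch in Python: the `Clause` class, its field names, and the rule that a clause whose object is "this" ends in a colon are all assumptions made for illustration, not part of SOALOA or the parser project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """One [L]SVO sentence (hypothetical representation)."""
    subject: str
    verb: str
    obj: str
    link: Optional[str] = None  # e.g. "That", "Same", "Then"; None for plain SVO

    def render(self) -> str:
        # Join the parts that are present, linking word first.
        words = [w for w in (self.link, self.subject, self.verb, self.obj) if w]
        # Assumption: a clause pointing forward with "this" ends in a colon.
        end = ":" if self.obj == "this" else "."
        return " ".join(words) + end


# The four-sentence paraphrase of the monkey example.
PARAPHRASE = [
    Clause("We", "watch", "this"),
    Clause("monkey", "performs", "antics", link="That"),
    Clause("monkey", "is", "funny", link="Same"),
    Clause("monkey", "is", "little", link="Same"),
]

for clause in PARAPHRASE:
    print(clause.render())
```

A sentence generator for another language would then only need to walk a list like this one, rather than re-parse English.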
The pattern-matching steps would be (roughly):
We are watching the antics of this funny --little
monkey--. (Pattern adj+noun)
=> Same monkey is little.
We are watching the antics of this --funny monkey--.
(Pattern adj+noun)
=> Same monkey is funny.
We are watching the --antics of this monkey--.
(Idiomatic pattern)
=> That monkey performs antics.
--We are watching--
=> We watch this:
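The steps above can be sketched as a loop over a small rule table. This is a toy illustration in Python, not the actual parser: the word lists, the three regular expressions, and the `decompose` function are assumptions that cover only this one sentence. Reversing the emitted sentences at the end is also an assumption; it happens to reproduce the order of the paraphrase given earlier.

```python
import re

# Hypothetical mini-lexicon, just large enough for the example.
ADJECTIVES = ("funny", "little")
NOUNS = ("monkey",)

# Each rule: (name, pattern, handler). The handler returns the [L]SVO
# sentence to emit and the text that replaces the matched portion.
RULES = [
    ("adj+noun",
     re.compile(r"\b(%s) (%s)\b" % ("|".join(ADJECTIVES), "|".join(NOUNS))),
     lambda m: (f"Same {m.group(2)} is {m.group(1)}.", m.group(2))),
    ("idiom: antics of",
     re.compile(r"\bthe antics of this (\w+)"),
     lambda m: (f"That {m.group(1)} performs antics.", "")),
    ("progressive main clause",
     re.compile(r"^(\w+) are (\w+)ing$"),
     lambda m: (f"{m.group(1)} {m.group(2)} this:", "")),
]


def decompose(sentence):
    """Repeatedly match, emit, and simplify until nothing is left."""
    remaining = sentence.rstrip(".").strip()
    emitted = []
    while remaining:
        for _name, pattern, handler in RULES:
            m = pattern.search(remaining)
            if m:
                clause, replacement = handler(m)
                emitted.append(clause)
                remaining = remaining[:m.start()] + replacement + remaining[m.end():]
                remaining = " ".join(remaining.split())  # tidy whitespace
                break
        else:
            break  # no rule matched; give up on what is left
    # Innermost details are emitted first, so reverse for reading order.
    return list(reversed(emitted)), remaining


clauses, leftover = decompose(
    "We are watching the antics of this funny little monkey.")
for clause in clauses:
    print(clause)
```

Because the adjective rule requires the adjective to sit directly before the noun, "little" is peeled off before "funny", matching the order of the steps shown above. Anything the rules cannot consume is returned as `leftover` instead of being silently dropped.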
Another example:
Mercury bound his winged sandals to his feet, and took
his wand in his hand.
Mercury caused this: (SVO)
That sandals are_bound_to feet. (LSVO)
Same feet belong_to Mercury. (LSVO)
Same sandals belong_to Mercury. (LSVO)
Same sandals have wings. (LSVO)
Then Mercury caused this: (LSVO)
That wand be_in hand. (LSVO)
Same hand belongs_to Mercury. (LSVO)
Same wand belongs_to Mercury. (LSVO)
The pattern-matching steps are (roughly):
Mercury bound his --winged sandals-- to his feet, and
took his wand in his hand.
=> Same sandals have wings.
Mercury bound --his sandals-- to his feet, and took
his wand in his hand.
=> Same sandals belong_to Mercury.
--Mercury bound sandals-- to his feet, and took his
wand in his hand.
=> Mercury caused this:
sandals bound to --his feet--, and took his wand in
his hand.
=> Same feet belong_to Mercury.
--sandals bound to feet--, and took his wand in his
hand.
=> That sandals are_bound_to feet.
and took --his wand-- in --his hand--.
=> Same wand belongs_to Mercury.
=> Same hand belongs_to Mercury.
--and took wand in hand--.
=> Then Mercury caused this:
--wand in hand--.
=> That wand be_in hand.
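The possessive steps need one thing the adjective pattern does not: the owner that "his" refers to. Here is a rough Python sketch of that rule, assuming the owner has already been identified (here simply passed in by hand) and that the adjectives have already been stripped. The `resolve_possessives` function and the `PLURALS` set are hypothetical, and the sketch handles every "his" in one pass rather than one per iteration.

```python
import re

# Hypothetical list of plural nouns, used only to choose between
# belong_to and belongs_to as in the paraphrase above.
PLURALS = {"sandals", "feet"}


def resolve_possessives(sentence, owner):
    """Replace each 'his NOUN' with NOUN and emit an ownership sentence."""
    clauses = []

    def strip_his(match):
        noun = match.group(1)
        verb = "belong_to" if noun in PLURALS else "belongs_to"
        clauses.append(f"Same {noun} {verb} {owner}.")
        return noun  # the simplified text keeps only the noun

    reduced = re.sub(r"\bhis (\w+)", strip_his, sentence)
    return reduced, clauses


reduced, clauses = resolve_possessives(
    "Mercury bound his sandals to his feet, and took his wand in his hand.",
    "Mercury")
print(reduced)
for clause in clauses:
    print(clause)
```

Working out who "his" refers to in general is a much harder problem than this sketch suggests; here it works only because the sentence has a single candidate owner.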
Thoughts?
--gary