Re: THEORY: Parsing for meaning.
From: Paul Bennett <paul-bennett@...>
Date: Monday, June 26, 2006, 13:38
Standard Reply-To Disclaimer Here. Replies should be on-list, please.
-----Original Message-----
>From: Yahya Abdal-Aziz <yahya@...>
>> --- Paul Bennett wrote, in reply to Eugene:
>>
>> > I actually started thinking about this principle
>> > around a year or so ago. I gave up when I
>> > couldn't figure out what the minimal atomic
>> > units of linguistic knowledge should be[*], but
>> > I did also envisage extending the system to
>> > allow it to attempt to determine cognates, and
>> > plausibly build a tree of relatedness given
>> > semantically identical [**] corpora in a set of
>> > languages. They seem to be based on a
>> > generalization of the same problem.
>> >
>> > [*]Heck, if I knew *that*, I could put Chomsky
>> > out of business... ;-)
>
>;-)
>
>[**] I have a problem with that. How do you
>determine that two segments in different
>languages are "semantically identical"?
>
By guessing. Assign random relationships, at low strength, each time you
encounter something new. Maybe add several sets of competing random
relationships. Subsequent encounters will either strengthen or weaken the
relationships, as will dialogue with a human operator. Remember that computers
can read gigabytes of corpus entries in the time a human could read kilobytes.
Throw it the complete works of a few hundred modern writers, with naturalistic
translations into a second language, and let it make of them what it will. If it
takes a year of continuously being fed and processing new examples, that is
still comparable to the time a human learner would need.
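
As a rough illustration of the guess-and-reinforce idea above, here is a minimal
Python sketch. It is only a toy under stated assumptions, not a tested system:
the corpus is assumed to be already aligned sentence by sentence, words are
split on whitespace, and the names learn_associations, best_guess, initial,
gain and decay are all made up for this example. Dialogue with a human operator
is not modelled.

```python
import random
from collections import defaultdict

def learn_associations(pairs, seed=0, initial=0.1, gain=0.2, decay=0.1):
    """Guess word-to-word links from (source, target) sentence pairs.

    All parameter names and values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # strength[source_word][target_word] -> value between 0 and 1
    strength = defaultdict(dict)
    for src_sentence, tgt_sentence in pairs:
        src_words = sorted(set(src_sentence.lower().split()))
        tgt_words = sorted(set(tgt_sentence.lower().split()))
        tgt_set = set(tgt_words)
        for s in src_words:
            links = strength[s]
            # Subsequent encounters strengthen or weaken existing guesses.
            for t in links:
                if t in tgt_set:
                    links[t] += gain * (1.0 - links[t])
                else:
                    links[t] -= decay * links[t]
            # Anything new gets a random relationship at low strength.
            for t in tgt_words:
                if t not in links:
                    links[t] = rng.uniform(0.0, initial)
    return strength

def best_guess(strength, word):
    """Return the currently strongest candidate for `word`, or None."""
    links = strength.get(word.lower(), {})
    return max(links, key=links.get) if links else None

# A tiny hand-made English/Spanish corpus, purely for demonstration.
toy_corpus = [
    ("the dog sleeps", "el perro duerme"),
    ("the cat sleeps", "el gato duerme"),
    ("the dog eats", "el perro come"),
    ("the cat eats", "el gato come"),
    ("a dog barks", "un perro ladra"),
    ("a cat runs", "un gato corre"),
]

model = learn_associations(toy_corpus)
print(best_guess(model, "dog"))
print(best_guess(model, "cat"))
```

On a corpus this small only the words that recur in varied contexts settle on a
stable guess; words seen once or twice stay close to their random starting
strengths, which is why the real thing would need gigabytes rather than six
sentences.
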

I started teaching myself Korean using a computer instruction manual with
English and Korean text. I didn't get very far, but I don't have the processing
power or persistence of a large computer. The basic principle seems to hold,
though, if it can be implemented.

The notion comes (to me at least) mostly from a Douglas Hofstadter book. I forget
the exact title, but it has "Fluid Analogies" somewhere in it. In the book, he
describes a similar algorithm for guessing the next number in a sequence, using
a system with a very small, limited set of mathematical rules.

Paul