Re: Unsupervised learning of natural languages
From: tomhchappell <tomhchappell@...>
Date: Thursday, November 3, 2005, 19:51
--- In conlang@yahoogroups.com, Gary Shannon <fiziwig@Y...> wrote:
>
> --- Henrik Theiling <theiling@A...> wrote:
>
> <snip>
>
> >
> > But the algorithm is limited to finding context free
> > rules, so some
> > things like vowel harmony or Verner's law etc.
> > cannot be found. On
> > the syntax level, the same holds for word ordering
> > phenomena occurring
> > in German or Dutch. In the computational
> > linguistics fields, context
> > free grammars are insufficient for virtually
> > everything. So although
> > the algorithms are fun to play with, they are not
> > really innovative, I
> > think, for linguistics.
> >
> > **Henrik
> >
>
> My reading of the paper (see, in particular, Mode A
> and Mode B in the supporting text where the algorithm
> is detailed) indicates that the algorithm handles the
> extraction of both context free and context sensitive
> rules.
>
> Quote: "In particular, when ADIOS is iterated, symbols
> that may have been initially very far apart are
> allowed to exert influence on each other, enabling it
> to capture long-range syntactic dependencies such as
> agreement (Fig. 2F)."
>
> Indicating that even long-range context is taken into
> account.
>
> --gary
>
Right. In Mode B, a string is treated as a new node only in
contexts where MEX decides it is "significant"; likewise, two nodes are
treated as equivalent only in contexts where MEX decides they are
both "significant". So, in Mode B, the algorithm can be
context-sensitive.
Also, the Context Window can be set wider. In their test runs, it
was always 5 or less. I guess that would be as if every rewrite
rule had, at most, the form
ABCDE --> FGHIJ
where each of those letters can be any grammar symbol, terminal or
non-terminal (or absent), and the symbols are not necessarily
distinct from each other.
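To make the shape of such a rule concrete, here is a minimal sketch in Python of plain string rewriting with a bounded rule width. It is not the ADIOS algorithm or anything taken from the paper; the names MAX_WIDTH, make_rule and apply_once, and the toy symbols, are made up for illustration, and the bound of 5 is just the window size mentioned above.

```python
from typing import List, Sequence, Tuple

# A rule is (left-hand side, right-hand side), each a tuple of grammar symbols.
Rule = Tuple[Tuple[str, ...], Tuple[str, ...]]

MAX_WIDTH = 5  # assumed bound, mirroring the window of 5 mentioned above


def make_rule(lhs: Sequence[str], rhs: Sequence[str]) -> Rule:
    """Build a rule, rejecting any side wider than MAX_WIDTH symbols."""
    if not 1 <= len(lhs) <= MAX_WIDTH or len(rhs) > MAX_WIDTH:
        raise ValueError("rule does not fit in the window")
    return tuple(lhs), tuple(rhs)


def apply_once(symbols: Sequence[str], rule: Rule) -> List[str]:
    """Rewrite the leftmost match of the rule's left-hand side, if any."""
    lhs, rhs = rule
    n = len(lhs)
    for i in range(len(symbols) - n + 1):
        if tuple(symbols[i:i + n]) == lhs:
            return list(symbols[:i]) + list(rhs) + list(symbols[i + n:])
    return list(symbols)  # no match: return the input unchanged


# Context-sensitive toy rule: N becomes N_sg only between "the" and "is".
rule = make_rule(["the", "N", "is"], ["the", "N_sg", "is"])
print(apply_once(["the", "N", "is", "here"], rule))
print(apply_once(["a", "N", "are"], rule))
```

The rule rewrites N only when its neighbours match, which is what makes it context-sensitive rather than context-free; a context-free rule would have a single non-terminal on the left.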
But the Context Window Width is a parameter that can be reset at
run-time. If it were set to 7 or more, I think it might start
finding fairly "long-distance" dependencies.
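A rough way to picture the effect of the width, assuming purely for illustration that a dependency can be picked up in a single pass only when both of its ends fall inside the same window (the paper's actual use of the window may well differ). The function name fits_in_window and the example sentence are my own, not from the paper.

```python
def fits_in_window(i: int, j: int, width: int) -> bool:
    """True if token positions i and j both fit in one window of `width` tokens."""
    return abs(j - i) + 1 <= width


# Toy sentence: subject "dogs" must agree with the verb "bark".
tokens = "the dogs that the cat chased bark".split()
subj = tokens.index("dogs")
verb = tokens.index("bark")

for width in (5, 7):
    print(width, fits_in_window(subj, verb, width))
```

In this toy sentence the subject and its verb span six tokens, so they do not fit in a window of 5 but do fit in a window of 7.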
Tom H.C. in MI