Re: Too long words
From: Philippe Caquant <herodote92@...>
Date: Thursday, February 19, 2004, 13:09
Use a compression algorithm to make them shorter.
For instance, suppose that, in a corpus written in
your language, you often find the sequence *layaconi*
(whatever it might mean). Then lexicalize it under the
form *n'gwa*, assuming you don't have such a form
yet.
So, instead of *layaconiyäinang*, you will have
*n'gwayäinang*, which already sounds much cuter.
The only criterion for whether to compress a sequence
would be optimizing the text length. The
longer and the more frequent the sequence, the
sooner you should lexicalize it. This can all be done
automatically.
(Don't forget to keep track of the correspondences
in a lexicon.)
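
To show what "done automatically" could look like, here is a minimal
sketch in Python. It is only one possible reading of the idea: the
function names (best_sequence, lexicalize), the min_len cutoff, the
scoring rule (occurrences times characters saved) and the sample corpus
are all my own assumptions, not an established method.

```python
from collections import Counter

def best_sequence(words, short_len, min_len=4):
    """Find the substring whose replacement would save the most characters.

    Saving is estimated as occurrences * (len(sequence) - short_len),
    so long and frequent sequences score highest.
    """
    counts = Counter()
    for word in words:
        for i in range(len(word)):
            for j in range(i + min_len, len(word) + 1):
                counts[word[i:j]] += 1
    if not counts:
        return None, 0
    seq, n = max(counts.items(), key=lambda kv: kv[1] * (len(kv[0]) - short_len))
    return seq, n * (len(seq) - short_len)

def lexicalize(words, short_form, min_len=4):
    """Replace the best sequence with short_form and record it in a lexicon."""
    # The new form must not exist yet, or the text becomes ambiguous.
    if any(short_form in w for w in words):
        raise ValueError("short form already occurs in the corpus")
    seq, saved = best_sequence(words, len(short_form), min_len)
    if seq is None or saved <= 0:
        return list(words), {}          # nothing worth compressing
    lexicon = {short_form: seq}         # keep track of the correspondence
    return [w.replace(seq, short_form) for w in words], lexicon

# Hypothetical corpus; only the first word comes from the example above.
corpus = ["layaconiyäinang", "layaconi", "layaconimar", "tulayaconi", "sena"]
new_corpus, lexicon = lexicalize(corpus, "n'gwa")
print(new_corpus)
print(lexicon)
```

Running lexicalize again with a new short form would compress the next
best sequence, so the loop can continue until nothing saves any length.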
--- Carsten Becker <post@...> wrote:
>
> My main problem is that due to the length and the
> too much information of
> the word - as said above - constructing and parsing
> verbs is very difficult,
> at least in my opinion. But it's not only the length
> of the words, it's
> rather the ambiguity because it's not clear where a
> morpheme ends and where
> a new one begins.
=====
Philippe Caquant
"Le langage est source de malentendus."
(Antoine de Saint-Exupery)