Re: Lexicon counting (was: Weekly Vocab #1.1.1...)
From: Carsten Becker <carbeck@...>
Date: Tuesday, September 5, 2006, 11:23
On Mon, 4 Sep 2006 19:30:30 -0500, Iain E. Davis <feaelin@...>
wrote:
>I hope everyone forgives me for replying to both of you at once. ;)
No problem, that's quite common here.
>Carsten Becker wrote:
>> I'm counting my entries like this: I have a database that is
>> Ayeri -> English at first hand (it's reversible, but then you
>
>What software are you using for your database? I use Excel as a flat file
>"database" and then use a macro to 'generate' a word document in dictionary
>style, if I desire. Which is rare, I prefer to use the spreadsheet for the
>advanced filtering, searching, sorting, etc.
At the moment, I'm using a text-file-based solution in which each record has
its own .txt file. A PHP script then reads in all of those and builds a
listing from the individual files. Of course, this system can't be reversed
or searched. That's why I am currently (still) migrating my dictionary into
an SQL database that I can query with MySQL and make human-readable with
PHP.
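To illustrate what "reversible" means once the records sit in an SQL table, here is a minimal sketch. It is not my actual setup: it uses Python with the built-in sqlite3 module as a stand-in for MySQL and PHP, and the table name `entries`, its columns, and the function names are all assumptions made up for the example. The records are the affix examples from this thread.

```python
import sqlite3

# Records from the examples in this thread: (Ayeri form, category, English gloss).
RECORDS = [
    ("-ang", "suffix", "AGENT case marker"),
    ("-aris", "suffix", "PATIENT case marker"),
    ("-iya", "pronoun 3sg", "he"),
]


def build_db(records):
    """Create an in-memory table; SQLite stands in for MySQL here (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (ayeri TEXT, category TEXT, english TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def ayeri_to_english(conn, form):
    """Forward lookup: exact match on the Ayeri form."""
    rows = conn.execute(
        "SELECT english FROM entries WHERE ayeri = ? ORDER BY english",
        (form,),
    )
    return [row[0] for row in rows]


def english_to_ayeri(conn, text):
    """Reverse lookup: substring match on the English gloss."""
    rows = conn.execute(
        "SELECT ayeri FROM entries WHERE english LIKE ? ORDER BY ayeri",
        ("%" + text + "%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_db(RECORDS)
    print(ayeri_to_english(db, "-iya"))
    print(english_to_ayeri(db, "case marker"))
```

The same table answers both directions, which is exactly what the one-file-per-record approach can't do. Note that the reverse lookup is a plain substring match, so short glosses would also match inside longer words; a real dictionary would probably want something stricter.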
>> Where my German-English dictionary would list all those
>> entries just under "to put", my database makes a new record
>> out of all of these (unfortunately).
>
>It probably would. But your English->German dictionary wouldn't, so you
>have to make some sacrifices somewhere. :).
No, I think you misunderstood. I was already referring to the English->German
part; the dictionary is a double volume containing both English-to-German and
German-to-English. All the constructions using "put" would be listed under
"put", but I don't know whether all those combinations are counted as
separate entries.
>> handful. Furthermore, since Ayeri is an agglutinative
>> language, it has lots of suffixes -- these are also counted
>> as words, even the ones that only have a syntactical meaning.
>
>We differ here...as I mentioned to Henrik, I don't list any suffixed forms.
>There are some exceptions where some affixes completely change the meaning,
>but for the most part, it is only the 'original' form. :)
Just to clarify: I list the affixes themselves, but not every possible
root+affix combination. So you've got entries like
-ang -- suffix, AGENT case marker
-aris -- suffix, PATIENT case marker
-ing -- adverb, so ...
-iya -- pronoun 3sg, he
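For the counting question, the difference between "all records" and "records without affixes" can be sketched as below. This is only an illustration in Python, assuming affix entries can be recognised by a leading or trailing hyphen; `headword1` and `headword2` are hypothetical placeholders, not real Ayeri words, and the function names are invented for the example.

```python
# The four affix entries listed above, plus two hypothetical placeholder
# headwords (NOT real Ayeri words) so the two counts differ.
ENTRIES = ["-ang", "-aris", "-ing", "-iya", "headword1", "headword2"]


def is_affix(form):
    """Assumption: an entry is an affix if it starts or ends with a hyphen."""
    return form.startswith("-") or form.endswith("-")


def count_entries(forms, include_affixes=True):
    """Count records, optionally leaving out the affix entries."""
    if include_affixes:
        return len(forms)
    return sum(1 for form in forms if not is_affix(form))


if __name__ == "__main__":
    print("all records:", count_entries(ENTRIES))
    print("without affixes:", count_entries(ENTRIES, include_affixes=False))
```

Root+affix combinations never enter either count, because they are not stored as records in the first place.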
>> If you removed those from the list, you'd still have
>> something around 1300 words, maybe a little more or less than that.
>
>Wow.
*feels honoured* ;) Well, if I included all possible affix combinations,
you could multiply that number by ten and triple the result or something,
so basically the same as Henrik said.
Carsten