Re: Lexicon counting (was: Weekly Vocab #1.1.1...)
|From:||Iain E. Davis <feaelin@...>|
|Date:||Thursday, September 7, 2006, 1:38|
> I used Excel as well, but I'm using the OpenOffice equivalent
> for the Unicode compatibility. Then I save it as a CSV file,
Hmm. To what degree did Excel not do what you needed Unicode-wise?
I was able to come up with what I needed for IPA symbols, which was my main
concern. I was happy from there. :)
Not that I think you should change. If OpenOffice works successfully,
Pretty. I've not anything so well constructed, web-wise. :)
> But I just realized that I can't reverse the entries. I guess
> I should have an English dictionary, for instance, and link my
> Ausonian entries to the English entries? Ouch.
The method I used to achieve that effect doesn't produce an
English->Taraitola dictionary, or in your case an English->Ausonian one.
What I did is make one of my columns (in addition to the _definition_ column)
an "English Word" column. This is the closest equivalent English word, if any.
Then for the English->Taraitola "Cross-Index" an entry looks like:
And my expectation is that you look up äm and
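For anyone keeping their lexicon in a CSV export, here is a minimal sketch of how a cross-index like this could be generated. It is only an illustration: the column names (`word`, `english_word`, `definition`) and the sample rows are made up, the "words" are placeholders rather than real Taraitola vocabulary, and the output format is a guess, not the one my dictionary actually uses.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: column names and rows are invented for
# illustration and are not taken from any real lexicon spreadsheet.
SAMPLE_CSV = """word,english_word,definition
kelta,river,a large natural stream of water
kelune,river,a small seasonal stream
masir,,a feeling with no close English equivalent
"""

def build_cross_index(csv_text):
    """Map each English word to the sorted conlang words that point to it."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        english = row["english_word"].strip()
        if english:  # entries without a close English equivalent are skipped
            index[english].append(row["word"])
    # Sort both the English keys and the words listed under each key.
    return {english: sorted(words) for english, words in sorted(index.items())}

cross_index = build_cross_index(SAMPLE_CSV)
for english, words in cross_index.items():
    # Only a pointer to the conlang word is printed, no definition.
    print(f"{english}: see {', '.join(words)}")
```

Two conlang words sharing one English word simply end up on the same index line, which is the "apparent duplicates" situation; the definitions stay in the main dictionary.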
> >> haven't got the pronunciation and whatnot for the English words), so
> >> the main entry always an Ayeri word. The problem is that my database
> >> does not accept sub-entries, so every
> > I have the same issue, although I don't believe Taraitola has any
> > constructions like tapiao, so it is less of a concern.
> >> tapiao - to put; to set
> >> tapiao dayrin - to save ("to put aside")
> > ...[snipped]
> > Hmm. Since each of those has a distinct meaning, I'd argue that in
> > terms of counting, you should count them all. :)
> I would add it as a derivative meaning from the same verb.
> For instance:
> 1. to put; to set.
> 2. ~ <i>dayrin</i> - to put aside; to save.
> or something like that.
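A rough sketch of how flat records could be grouped under a headword after the fact, so that both the number of headwords and the number of records can be counted. The two entries are the ones quoted above; treating the first word of a multi-word entry as its headword is purely an assumption for this example and may not hold for every construction.

```python
from collections import defaultdict

# Flat records, as a database without sub-entries might store them.
# The entries are the two quoted in this thread.
RECORDS = [
    ("tapiao", "to put; to set"),
    ("tapiao dayrin", 'to save ("to put aside")'),
]

def group_by_headword(records):
    """Group records under the first word of each entry (an assumed rule)."""
    grouped = defaultdict(list)
    for entry, gloss in records:
        headword = entry.split()[0]
        grouped[headword].append((entry, gloss))
    return dict(grouped)

grouped = group_by_headword(RECORDS)
headword_count = len(grouped)                                # distinct headwords
record_count = sum(len(subs) for subs in grouped.values())   # every record

print(f"{headword_count} headword(s), {record_count} record(s)")
```

Which of the two numbers counts as "the" lexicon size is then a matter of taste rather than of how the database happens to store things.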
> >> Where my German-English dictionary would list all those entries just
> >> under "to put", my database makes a new record out of all of these
> >> (unfortunately).
> > It probably would. But your English->German dictionary wouldn't, so
> > you have to make some sacrifices somewhere. :) I have something of
> > the same problem on the _other_ end. There are words that have
> > distinctions that English doesn't make. So the 'English word'
> > column/field can potentially have apparent duplicates. It doesn't
> > matter too much to me, since the more important field is the
> > 'definition' field. English word is merely for creating an 'index' of
> > English-->Taraitola words (no meaning or adornment, just a
> > pointer to the Taraitola word).
> Could you not be more specific in the entries? Or is it a
> word-to-word association?
> >> As for names, I keep them in an extra list, so they are not counted.
> > Common
> > Which is a reasonable separation. Arguably, when I generate the
> > dictionary, mine _are_ separated...into Appendix C: Famous People and
> > Places. In my data, though, the only real difference is that they're
> > flagged as 'C' entries (for appendix C) instead of 'A' entries.
> > We did similar things, just different approaches.
> Both are good ideas. I need to categorize the entries in this
> way... by semantic domain. Does anyone know a good list of
> classes for it?
> I would put the names in the main dictionary just for the
> etymology. Who the bearers of the name were is a matter for
> another publication : ). Do gods count under this restriction?
> >> expressions usually have their own entries as well. There are not
> >> many expressions listed in the dictionary, though, just a
> > I have very few expressions and in fact, they are all out of date
> > since I've never revisited them since I completely revised the
> > phonology. They are stored completely separately, but they may
> > pre-date the spreadsheet...
> And are these expressions separate entries in the dictionary, or
> listed under the main word? For instance:
> 1. that which is said, fame.
> 2. ~ ambrotós - the immortal fame, used in poetry &c &c
> >> handful. Furthermore, since Ayeri is an agglutinative language, it has
> >> lots of suffixes -- these are also counted as words, even the ones
> >> that only have a syntactical meaning.
> > We differ here...as I mentioned to Henrik, I don't list any suffixed forms.
> > There are some exceptions where some affixes completely change the
> > meaning, but for the most part, it is only the 'original' form. :)
> >> If you removed those from the list, you'd still have something around
> >> 1300 words, maybe a little more or less than that.
> > Wow.
> Someday, hopefully, I will pass the thousand-word mark ; ).
> > Our discussion prompted me to add a 'statistics' worksheet, just to
> > see what I had. I won't bore you with the full details, but a brief look:
> > 916 Entries, 162 of which are "Names"/"Proper Nouns". I need to dig in
> > and work my way through the Swadesh list and all the weekly vocabs I
> > haven't done yet...:)
> > Feaelin
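The same kind of tally can be done outside the spreadsheet. A small sketch, assuming each row carries an appendix flag where 'A' marks a main-dictionary entry and 'C' marks an Appendix C name; the rows below are placeholders, not real data.

```python
from collections import Counter

# Hypothetical rows of (word, appendix_flag); the words are placeholders.
# 'A' = main-dictionary entry, 'C' = Appendix C (names).
ROWS = [
    ("word-one", "A"),
    ("word-two", "A"),
    ("name-one", "C"),
]

def entry_statistics(rows):
    """Count all entries, then split them into names and everything else."""
    by_flag = Counter(flag for _word, flag in rows)
    total = sum(by_flag.values())
    names = by_flag["C"]  # Counter returns 0 when no 'C' rows exist
    return {"total": total, "names": names, "non_names": total - names}

stats = entry_statistics(ROWS)
print(stats)
```

By the same arithmetic, 916 entries with 162 names leaves 754 entries that are not names.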
> What about giving an example phrase for each word/entry?
> Especially for verbs, it would be useful for showing the
> prepositions... the case of the object, and for exercising
> the fluency, too...
> Edgard Bikelis.