Re: Lexicon counting (was: Weekly Vocab #1.1.1...)

From:Iain E. Davis <feaelin@...>
Date:Thursday, September 7, 2006, 1:38
Edgard wrote:
> I used Excel as well, but I'm using the Openoffice equivalent
> for the Unicode compatibility. Then I save it as a CSV file,
Hmm. To what degree did Excel not do what you needed Unicode-wise? I was able to come up with what I needed for IPA symbols, which was my main concern. I was happy from there. :) Not that I think you should change. If OpenOffice works for you, awesome!
> and open it with PHP. That part is quite easy. Then I show it
> through PHP, and I'm now fighting against inflection ; ).
> Here is the result so far:
> http://ausonia.parnassum.org/dicionario.tudo.php
Pretty. I've not anything so well constructed, web-wise. :)
> But I just realized that I can't reverse the entries. I guess
> I should have a English dictionary, for instance, and link my
> Ausonian entries to the English entries? Ouch.
The method I used to achieve that effect doesn't produce an English->Taraitola dictionary, or in your case an English->Ausonian one. What I did is this: one of my columns (in addition to the _definition_ column) is an "English Word" column. This is the closest equivalent English word, if any. Then, for the English->Taraitola "Cross-Index", the entries look like:

I: äm
You: säm

And my expectation is that you look up äm and
> >> haven't got the pronunciation and whatnot for the English words), so
> >> the main entry always an Ayeri word. The problem is that my database
> >> does not accept sub-entries, so every
> >
> > I have the same issue, although I don't believe Taraitola has any
> > constructions like tapiao, so it is less of a concern.
> >
> >> tapiao - to put; to set
> >> tapiao dayrin - to save ("to put aside")
> >
> > ...[snipped]
> >
> > Hmm. Since each of those have a distinct meaning, I'd argue that in
> > terms of counting, you should count them all. :)
>
> I would add it as a derivative meaning from the same verb.
> For instance:
>
> tapiao
> 1. to put; to set.
> 2. ~ <i>dayrin</i> - to put aside; to save.
>
> or something like that.
>
> >> Where my German-English dictionary would list all those entries just
> >> under "to put", my database makes a new record out of all of these
> >> (unfortunately).
> >
> > It probably would. But your English->German dictionary wouldn't, so
> > you have to make some sacrifices somewhere. :). I have something of
> > the same problem on the _other_ end. There are words that have
> > distinctions that English doesn't make. So the 'english word'
> > column/field can potentially have apparent duplicates. It doesn't
> > matter too much to me, since the more important field is the
> > 'definition' field. English word is merely for creating a 'index' of
> > english-->Taraitola words (no meaning or adornment, just a pointer
> > to the Taraitola word).
>
> Could you not be more specific in the entries? Or it is a
> word-to-word association?
>
> >> As for names, I keep them in an extra list, so they are not counted.
> >
> > Common
> > Which is a reasonable separation. Arguably, when I generate the
> > dictionary, mine _are_ separated...into Appendix C: Famous People and
> > Places. In my data, though, the only real difference is that they're
> > flagged as 'C' entries (for appendix C) instead of 'A' entries.
> >
> > We did similar things, just different approaches.
>
> Both are good ideas. I need to categorize the entries in this
> way... by semantical domain. Do anyone know a good list of
> classes for it?
>
> I would put the names on the main dictionary just for the
> etymology. Who were the bearers of the name is a matter for
> another publication : ). Do gods count on this restriction?
>
> >> expressions usually have their own entries as well. There are not
> >> many expressions listed in the dictionary, though, just a
> >
> > I have very few expressions and in fact, they are all out of date
> > since I've never revisited them since I completely revised the
> > phonology. They are stored completely separately, but they may
> > pre-date the spreadsheet...
>
> And these expressions are entries on the dictionary, or 'sub-meanings'
> on the main word? for instance:
>
> bhâma:
> 1. that which is said, fame.
> 2. ~ ambrotós - the immortal fame, used in poetry &c &c
>
> >> handful. Futhermore, since Ayeri is an agglutinative language, it has
> >> lots of suffixes -- these are also counted as words, even the ones
> >> that only have a syntactical meaning.
> >
> > We differ here...as I mentioned to Henrik, I don't list any suffixed
> > forms. There are some exceptions where some affixes completely change
> > the meaning, but for the most part, it is only the 'original' form. :)
> >
> >> If you removed those from the list, you'd still have something around
> >> 1300 words, maybe a little more or less than that.
> >
> > Wow.
>
> Someday, hopefully, I will pass the thousand frontier ; ).
>
> > Our discussion prompted me to add a 'statistics' worksheet, just to
> > see what I had. I won't bore you with the full details, but a brief
> > look:
> >
> > 916 Entries, 162 of which are "Names"/"Proper Nouns". I need to dig in
> > and work my way through the swadesh list and all the weekly vocabs I
> > haven't done yet...:)
> >
> > Feaelin
>
> What about giving for each word/entry an example phrase?
> Specially for verbs, it would be useful for showing the
> prepositions... the case of the object, and for exercising
> the fluency, too...
>
> Edgard Bikelis.
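
For anyone wanting to try the "English Word" column idea on a CSV export, here is a rough Python sketch of how the cross-index and the A/C entry counts discussed in this thread might be generated. It is only an illustration under assumptions: the column names (word, english_word, definition, appendix), the sample rows, and the function names are all hypothetical and are not the layout of the actual spreadsheet. The only data taken from the thread are the pairs I: äm and You: säm.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample export. Column names are assumptions for illustration;
# "Examplia" is a made-up placeholder name, not a real lexicon entry.
SAMPLE_CSV = """word,english_word,definition,appendix
äm,I,first person singular pronoun,A
säm,You,second person singular pronoun,A
Examplia,,a made-up place name for illustration,C
"""


def load_entries(text):
    """Parse CSV text into a list of dicts, one per lexicon entry."""
    return list(csv.DictReader(io.StringIO(text)))


def cross_index(entries):
    """Map each English word to the conlang words that point at it.

    Entries whose english_word field is empty (no close English
    equivalent) are left out of the index.
    """
    index = defaultdict(list)
    for entry in entries:
        english = entry["english_word"].strip()
        if english:
            index[english].append(entry["word"])
    return dict(index)


def count_by_appendix(entries):
    """Count entries per appendix flag, e.g. 'A' (main) vs 'C' (names)."""
    return Counter(entry["appendix"] for entry in entries)


entries = load_entries(SAMPLE_CSV)

# Print the cross-index as bare pointers, sorted case-insensitively.
for english, words in sorted(cross_index(entries).items(),
                             key=lambda item: item[0].lower()):
    print(f"{english}: {', '.join(words)}")

print(count_by_appendix(entries))
```

Because the index maps each English word to a list, apparent duplicates in the "English Word" column simply end up as several pointers under one heading, and the reader is expected to look up each conlang word for the full definition.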