Re: Lexicon counting (was: Weekly Vocab #1.1.1...)
From: H. S. Teoh <hsteoh@...>
Date: Monday, September 4, 2006, 2:50
On Sun, Sep 03, 2006 at 09:07:53PM -0500, Feaelin Moilar wrote:
> > This has boosted the lexicon to 267 entries.
>
> I've occasionally wondered and only now been motivated to ask, when
> one counts the entries, what do you count? I'm presuming only one form
> of the word (in the situation of conjugation, declensions, and the
> like) and myself I would exclude an whole group of entries in my data
> that are "famous names".
[...]
I've asked the same question before. The Tatari Faran lexicon has
several 'common proper noun' entries for common native names, as well as
irregular word forms whose root form isn't straightforward. There are
also some phrases for common expressions. I recognized this early on,
and made my lexicon tool omit these entries from the official lexicon
entry count.
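A minimal sketch of that kind of filtered count, in Python. This is not
the actual lexicon tool: the entry kinds, the name count_entries, and
the sample headwords are all made up for illustration, and the only
thing taken from the description above is that proper nouns, irregular
forms, and phrases are left out of the official count.

```python
from collections import Counter

# Hypothetical entry kinds; the real tool's categories are not given here.
EXCLUDED_KINDS = {"proper_noun", "irregular_form", "phrase"}


def count_entries(entries):
    """Return (official_count, per_kind_tallies) for (headword, kind) pairs."""
    # Tally every entry by kind so the excluded ones stay visible.
    tallies = Counter(kind for _headword, kind in entries)
    # The official count skips the excluded kinds.
    official = sum(n for kind, n in tallies.items() if kind not in EXCLUDED_KINDS)
    return official, tallies


# Placeholder data, not real lexicon entries.
sample = [
    ("headword_a", "noun"),
    ("headword_b", "noun"),
    ("headword_c", "verb"),
    ("headword_d", "proper_noun"),
    ("headword_e", "irregular_form"),
    ("headword_f", "phrase"),
]

official, tallies = count_entries(sample)
print("official entries:", official)
print("all entries:", sum(tallies.values()))
```

Keeping the per-kind tallies around means the excluded entries can still
be reported separately instead of silently disappearing.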
However, a trickier problem remains with the TF complements: a class of
words that serve as (mandatory) clause terminators and predicate
reinforcers. Although they have English glosses, they are (almost)
never taken literally, being there only to add nuance and overtones.
Should they be counted as "real entries" or not? Every verb and
adjective is paired with one or more complements (which may or may not
be unique across verbs/adjectives). As of this writing, there are 201
unique complements, 191 verbs, and 82 adjectives. Many pairings are
unique, and so the complements don't really add to what can be expressed
in the language. OTOH, they do add nuance, esp. in non-unique pairs
where varying the complement gives different nuances to the same clause.
Including them gives a false impression of how much can be
expressed in TF, since it's really equivalent to a much smaller language
without complements. But you can't completely omit them either, since
they *are* an integral part of TF's vocabulary.
[...]
> Since the original reason I tinkered with a conlang was to give a
> consistent feel to the fictional names I was using, early on, 'famous
> names' made up most of the entries. Now, the other entries represent
> about 4/5th's of my data, and the names are the rest. :)
Nice. The TF lexicon doesn't have that many proper names. And it seems
that conventions differ in natlang dictionaries: for example, many
English dictionaries don't list common proper names, but some do list
names of well-known people (usually historical or significant in some
other way). Some dictionaries may even have different editions that
include/exclude these entries.
So it seems the jury is out on this one.
T
--
Why ask rhetorical questions? -- JC