Re: Shareable/centralizable dictionary server software? (WAS: Size of your dictionary)
From: Alex Fink <000024@...>
Date: Saturday, April 4, 2009, 3:47
On Fri, 3 Apr 2009 20:32:21 -0700, Sai Emrys <saizai@...> wrote:
>On Fri, Apr 3, 2009 at 8:19 PM, Alex Fink <000024@...> wrote:
>>>* every entry belongs to
>>>- root(s) (e.g. _kitaab_ -> *ktb; same can be used for etymologies)
>>
>> But there should be a more flexible etymology feature (one that lets me
>> specify an exact preform, or irregular developments, or ...) too. Even if
>> just a flat text field, though that's unintelligent.
>
>I think you misunderstood.
>
>My proposal is that every entry can be derived from (i.e. belong to)
>multiple other entries (typically just 1, but hey).
>
>So for example if you have an entry for qux (in modern fooish), you
>could say that it derives from another entry, qukh (in middle fooish),
>with a sibling kukh (which means something else in modern fooish),
>etc.
This is a good start. But what I was getting at is that there's more to an
etymology than a source word. For one, I may want to track the precise
proto-form, which is more information than just a single lemma in the
lexicon: maybe my word came from a _particular_ derivational or inflectional
form of a proto-word, maybe it came from a coalescence of two proto-words,
etc. For two, I may want to remark on particularities of the development
itself, like irregular sound changes.
Two is reasonably addressed by just adjoining another text field. One
depends on how your model treats forms as opposed to the stems they come
from -- how does it? (This bears on the irregular forms thing too.)
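A minimal sketch of that distinction in Python, assuming a design where the
etymology is its own record rather than a bare link between entries. It is
only an illustration, not the proposed server's schema: the names Entry,
Etymology, preform, notes and describe are all hypothetical, and the preform
*qukh-s and its note are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One lemma in the lexicon (hypothetical structure)."""
    lemma: str
    language: str
    etymology: Optional["Etymology"] = None


@dataclass
class Etymology:
    """How an entry came about: more than a pointer to a source word."""
    sources: List[Entry]           # usually one; two or more for a coalescence
    preform: Optional[str] = None  # the exact proto-form, if not just the lemma
    notes: str = ""                # free text, e.g. irregular sound changes


def describe(entry: Entry) -> str:
    """Render an entry's etymology as a single line of text."""
    ety = entry.etymology
    if ety is None:
        return f"{entry.lemma}: no etymology recorded"
    sources = " + ".join(f"{s.lemma} ({s.language})" for s in ety.sources)
    line = f"{entry.lemma} < {sources}"
    if ety.preform:
        line += f", via preform {ety.preform}"
    if ety.notes:
        line += f" [{ety.notes}]"
    return line


# Invented example data: qux derives from an inflected form of qukh.
qukh = Entry("qukh", "middle fooish")
qux = Entry(
    "qux",
    "modern fooish",
    Etymology([qukh], preform="*qukh-s", notes="irregular loss of -s"),
)

print(describe(qux))
```

The list of sources covers the coalescence case, the preform field covers
"one", and the notes field is the flat text field that covers "two". It
deliberately leaves open how inflected forms relate to stems.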
>>>... and has:
>>>- an xsampa, UTF8 romanization, and UTF8 custom font form
>>
>> UTF8? Who's gonna have their conscript in Unicode?
>
>My presumption (perhaps inaccurate?) is that any custom font will use
>Unicode underlyingly - i.e. you type some string of Unicode and it
>outputs as something fancy in that font.
>
>So this would be just <font face="yourspecialfont">foobár</font>
>really. Simplest way I could think of supporting it.
Oh fine. That is of course Wrong from the Unicode purists' perspective, though!
Alex