Re: Universal Language Dictionary revision
From: Herman Miller <hmiller@...>
Date: Monday, December 11, 2006, 4:52
rick@harrison.net wrote:
> Herman, thanks for your comments. In my future "Hdict" project I'm
> going to have each item explicitly marked with its antonyms or other
> words that might be regularly derived from it. So "like" (the
> preposition) would be tagged with a link to "similar" (the adjective),
> and so forth. Concepts that can be expressed as compounds of other
> concepts will be tagged also; "volcano" will have a note indicating
> that some natlangs express it as a compound of "fire" plus "mountain."
Sounds like an ambitious project!
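
A rough sketch in Python of how links like these might be stored and then used to derive a compound. The field names (`derived`, `antonyms`, `compound_of`), the `build_word` helper, and the two conlang roots are all hypothetical, made up for illustration; nothing here is taken from ULD or Hdict.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    gloss: str                                   # English gloss, e.g. "volcano"
    pos: str                                     # part of speech
    antonyms: List[str] = field(default_factory=list)
    derived: List[str] = field(default_factory=list)      # regularly derivable words
    compound_of: List[str] = field(default_factory=list)  # glosses of the parts


# A few entries taken from the examples in the quoted message.
ENTRIES: Dict[str, Entry] = {
    e.gloss: e
    for e in [
        Entry("like", "preposition", derived=["similar"]),
        Entry("similar", "adjective"),
        Entry("fire", "noun"),
        Entry("mountain", "noun"),
        Entry("volcano", "noun", compound_of=["fire", "mountain"]),
    ]
}


def build_word(gloss: str, roots: Dict[str, str]) -> str:
    """Return a conlang word for `gloss`.

    Uses the root if one is given; otherwise falls back to joining
    the words for the parts listed in `compound_of`.
    """
    if gloss in roots:
        return roots[gloss]
    parts = ENTRIES[gloss].compound_of
    if not parts:
        raise KeyError(f"no root and no compound recipe for {gloss!r}")
    return "".join(build_word(p, roots) for p in parts)


# Invented roots for an imaginary conlang.
roots = {"fire": "pa", "mountain": "tolu"}
print(build_word("volcano", roots))  # patolu
```

Plain concatenation is only the simplest possible rule; a real generator would presumably need per-language choices about part order and linking sounds.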
> These links will make automatic vocabulary generation easier, so
> everybody can have a conlang - or a thousand conlangs.
>
> I guess some pronouns can be added to ULD. Some natlangs don't have
> free-standing pronouns but I think all of the languages currently
> included in ULD do have them. Not sure about Tsolyani; don't have any
> literature about that language in my collection.
Tsolyáni has a bunch. You could just list them all, rather than trying
to describe the differences in rank.
I: lín, lú, lúm, lukán, salúm, kosalúm
we (inclusive): lúmi
we (exclusive): lúmama
you (singular): tsám, túsmi, túsmidàli, mìsritúsmidàli
you (plural): tlúmi, tlúmiyel, túsmidàli, mìsritúsmidàli
this; he, she, it: másun, máisur, komáisur, srǜnosanmáisurdàlidàlisa
these; they: mssúran, mssúri, komssúri
Some of these have limited usage: in particular, "kosalúm" and
"srǜnosanmáisurdàlidàlisa" are used in reference to the emperor or
empress only. Additionally there's a whole bunch of extra "special"
pronouns meaning "you" in various contexts.
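
One way "just list them all" could look as data, sketched in Python: every form is kept under its gloss, with an optional free-text usage note. The layout and the `forms` helper are assumptions for illustration, not the ULD format; the word forms and the one usage note are those given above.

```python
IMPERIAL = "used in reference to the emperor or empress only"

# gloss -> {form: usage note, or None when no restriction is noted above}
PRONOUNS = {
    "I": {
        "lín": None, "lú": None, "lúm": None, "lukán": None,
        "salúm": None, "kosalúm": IMPERIAL,
    },
    "we (inclusive)": {"lúmi": None},
    "we (exclusive)": {"lúmama": None},
    "you (singular)": {
        "tsám": None, "túsmi": None, "túsmidàli": None,
        "mìsritúsmidàli": None,
    },
    "you (plural)": {
        "tlúmi": None, "tlúmiyel": None, "túsmidàli": None,
        "mìsritúsmidàli": None,
    },
    "this; he, she, it": {
        "másun": None, "máisur": None, "komáisur": None,
        "srǜnosanmáisurdàlidàlisa": IMPERIAL,
    },
    "these; they": {"mssúran": None, "mssúri": None, "komssúri": None},
}


def forms(gloss, include_restricted=True):
    """List the forms for a gloss, optionally skipping those with a usage note."""
    return [
        form
        for form, note in PRONOUNS[gloss].items()
        if include_restricted or note is None
    ]


print(forms("I"))
print(forms("I", include_restricted=False))
```

A single note field per form is enough for the emperor-only cases, but it would not capture the differences in rank among the other forms.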
> Hello, goodbye, thank you and so forth -- these would be tricky in
> some languages because you might have to choose from many options
> depending on your gender, social status, time of day and so forth.
> There aren't enough comment fields or room for annotations in the
> current ULD data structure, but I will try to think of an elegant way
> to add this kind of material to the future Hdict project.
Some of those would be trickier than others, but other words in the list
have the same problem (different ways to translate the same or related
ideas). It's just something I noticed when I was filling in the gaps in
my own vocabulary list by referring to the ULD; most gaps in the ULD
vocabulary such as "elbow" have an obvious place they would go (next to
related words such as "arm" or "knee"), but words like "please" and
"sorry" didn't seem to fit anywhere.
> Personally when I look at basic vocabulary wordlists, one lack that I
> see is interjections and discursive flavor words. On wikipedia there's
> a list of the 1000 most frequent words in TV shows and movies,
> apparently made by analyzing the text taken from closed captioning data
> embedded in the video. This is interesting because it approximates the
> way English is actually spoken by real people. The list includes: oh,
> yeah, okay, uh, huh, hey, hell, um, hmm, ah, damn, ha, whoa, wow,
> alright, mm, sh_t, f_ck, ooh, y'know, ow, and mmm. I was surprised that
> "oops" did not make the list.
That could be a useful list to refer to (but talk about being hard to
translate!). Still, some of these probably have equivalents in enough
languages (even if not in your average bilingual dictionary) that it
makes sense to include them. E.g. Japanese has a word "daijōbu" 大丈夫
that's a rough equivalent of "okay" or "all right" in English.
> How about the ULD's transition to XML and UTF-8? Pretty exciting,
> isn't it. I was determined to resist XML because it's so damn ugly, but
> then I thought, what the hell, I'll do something trendy for a change.
I don't know enough about XML to comment, but I'm sure it'll get some use.