Re: Dublex (was: Washing-machine words (was: Futurese, Chinese,
From: Jeffrey Henning <jeffrey@...>
Date: Wednesday, May 15, 2002, 15:45
And Rosta <a-rosta@...> comunu:
> Do you take into account frequency (of how many tokens of the root occur
> in a text of a given size) as well as productivity (the number of
> compounds a root occurs in)?
Since I am still writing the Dublex reference grammar, I have not done very
many translations yet. I was thinking of using English word frequency data,
but because of English polysemy that would only be useful if done on
semantically tagged corpora.
My initial analysis will be based solely on productivity, as that is
decidedly simple to calculate from my lexicon.
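A minimal sketch of that productivity count, assuming the lexicon can be exported as a mapping from each compound to its component roots. The dictionary format and the function name are hypothetical, and the root segmentations are inferred from the glosses in this post:

```python
from collections import Counter

# Hypothetical export format: compound -> tuple of its component roots.
lexicon = {
    "vicdorv": ("vic", "dorv"),
    "dorvcand": ("dorv", "cand"),
    "dorvmahin": ("dorv", "mahin"),
    "dorvfonhim": ("dorv", "fon", "him"),
    "nominsint": ("nomin", "sint"),
    "vocjorsint": ("voc", "jor", "sint"),
}

def productivity(lexicon):
    """Return a Counter of how many compounds each root occurs in."""
    counts = Counter()
    for roots in lexicon.values():
        # set() so a root repeated inside one compound is counted once.
        counts.update(set(roots))
    return counts

if __name__ == "__main__":
    # List roots from most to least productive.
    for root, n in productivity(lexicon).most_common():
        print(root, n)
```

Token frequency would need a tagged corpus, but this count needs nothing beyond the lexicon itself.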
> I can see how your methods allow you to discard roots that prove to be
> relatively useless
Every compound in the dictionary using a deprecated root will need to be
replaced with a new compound. For instance, if 'nomin' ("noun") is
deprecated, as seems likely, a word like 'nominsint' (noun+science)
"grammar -- the study of how words combine to form sentences" will be
replaced by a new compound; perhaps 'vocjorsint' (word+connection+science).
Theoretically it would be easy to write a translator to translate texts in
older versions of Dublex to the new standard by simply replacing
deprecated compounds with current compounds.
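Such a translator could be sketched roughly as follows, assuming compounds appear in running text as uninflected whole words. The function name and the mapping are hypothetical; the single entry reuses the 'nominsint' example, whose replacement is only a tentative suggestion:

```python
import re

# Hypothetical mapping: deprecated compound -> current compound.
replacements = {"nominsint": "vocjorsint"}

def modernize(text, replacements):
    """Replace each deprecated compound that appears as a whole word."""
    if not replacements:
        return text
    # Longest keys first, so a longer compound wins over a shorter
    # one that happens to be its prefix.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: replacements[m.group(1)], text)
```

Matching on word boundaries keeps the substitution from touching a longer word that merely contains a deprecated compound.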
> but what do you do with potentially useful roots
> that didn't make it into your basic inventory in the first place?
To be added to the list of roots, a candidate root must have more compounds
than the root that is being deprecated. So to suggest a root for Dublex you
simply need to suggest a form, its meaning and enough compounds to ensure
its "election". Here's an example I have been thinking of:
dorv [English 'door', Hindi /drav/ and Russian /dvir/.] door -- a swinging
or sliding barrier that will close the entrance to a closet, room, building,
outbuilding or vehicle
vicdorv [vehicle+door] door -- a swinging or sliding barrier that will close
off access into a car
dorvcand [door+opening] doorway, door, room access, threshold -- the
entrance (the space in a wall) through which you enter or leave a room or
building; the space that a door can close
dorvmahin [door+machine] door opener -- electromechanical or electronic
device for automatically opening a door, as one to a garage
dorvdon [door+gift] door prize -- a prize awarded by lottery to the holder
of a ticket purchased upon entrance to a party
dorvfonhim [door+sound+tool] doorbell, bell, buzzer -- a push button at an
outer door that gives a ringing or buzzing signal when pushed
dorvnic [door+professional] doorkeeper, doorman, door guard, hall porter,
porter, gatekeeper, ostiary -- someone who guards the entrance to a building
dorvcact [door+action] entrance, entering, entry, ingress, incoming -- the
act of entering
And that's just 15 minutes' work. Clearly 'dorv' is a productive root.
For a formal replacement, it probably makes sense to choose 30 roots at
random and see which of the two roots can form more compounds from those 30
roots. This is necessary because even though I have 5100+ compounds, those
have been added in a haphazard fashion. ("Let's add words for washing
machines! Let's add words for nations!") By randomly sampling 30 roots and
forming the compounds from them, one will be able to objectively demonstrate
that root X is more productive than root Y.
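The sampling step might look something like this sketch. Deciding whether a candidate compound is actually meaningful remains a human judgment, so the code only draws the sample, lists the two-root forms to review, and tallies the ones a reviewer accepted. All function names are hypothetical, and compounds of three or more roots are ignored for simplicity:

```python
import random

def sample_partner_roots(all_roots, candidates, k=30, seed=None):
    """Pick k roots at random, excluding the roots being compared."""
    rng = random.Random(seed)
    pool = sorted(set(all_roots) - set(candidates))
    return rng.sample(pool, k)

def candidate_compounds(root, partners):
    """List every two-root form: root first, then root last."""
    return [root + p for p in partners] + [p + root for p in partners]

def score(root, partners, accepted):
    """Count the candidate forms that a human reviewer accepted."""
    forms = candidate_compounds(root, partners)
    return sum(1 for form in forms if form in accepted)

if __name__ == "__main__":
    # Tiny illustration with three partner roots instead of 30.
    partners = ["vic", "cand", "mahin"]
    accepted = {"vicdorv", "dorvcand", "dorvmahin"}
    print(score("dorv", partners, accepted))   # 3
    print(score("nomin", partners, accepted))  # 0
```

Using the same sampled partners for both roots, and fixing the seed, makes the comparison repeatable.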
> Lastly, I have the impression that in natlangs that have relatively
> small inventories of roots, these roots tend to have rather fuzzy
> meanings that get applied to new concepts by means of chains of
> polysemy. (As a ready, though imperfect, example, see the American
> Heritage list of Indoeuropean roots, available online somewhere at
> bartleby.com, I think.) The result is that roots tend to suggest
> (with varying degrees of vagueness) rather than determine the
> meanings of words. Is Dublex like this?
I think the Dublex terms are much more precise. You can see the formal
definitions here and judge for yourself:
http://www.langmaker.com/x.htm#lav
But in actual usage the roots do get generalized in compounds, and I plan on
formally documenting and adjusting for that. For instance, right now the
entry for 'lav' says:
lav: washing -- the process of cleansing using water and/or chemicals
[From Latin 'lavatio', extant in Romance (Spanish, Italian), English and
auxiliaries (Esperanto, Novial).]
I plan on analyzing every root and how it has naturally been extended in
compounds and adjusting the definition accordingly. So I might redefine
'lav' as follows:
lav: cleaning, cleansing, cleanup, washing -- the act of making something
clean, especially using water and chemicals
I might even say that "washing" is now 'vatlav' (water+cleaning).
I love the description "engineered language" (does that make me a language
engineer?!) and I appreciate any input on how to quantitatively measure the
effectiveness of Dublex's minimal lexicon.
Best regards,
Jeffrey
http://jeffrey.henning.com
http://www.langmaker.com