Re: from Cucumis.org: Help: Language equivalences
From: Jim Henry <jimhenry1973@...>
Date: Friday, March 7, 2008, 20:08
On Fri, Mar 7, 2008 at 2:25 PM, <MorphemeAddict@...> wrote:
> Language equivalences
>
> This is the list of the number of characters needed to translate an English
> text of 100 characters into each language. These values determine the cost
> (number of points) of each text submitted. The values are automatically improved
That's better than no data, but it doesn't say much about whether,
and by how much, texts tend to expand (or shrink) when translated into
another language. Basing it on length in characters rather than length
in phonemes or syllables or morphemes tells us very little, unless
we can compare two languages on the list with roughly phonetic
spelling (e.g. Spanish seems to be a bit more verbose than Finnish
or Esperanto by that measure). I don't know about Azerbaijani
or Indonesian, but Irish and French have a preponderance of
digraphs and silent letters that bias this kind of measure against them
(but then, so does English, the benchmark of this measurement system);
and all the lowest-ranking languages have logographic or syllabic
scripts.
Even for the purposes Cucumis seems to use it for, this seems
off-base, especially w.r.t. the languages with logographic or
syllabic scripts. If
it's supposed to determine how much it costs to get something
translated, shouldn't they look for a measure of how similar or
different the languages are, rather than how concise they are?
(Or maybe how rare volunteer translators for a given pair of languages are.)
I can't believe translating something from English into Chinese,
a non-IE language, is necessarily a third as hard as translating
into Irish, another IE language, or half as hard as translating
into French or German, IE languages more closely related to English.
--
Jim Henry
http://www.pobox.com/~jimhenry