Re: Programmers requested for dictionary

From: Boudewijn Rempt <bsarempt@...>
Date: Friday, October 27, 2000, 19:03
On Fri, 27 Oct 2000, Peter Clark wrote:

> This is a cry for help: as my language, Enamyn, grows, so do my
> problems with my dictionary. Paper and pencil just don't cut it. I have
> been looking on the web for a dictionary program, or more accurately,
> for a dictionary creator. So far, no luck.
I have done a very simple dictionary program (Python CGI/MySQL) for the Valdyan dictionary Irina uses - it doesn't have everything you ask for, although if you're reasonably adroit with SQL queries, you can get everything out. A more ambitious project, which Taliesin has mentioned already, is Kura: Kura will be able to do most of what you want, and more. It can already link between texts and lexicon, for instance. So you can click on a word in the text and find its definition in the dictionary, _and_ see all lines where that word occurs in all texts in that language. I'll take your requirements, and look at how they fit with the current snapshot of Kura:
> I know there are several programmers on this list. (I sincerely
> wish I was one of them!) With all this talent, I don't think that it
> would take too long to create a cross-platform dictionary creator and
> reader. (I use Linux, but such a program should work on Windows and
> Macs.) Think of the benefit for the whole list! All we need is for
> several programmers to come forward and lead the project.
It's Linux only, for the moment: the gui needs KDE 1.1.2 (but I'm converting to cross-platform Qt 2.2.1). You need a MySQL database. I think it would be foolish to try and construct some multi-user data storage by hand.
> It wouldn't need a gui at first, although later down the road
> that would be nice. Here's what I would like to see, feel free to add
> your own ideas:
> - Data entry with automatic sorting. Sorting should be by the
> "alphabet" used; for instance, English abcde, Russian abvgde, etc. So
> there would have to be some way to set up the program to understand what
> word is being entered in what language so that it would sort correctly.
> (It should also be able to handle things like sh, th, ch, dzh, etc.)
That's a presentation matter: simply rig the select so it returns the data in order. That's how Irina's (http://www.valdyas.org/irina/valdyas/taal/dictionary/index.html) and my (http://www.valdyas.org/andal/languages/denden/grammar/lexicon.html) dictionaries are produced. This can be as fancy as your report-writing abilities ;-).
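For an "alphabet" the database cannot order by itself, one option is to sort the rows after the select. What follows is only a minimal Python sketch of that idea, not how the dictionaries linked above are actually produced. The alphabet list and the name make_sort_key are made up for illustration; multi-character letters such as "sh" or "dzh" are treated as single units.

```python
# Hypothetical alphabet for illustration only: each item is one "letter",
# so "ch", "dzh", "sh" and "th" sort as single units.
ALPHABET = ["a", "ch", "d", "dzh", "e", "h", "i", "s", "sh", "t", "th", "u", "z"]


def make_sort_key(alphabet):
    """Build a key function for sorted() that follows the given letter order."""
    rank = {letter: index for index, letter in enumerate(alphabet)}
    # Try the longest letters first, so "dzh" is matched before "d".
    by_length = sorted(alphabet, key=len, reverse=True)

    def sort_key(word):
        lowered = word.lower()
        key = []
        pos = 0
        while pos < len(lowered):
            for letter in by_length:
                if lowered.startswith(letter, pos):
                    key.append(rank[letter])
                    pos += len(letter)
                    break
            else:
                # Character not in the alphabet: sort it after all known letters.
                key.append(len(alphabet) + ord(lowered[pos]))
                pos += 1
        return key

    return sort_key


words = ["sha", "su", "sa"]
ordered = sorted(words, key=make_sort_key(ALPHABET))
```

With this alphabet "sh" is a letter of its own that follows "s", so "sha" ends up after "su", whereas a plain string sort would put it before.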
> - Cross-linked entries. This should be automatic. If I enter a
> word in Enamyn (let's say "vyl" /vl=/), then give its translation as
> "one, single, alone," then under the English section, "vyl" should be
> listed under those three words. There should also be some way to enter
> phrases, too.
More difficult. There are some provisions for this, but I'm working on a better solution. Basically, the problem is that there is seldom a one-to-one mapping.
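One common way to model a mapping that is not one-to-one is a separate link table between words and meanings. The sketch below is an assumption for illustration, not Kura's actual schema: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and function names (lexeme, gloss, lexeme_gloss, add_entry, forms_for_gloss) are invented.

```python
import sqlite3


def open_lexicon():
    """Create an in-memory database with a many-to-many word/meaning layout."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE lexeme (id INTEGER PRIMARY KEY, form TEXT UNIQUE);
        CREATE TABLE gloss  (id INTEGER PRIMARY KEY, meaning TEXT UNIQUE);
        CREATE TABLE lexeme_gloss (
            lexeme_id INTEGER REFERENCES lexeme(id),
            gloss_id  INTEGER REFERENCES gloss(id),
            PRIMARY KEY (lexeme_id, gloss_id)
        );
    """)
    return db


def add_entry(db, form, meanings):
    """Store one word and link it to each of its meanings."""
    db.execute("INSERT OR IGNORE INTO lexeme (form) VALUES (?)", (form,))
    lexeme_id = db.execute(
        "SELECT id FROM lexeme WHERE form = ?", (form,)).fetchone()[0]
    for meaning in meanings:
        db.execute("INSERT OR IGNORE INTO gloss (meaning) VALUES (?)", (meaning,))
        gloss_id = db.execute(
            "SELECT id FROM gloss WHERE meaning = ?", (meaning,)).fetchone()[0]
        db.execute("INSERT OR IGNORE INTO lexeme_gloss VALUES (?, ?)",
                   (lexeme_id, gloss_id))


def forms_for_gloss(db, meaning):
    """Reverse lookup: every word linked to an English meaning."""
    rows = db.execute("""
        SELECT lexeme.form FROM lexeme
        JOIN lexeme_gloss ON lexeme_gloss.lexeme_id = lexeme.id
        JOIN gloss ON gloss.id = lexeme_gloss.gloss_id
        WHERE gloss.meaning = ? ORDER BY lexeme.form
    """, (meaning,))
    return [row[0] for row in rows]


def glosses_for_form(db, form):
    """Forward lookup: every meaning linked to a word."""
    rows = db.execute("""
        SELECT gloss.meaning FROM gloss
        JOIN lexeme_gloss ON lexeme_gloss.gloss_id = gloss.id
        JOIN lexeme ON lexeme.id = lexeme_gloss.lexeme_id
        WHERE lexeme.form = ? ORDER BY gloss.meaning
    """, (form,))
    return [row[0] for row in rows]


db = open_lexicon()
add_entry(db, "vyl", ["one", "single", "alone"])
add_entry(db, "gorjachij", ["hot"])
add_entry(db, "zharkij", ["hot"])
```

Entering "vyl" once makes it show up under "one", "single" and "alone" in the reverse direction, and one English word can point at several native words. What this layout does not capture is the difference in sense between those words, which is the hard part.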
> - Search
> - Entries would be able to display conjugations, mutations,
> endings, etc. I will lapse into Russian here, since I haven't developed
> Enamyn far enough. Let's say I remember that the genitive plural of
> "djengi" (money) is irregular, but forgot what it was exactly. I should
> be able to type in "money" and learn that it is "deneg."
That's in it, but it can't get out - I haven't done the data-entry and presentation code, but the logic is there.
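To make the request concrete, here is a minimal Python sketch of looking up a stored irregular form by its English gloss. It is an illustration under my own assumptions and says nothing about how Kura stores this: the Entry class, the irregular dictionary and find_form are hypothetical names, and only the "djengi"/"deneg" pair comes from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    headword: str
    gloss: str
    # Irregular forms keyed by a free-text slot name (hypothetical layout).
    irregular: dict = field(default_factory=dict)


def find_form(entries, gloss, slot):
    """Return (headword, form) pairs for a gloss; form is None if nothing is stored."""
    results = []
    for entry in entries:
        if entry.gloss == gloss:
            results.append((entry.headword, entry.irregular.get(slot)))
    return results


entries = [Entry("djengi", "money", {"genitive plural": "deneg"})]
found = find_form(entries, "money", "genitive plural")
```

A None in the result only means that no irregular form was recorded for that slot; deriving regular forms would need rules for the language on top of this.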
> - Ability to show words in "native" fonts (would probably have
> to wait for the gui).
This is more difficult than you might think - and not only because X font handling isn't as modern as you might like. But it is certainly doable, and will be in Kura once the conversion to Unicode-aware Python 2 and Qt 2.2.1 is done.
> - For the gui, it would be nice to click on a word in the entry
> and be taken to its definition. This would be handy for such cases like
> "hot," which has three Russian words listed in my dictionary:
> "gorjachij," "zharkij," and "ostrij." By clicking on them, I would learn
> that they mean "hot (solids and liquids)," "hot (air)," and
> "spicy."
Yes, that's basic, that's already available.
> Again, I wish I was a programmer--I've been teaching myself C,
> but haven't gotten very far. (I'm still working on my calendar program
> for the Enamyn calendar.) Next on the list is either perl or python, but
> that won't be for a long while.
I'd skip C and go for Python immediately - at least, if you're going for results. Very little beats Python for pure productivity - certainly not C, nor Java, nor C++ nor any of the other languages I have done things in. (On the other hand, C is not as bad as Befunge 98...)
> However, if our motives are just, and our hearts are pure, I am
> sure that this list can create a decent dictionary program.
A basic dictionary program, with a PyQt gui, that does what you ask _now_ (and nothing more) is not more than a few days' work with Python. If you take it as a language-learning exercise, it would be a bit more challenging. Doing it in a language that requires hand-crafted memory management and doesn't play fair with strings will take weeks.

Boudewijn Rempt | http://www.valdyas.org