
Re: General Purpose Dictionary Generator

From: Gary Shannon <fiziwig@...>
Date: Friday, October 27, 2006, 15:10
--- Alex Fink <a4pq1injbok_0@...> wrote:

> On Thu, 26 Oct 2006 01:17:52 -0700, Arthaey Angosii <arthaey@...>
> wrote:
>
> >On 10/26/06, Alex Fink <a4pq1injbok_0@...> wrote:
> >> This is a great idea. I'd actually been recently thinking that I should
> >> convert my conlangs' lexica to a more structured format (they're currently
> >> in un-marked-up and inconsistently-formatted human-readable text files) so I
> >> could process them by computer; this would be perfect for that.
> >
> >It's also similar to my own conversion from Shoebox to XML. I'm
> >mid-conversion, but I do have an XML Schema. Perhaps it can be used as
> >a basis for this program, or at least to spur discussion? In either
> >capacity, it might prove helpful.
>
> I'd forgotten about Shoebox; it might be a good idea for this program to
> accept Shoebox input in some form, perhaps by first running it through a
> converter like (or identical to?) yours.
>
> I remember getting the impression the last time I looked at Shoebox's format
> that it was interlinear-centric (which makes sense). IIRC the main
> definition field is the gloss, suitable for interlinear use, and of course
> you can also have a proper definition and more explicatory notes but it's
> the gloss that's primary. It looks like your schema follows this, and
> Gary's proposals seem to have a similar leaning ("hazy" as definition,
> "marked by the presence of haze" as note). My own preference would be to
> make the longer definition primary and the gloss/metalanguage search key
> secondary; this way it's the language-internal divisions of semantic space
> and not the equivalences to some other language that are at the forefront.
>
> Looking through your tech page it looks like you've actually got a number of
> components of what Gary's planning to write. Are they very
> Asha'ille-specific and specific to your formatting, or could they
> generalise? We might have starting points for a number of aspects of the
> project at hand, in the latter case.
>
> Alex
>
Arthaey,

Your XML schema looks good.

Alex, re: "My own preference would be to make the longer definition primary"

What I have in mind for the dictionary converter is that anyone could set up the database in any way they wanted, with whatever fields they care to define. The primary fields are whatever you declare the primary fields to be. The formatting program will do whatever you want it to do with whatever fields you tell it to use. (See )

The program itself will make no assumptions about what a field "means". It's just a box that holds some text. If you want that column to hold something you call "definition" then that's what you put in the column. How you use that column determines what it means to you. Where you put it in the dictionary format template determines how it appears in the final dictionary.

I might use column 1 for part of speech and you might use column 1 for English word. It makes no difference to the database program or dictionary formatter. Either way it's just a column of text items to be inserted, searched, modified, formatted, printed, or turned into a PDF or HTML output file.

--gary
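
To make the "a field is just a box that holds some text" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the thread does not name a language, and the function name `format_entries`, the field names, and the use of `string.Template` as the "dictionary format template" are hypothetical, not the planned program's design. The sample text reuses the "hazy" example quoted above.

```python
from string import Template

def format_entries(rows, field_names, template):
    """Render each row through a user-supplied template.

    The formatter never interprets a field: it only pairs each column
    with the name the user declared and substitutes the text in place.
    (Hypothetical helper; not part of any existing program.)
    """
    tpl = Template(template)
    return [tpl.substitute(dict(zip(field_names, row))) for row in rows]

# User A puts part of speech in column 1.
rows_a = [("adj.", "hazy", "marked by the presence of haze")]
fields_a = ["pos", "gloss", "note"]
template_a = "$gloss ($pos): $note"

# User B puts the English word in column 1 and leads with the long definition.
rows_b = [("hazy", "adj.", "marked by the presence of haze")]
fields_b = ["english", "pos", "definition"]
template_b = "$definition [$english, $pos]"

for line in format_entries(rows_a, fields_a, template_a):
    print(line)
for line in format_entries(rows_b, fields_b, template_b):
    print(line)
```

The same function serves both layouts: which field is "primary" is decided entirely by the field names and the template each user supplies, not by the code.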