Boudewijn Rempt wrote:
> I've been working on and off on ideas for a bit of software to help in
> analyzing and describing a language, along with my ideas for an ideal
> grammar (http://www.xs4all.nl/~bsarempt/conlang/dream.html), and I've
> prepared a design draft (which is also intended for consumption by the
> non-conlang linguistic community). I'd like you all to comment upon it:
>
> The Summer Institute of Linguistics has for years provided the gratis
> software package Shoebox, which is a capable single-user linguistics
> database and analysis tool. However, this is a closed-source
> application that only runs on non-standard and volatile environments
> like Microsoft Windows and Apple Macintosh.
You are too kind to M$ and Apple ...
> Also from SIL is CELLAR,
SIL also has KIMMO and PC-PATR or some such, which run on Unix, IIRC.
But they do different things from what you propose.
> The language of development is Python, the back-end any SQL database.
> The current implementation of the back-end is on MySQL, but
> PostgreSQL, Oracle, Sybase, DB2, mSQL or Ingres should work too, as
> long as there is a standard Python DB interface available. Two separate
> interfaces are intended: a web-server based interface and a graphical
> interface for the Unix KDE desktop. A modular design will facilitate
> the development of other interface components.
A Postgres database and a web interface (CGI?) are good!
Anyone can install Red Hat Linux to get these automatically.
Your software can be packaged as an RPM for easy installation.
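
To make the "any SQL database" point concrete, here is a minimal sketch of
back-end-neutral access through the standard Python DB interface. It is only
an illustration under assumptions: sqlite3 stands in for the MySQL or Postgres
driver, and the "language" table and the helper names (open_store,
add_language, list_languages) are hypothetical, not taken from the draft.

```python
import sqlite3  # stand-in driver; other DB-API modules offer the same connect/cursor/execute shape


def open_store(connect, *args, **kwargs):
    """Open a connection using whichever DB-API connect() the caller passes in."""
    conn = connect(*args, **kwargs)
    cur = conn.cursor()
    # Hypothetical table, loosely modelled on the "Language" entity of the draft.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS language ("
        "code VARCHAR(8) PRIMARY KEY, name VARCHAR(64) NOT NULL)"
    )
    conn.commit()
    return conn


def add_language(conn, code, name):
    """Insert one language row."""
    cur = conn.cursor()
    # Placeholder style differs per driver ("?" here, "%s" elsewhere);
    # each driver module reports its own style in its `paramstyle` attribute.
    cur.execute("INSERT INTO language (code, name) VALUES (?, ?)", (code, name))
    conn.commit()


def list_languages(conn):
    """Return all (code, name) rows, ordered by code."""
    cur = conn.cursor()
    cur.execute("SELECT code, name FROM language ORDER BY code")
    return cur.fetchall()


if __name__ == "__main__":
    conn = open_store(sqlite3.connect, ":memory:")
    add_language(conn, "nl", "Dutch")
    print(list_languages(conn))
```

Only the connect() call and the placeholder style should need to change when
swapping the back-end, which is presumably what the modular design is after.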
> Linguists can use Python to add their own modules to Kura.
I fail to suppress my Perl preference here ...
> Unicode support will be an essential feature of
> the finished application.
Agreed. So it's Python + Qt ...
> Functions
>
> * The administration of recordings
> * The administration of scanned manuscript data
> * The entering and administration of transcribed data, aided by the
> recorded and scanned data.
> * The semi-automated morphological and phonological analysis of
> transcribed data, with reference to the underlying recorded and
> scanned data.
> * The entering and administration of general linguistic notes,
> related to analysed data.
> * The production of interlinear texts in XML, HTML and plaintext
> formats.
> * The production of bilingual lexicons, etymologies and comparative
> lexicons.
> * The querying of analysed data for phonological, morphological,
> syntactical and lexical phenomena, within and across languages.
> * The administration of attributions and references to work within
> and outside the application.
>
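
For the plaintext flavour of the interlinear output, something along these
lines might do as a starting point. It is a rough sketch, not the draft's
design: the function name interlinear and the three-line layout (words,
glosses, free translation) are my assumptions, and the Dutch sentence is just
an illustration.

```python
def interlinear(words, glosses, translation):
    """Align each word with its gloss in columns; add a free translation line."""
    if len(words) != len(glosses):
        raise ValueError("each word needs exactly one gloss")
    # Column width is the longer of word and gloss. len() counts code points,
    # so combining diacritics or wide characters would need extra care.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    word_line = " ".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    gloss_line = " ".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return "\n".join([word_line, gloss_line, "'" + translation + "'"])


if __name__ == "__main__":
    print(interlinear(["ik", "zie", "huizen"],
                      ["1SG", "see.PRS", "house-PL"],
                      "I see houses"))
```

The XML and HTML formats could reuse the same word/gloss pairing and differ
only in how the columns are rendered.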
> Entities
>
> * Administrative Entities
> + Language
> + Linguist
> + Project
> * Language Data
> + Sound files
> + Graphics files
> + Textual (transcribed) data
> * Phonetic data
> * Lexical data
> * Structural (grammatical) data
I am totally lost in wonder. It looks like several man-years of work,
and of a highly specialized linguistic variety.
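
To get a feel for how the entities might hang together, here is a small
sketch in Python. Everything beyond the entity names in your list is assumed:
the fields, the Text record that ties a transcription to its recording and
scan, and the texts_in helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Language:
    code: str
    name: str


@dataclass
class Linguist:
    name: str


@dataclass
class Recording:
    path: str  # location of a sound file


@dataclass
class Scan:
    path: str  # location of a graphics file (scanned manuscript)


@dataclass
class Text:
    """Transcribed data, optionally linked to the recording and scan behind it."""
    language: Language
    transcription: str
    recording: Optional[Recording] = None
    scan: Optional[Scan] = None


@dataclass
class Project:
    title: str
    linguists: List[Linguist] = field(default_factory=list)
    texts: List[Text] = field(default_factory=list)

    def texts_in(self, language_code):
        """Return the texts of this project that belong to one language."""
        return [t for t in self.texts if t.language.code == language_code]
```

Phonetic, lexical and structural data would presumably hang off Text in a
similar way, but I won't guess at those.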
> Lawler, John M. and Helen Aristar Dry. 1998. Using Computers in
> Linguistics. Routledge.
Any chance of getting J.L. involved directly?