
Re: XML for linguists?

From:David G. Durand <david@...>
Date:Tuesday, November 23, 1999, 4:46
At 6:05 PM -0500 11/20/99, And Rosta wrote:
>David Durand:
>It sounds like you can enlighten me on some things I'm ignorant about
>but have been wondering about.
>
>I learnt SGML back around 1989/1990 when I was working as an advisor for
>the TEI, but subsequently didn't keep up to date, so I've never learnt what
>HTML and XML are. My impression of HTML was that it was essentially a
>(somewhat crappy) SGML DTD, and of XML that it is a less crappy DTD. Is
>that right, or have I fundamentally misunderstood?
HTML is a crappy DTD, plus some non-SGML syntax rules that you have to know about because the HTML developers never used SGML software. XML is like full SGML (not a tagset, but a full definition language with DTDs and everything), but it's had 90% of the cruft from SGML stripped out. SGML had a million features intended to allow typing shortcuts, and they complicated the grammar unbelievably. Most companies that built conforming SGML parsers needed at least 6 man-years to do it. Most XML parsers are written by a single programmer in about a month and a half.
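To make the "full definition language" point concrete, here is a minimal sketch of a home-made DTD for a conlang lexicon, read with Python's standard `xml.etree.ElementTree` module. The element names (`lexicon`, `entry`, `headword`, `pos`, `gloss`), the sample words, and the helper `read_lexicon` are all invented for illustration; they are not part of any standard tagset. Note also that this parser only checks well-formedness: it reads past the DTD without validating against it, so a validating parser would be needed to actually enforce the declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagset for a conlang lexicon. The DTD in the internal
# subset declares the markup; the words themselves are made-up sample data.
LEXICON = """
<!DOCTYPE lexicon [
  <!ELEMENT lexicon (entry+)>
  <!ELEMENT entry (headword, pos, gloss+)>
  <!ATTLIST entry id ID #REQUIRED>
  <!ELEMENT headword (#PCDATA)>
  <!ELEMENT pos (#PCDATA)>
  <!ELEMENT gloss (#PCDATA)>
]>
<lexicon>
  <entry id="e1">
    <headword>kata</headword>
    <pos>n</pos>
    <gloss>stone</gloss>
    <gloss>rock</gloss>
  </entry>
  <entry id="e2">
    <headword>mira</headword>
    <pos>v</pos>
    <gloss>to see</gloss>
  </entry>
</lexicon>
"""


def read_lexicon(text):
    """Return {headword: (part of speech, [glosses])} for each entry.

    The document is parsed for well-formedness only; the DTD is not
    used for validation by this parser.
    """
    root = ET.fromstring(text.strip())
    entries = {}
    for entry in root.findall("entry"):
        headword = entry.findtext("headword")
        pos = entry.findtext("pos")
        glosses = [gloss.text for gloss in entry.findall("gloss")]
        entries[headword] = (pos, glosses)
    return entries


if __name__ == "__main__":
    for word, (pos, glosses) in read_lexicon(LEXICON).items():
        print(f"{word} ({pos}): {', '.join(glosses)}")
```

The point of the sketch is only that the tagset is yours to define: the same few declarations could just as well describe grammar sections or interlinear glosses.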
>Working in Word over the last 6 or 7 years, I always thought how much more
>useful it would be if it was SGML-based, not only because document exchange
>would be so much easier but also because for both my professional work and
>my conlang documents it would be useful to be able to define one's own
>DTD and markup, and integrate this with the features (e.g. typographical)
>that come built-in in Word. Is this sort of thing becoming feasible in
>a hassle-free way, at all?
Not really. There are a lot of XML tools coming along, many free, but most are not quite end-user tools at this point, especially in the document preparation area. Don't pin your hopes on the soon-to-be-announced XML support in Office 2000. It looks like it will be XML-compliant with a fixed DTD, offering little more than RTF does.
>[I haven't labelled this message CHAT, because it is relevant to the
>creation of electronic conlang documents, such as dictionaries & grammars.]
>
>--And.
>
>p.s. When the TEI project was being developed, another group somewhere
>were working on a similar project for multimedia, called HiTime (or
>something like that). What became of that?
The ISO created the HyTime standard based on full SGML. Steven DeRose and I wrote a book about it ("Making Hypermedia Work: A User's Guide to HyTime" -- maybe still available from Kluwer). HyTime was the source of some good ideas (but too off topic to continue here). I wouldn't pin my hopes on it, though it is interesting to read about.

  -- David
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
http://www.cs.bu.edu/students/grads/dgd/  \  Director of Development
    Graduate Student no more!              \  Dynamic Diagrams
--------------------------------------------\  http://www.dynamicDiagrams.com/
   MAPA: mapping for the WWW                  \__________________________