
Re: TECH: dumb html question

From: Mark J. Reed <markjreed@...>
Date: Wednesday, January 14, 2004, 14:03
On Tue, Jan 13, 2004 at 03:28:03PM +0100, Benct Philip Jonsson wrote:
> At 14:34 12.1.2004, Mark J. Reed wrote:
> >RM> Does "&schwa;" exist?
> >
> >Nope. Although one of the nice things about XHTML is that XML lets you
> >define your own entities, so that you can add it if you like.
>
> How is that done? Can it be done in a stylesheet?
The later replies on this thread by me and Tristan had an example, but here's what is required:

1. A browser that will apply HTML-type rendering to an XHTML document.

   It appears that Internet Explorer doesn't do this. If it recognizes a document as XHTML and parses it as an XML document, all you get is pretty-printed source rather than a visual rendering. If it parses it as an HTML document instead, then it is rendered, but XML-specific tricks like custom entities don't work. If anyone knows a way to convince IE to both parse as XML and render as HTML, please let me know. Mozilla/Firebird/Gecko works fine; I haven't tried Opera or any other browsers.

2. The document must be recognized by the browser as XHTML, not just HTML.

   That means the Content-Type sent by the web server has to be "application/xhtml+xml", not "text/html". A modern web server will do the right thing if the file is named with a .xhtml suffix; any browser that meets condition 1 will probably also do the right thing if you open a local file with such a suffix. Note that if the web server is not configured properly, you can't fake it with a <meta http-equiv> element, because by the time that element is processed the browser has already decided whether it's HTML or XHTML.

3. The document has to be legal X[HT]ML.

   Among other things, this means that the very first thing in it (no leading whitespace, even) has to be the XML declaration:

       <?xml version="1.0"?>

   An XML document is assumed to be UTF-8-encoded Unicode by default; if you're using another character set, such as Latin-1, you must say so in the XML declaration, like so:

       <?xml version="1.0" encoding="iso-8859-1"?>

   After that you need a <!DOCTYPE> directive (see below), and then you can finally get things rolling with the opening tag of the <html> element, which needs some extra attributes. The rest of the document has to be valid XHTML: lowercase element names, all empty tags explicitly marked, all attributes with double-quoted values, etc.
4. Any custom entities are defined in the DOCTYPE directive.

   XML defines an entire (infinitely large) family of markup languages; the <!DOCTYPE> directive tells the parser which particular language is in use for the document containing it, by pointing to a formal description of that language (called a Document Type Definition, or DTD). For describing a web page, the particular language is XHTML, but even then, there are several dialects to choose from.

   If your web page is using any of the older presentation-type markup (<body bgcolor=>, <b>, <i>, etc.; basically anything that is supposed to be done with stylesheets these days), you need to label it as XHTML 1.0 Transitional:

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">

   Otherwise, you should label it as XHTML 1.1:

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "xhtml11.dtd">

   You also need to specify the default namespace and language in the <html> tag:

       <html xmlns="" xml:lang="en">

   The first time you're trying to get all this working, you should probably make sure the document validates as-is before trying to add the custom entities. You can check it at

   Custom entities are defined by adding <!ENTITY> declarations inside the <!DOCTYPE> declaration. For instance, if you want entities for the IPA vowel symbols:

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "xhtml11.dtd" [
           <!ENTITY alpha  "&#x0251;">
           <!ENTITY talpha "&#x0252;">
           <!ENTITY openo  "&#x0254;">
           <!ENTITY reve   "&#x0258;">
           <!ENTITY schwa  "&#x0259;">
           <!ENTITY eps    "&#x025B;">
           <!ENTITY reveps "&#x025C;">
           . . .
           <!ENTITY ups    "&#x028A;">
           . . .
       ]>

   Then you include those entities in the text just like the built-in ones:

       <p>Phonemically, the word &lt;about&gt; is /&schwa;'b&alpha;&ups;t/</p>

-Mark
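
One way to check that the entity declarations are well-formed and expand as intended, without depending on any particular browser, is to run the document through an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The test document and the paragraph_text helper are hypothetical, made up for illustration. The DOCTYPE here carries only the internal subset (no public or system identifier) and the <html> element has no namespace attribute, on the assumption that we only want to test entity expansion and not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical test document. The XML declaration is the very first thing
# in the string, and the DOCTYPE has only an internal subset, so the parser
# never needs to fetch an external DTD.
DOC = """<?xml version="1.0"?>
<!DOCTYPE html [
  <!ENTITY alpha "&#x0251;">
  <!ENTITY schwa "&#x0259;">
  <!ENTITY ups   "&#x028A;">
]>
<html>
  <body>
    <p>Phonemically, the word &lt;about&gt; is /&schwa;'b&alpha;&ups;t/</p>
  </body>
</html>
"""


def paragraph_text(source: str) -> str:
    """Parse the document and return the text of its first <p> element.

    Raises xml.etree.ElementTree.ParseError if the document is not
    well-formed or refers to an entity that was never declared.
    """
    root = ET.fromstring(source)
    return root.find("./body/p").text


if __name__ == "__main__":
    print(paragraph_text(DOC))
```

If an entity is used without being declared in the DOCTYPE, the parser should reject the document with a ParseError instead of passing the reference through, which makes this a quick way to catch a misspelled entity name before loading the page in a browser.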