
Re: "Self-Segregating Syntax"?

From:Eldin Raigmore <eldin_raigmore@...>
Date:Friday, April 21, 2006, 18:57
On Fri, 21 Apr 2006 02:21:57 +0100, And Rosta <and.rosta@...> wrote:
[snip]
>As for the unambiguous parser, the details have naturally changed greatly
>over time, but there are some constants.
>I. I work with a Dependency Grammar model of syntax, in which syntactic
>structure is a tree (without crossing branches) and there is no
>distinction in type between furcating nodes, unary branching nodes and
>terminal nodes. (This is mainly a notational issue, but it makes for
>maximal simplicity & straightforwardness.)
>II. I stick to the principle of no lookahead, which helps to ensure that
>any parsing algorithm is straightforward & unmindboggling.
>III. Not all nodes need be expressed phonologically.
>
>In the antepenultimate incarnation of the syntax, all mothers preceded
>their daughters, and the lexicogrammar specified for each node how many
>daughters it has. This is a variety of your [3a] above.
>
>In the penultimate incarnation of the syntax, mothers could precede or
>follow their daughters. The lexicogrammar specified for each node how
>many daughters it has, and whether it follows its mother or has no
>mother or precedes its mother as first daughter or precedes its mother
>as nonfirst daughter. This is a mixture of your [1] & [3] (but with the
>'length' encoded on what you might call the 'head').
>
>In the current incarnation of the syntax, mothers follow daughters, all
>mothers have exactly two daughters, and mothers form a closed lexical
>class (which happens to be expressed inflectionally). This could be
>classed as your [2] or your [3].
>
>Anyway, it should be clear from the above that the aim is always to
>find an unambiguous algorithm for building a tree without lookahead.
>And the solution always involves a combination of constraints on tree
>shape plus lexicogrammatical information about the combinatorial
>properties of individual nodes. It's easy to find algorithms that work:
>the design challenge is to find the optimal solution (which needs to
>factor in compositional semantics).
>[snip]
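As a rough illustration of two of the schemes quoted above, here is a minimal Python sketch. It is not And's actual parser: the word lists `ARITY` and `MOTHERS`, the `Node` class, and the function names are all hypothetical, every node is assumed to be pronounced (so constant III is ignored), and mothers are modelled as separate words even though the quoted text says they are expressed inflectionally. `parse_prefix` follows the "mothers precede daughters, lexicon gives the number of daughters" scheme; `parse_postfix_binary` follows the "mothers follow daughters, exactly two daughters, closed class of mothers" scheme. Both consume one word at a time with no lookahead.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    daughters: list = field(default_factory=list)

# Hypothetical lexicon: each word's number of daughters.
ARITY = {"give": 3, "see": 2, "sleep": 1, "cat": 0, "dog": 0, "bone": 0}
# Hypothetical closed class of mother words for the postfix scheme.
MOTHERS = {"AND", "OF"}

def parse_prefix(words):
    """Mothers precede daughters; arity comes from the lexicon."""
    root = None
    open_nodes = []  # stack of [node, daughters still expected]
    for w in words:
        node = Node(w)
        if open_nodes:
            # Attach to the nearest mother that still expects a daughter.
            open_nodes[-1][0].daughters.append(node)
            open_nodes[-1][1] -= 1
        elif root is None:
            root = node
        else:
            raise ValueError("word after a complete tree: " + w)
        if ARITY[w] > 0:
            open_nodes.append([node, ARITY[w]])
        # Close every mother whose daughters have all arrived.
        while open_nodes and open_nodes[-1][1] == 0:
            open_nodes.pop()
    if root is None or open_nodes:
        raise ValueError("incomplete utterance")
    return root

def parse_postfix_binary(words):
    """Mothers follow daughters; every mother has exactly two daughters."""
    stack = []
    for w in words:
        if w in MOTHERS:
            if len(stack) < 2:
                raise ValueError("mother with fewer than two daughters: " + w)
            right = stack.pop()
            left = stack.pop()
            stack.append(Node(w, [left, right]))
        else:
            stack.append(Node(w))
    if len(stack) != 1:
        raise ValueError("utterance does not form a single tree")
    return stack[0]

def show(node):
    """Render a tree as mother(daughter daughter ...)."""
    if not node.daughters:
        return node.word
    return node.word + "(" + " ".join(show(d) for d in node.daughters) + ")"

print(show(parse_prefix("see cat sleep dog".split())))
print(show(parse_postfix_binary("cat dog AND bone OF".split())))
```

In both functions the decision about where a word attaches is made the moment the word is read, which is the sense of "no lookahead" assumed here. The tree shape is fixed entirely by the order of the words plus what the lexicon says about each one.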
Of all the replies up 'til this one, this reply is the most responsive to the intent of my original question about how to unambiguously assign a tree to an utterance. (Most other techniques offered so far would seem to apply only to a tree of at most two levels. While this problem has been mentioned by several others, nobody else up 'til now, as far as I could tell, had offered a solution.)

Thank you. I wish I could see it in operation; I'd also like a little more detail about the general techniques you mentioned above.

-----

What is Dependency Grammar, exactly? Which publications describe it best? And can you explain in a little more detail, perhaps with some examples, why it helps with this question?

-----

In Categorial Grammar, which I understand is "equivalent", somehow, to Tree-Adjoining Grammar, a member of a non-elementary category is an operator that takes a fixed-length list of fixed-position operands, each of a particular previously-defined category, and outputs a member of some previously-defined category.

In natlangs, it appears that the most popular positions for the operator (within the list of operands) are the following:
1. Immediately before the first operand.
2. Immediately after the first operand.
3. Immediately before the last operand.
4. Immediately after the last operand.

Position 1 is "prefix position", also known as "Polish Notation".
Position 4 is "postfix position", also known as "Reverse Polish Notation".
Positions 2 and 3 are "infix position".

Obviously, if there is only one operand, then Position 1 = Position 3 and Position 2 = Position 4; and if there are only two operands, then Position 2 = Position 3. Also, unless there are more than three operands, there is no position which is _not_ on the above list.

-----

The position that the operator takes among its operands is part of the definition of the operator-type, as are the number of its operands and the types of its operands.
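The claims above about which positions coincide can be checked mechanically. The following is a small Python sketch under one assumption of mine: a "slot" k (from 0 to n) means the operator sits after exactly k of its n operands. The function names are made up for illustration.

```python
def popular_slots(n):
    """Map Positions 1-4 to slots for an operator with n >= 1 operands."""
    return {
        1: 0,      # immediately before the first operand
        2: 1,      # immediately after the first operand
        3: n - 1,  # immediately before the last operand
        4: n,      # immediately after the last operand
    }

def uncovered_slots(n):
    """Slots that none of Positions 1-4 can reach."""
    return sorted(set(range(n + 1)) - set(popular_slots(n).values()))

for n in range(1, 6):
    print(n, popular_slots(n), uncovered_slots(n))
```

With one operand, Positions 1 and 3 share slot 0 and Positions 2 and 4 share slot 1; with two operands, Positions 2 and 3 share slot 1. The first slot left uncovered is the middle one with four operands.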
It sounds like you're saying that in your conlangs these facts about the operator-type are always "phonologically coded" into the word for a particular operator. Have I understood you?

eldin
