Re: brz, or Plan B revisited (LONG)

From: Jonathan Knibb <j_knibb@...>
Date: Friday, September 23, 2005, 13:05

Fellow enge-/loglangers,

I am no logician and have not read Jeff's paper. However, if there is
discussion concerning ways of encoding a binary syntactic tree in a linear
string of words, then I faced exactly this problem when designing my T4
(which I prefer to think of as an engelang rather than a loglang), and
perhaps it may be relevant to mention my solution here.

The system depends on a couple of properties of T4 syntax - firstly that
branch points are exclusively binary, and secondly that right-branching
structures are much commoner than left-branching structures.  Rightward
branching is therefore taken as default. The system works by marking each
word with a particular pitch accent.

For example, in the tree:

{  [ A (B C) ]  [ D E ]  }

... the words A, B and D join the tree rightwards (or, equivalently, are
preceded by an opening bracket rather than followed by a closing one), while C
and E branch leftwards. A, B and D are therefore unmarked (default accent). C is marked
as 'final word of first half of sentence' (indicated by an acute accent in
writing), and E as 'final word of sentence' (final accent, not explicitly
marked but assumed before a full stop).
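
To make the reading direction concrete, here is a minimal Python sketch of how
a string using only these three markings might be turned back into a tree. It
is an illustration, not part of T4: the function name decode_simple, the
nested-tuple representation, and the trailing apostrophe standing in for the
acute accent are all assumptions made for the example.

```python
def decode_simple(sentence):
    """Rebuild a tree from a string that uses only the default accent,
    the acute (written here as a trailing apostrophe) and the sentence-final
    accent (implied by the full stop).

    Assumes exactly one acute-marked word and at least one word after it.
    """
    words = sentence.rstrip(".").split()
    # The acute-marked word closes the first half of the sentence.
    split = next(i for i, w in enumerate(words) if w.endswith("'"))
    first = [w.rstrip("'") for w in words[: split + 1]]
    second = words[split + 1:]

    def right_chain(ws):
        # Unmarked words branch rightwards, so each half nests to the right.
        if len(ws) == 1:
            return ws[0]
        return (ws[0], right_chain(ws[1:]))

    return (right_chain(first), right_chain(second))


print(decode_simple("A B C' D E."))
```

With only these three markings available, each half of the sentence can only
be a purely right-branching chain, which is why the sketch needs nothing more
than the position of the acute-marked word.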

Interestingly, for most of the sentences I have actually written so far
(few, I admit, as T4 is *still* in a fragmentary state), these three
markings - default, end-of-first-half, end-of-sentence - suffice. There are
also two other markings indicating that a word branches leftwards: to the
word immediately preceding it but no others (grave accent), and to some
intermediate extent (circumflex accent). Consider the (rather unlikely)
structure:

[ (A B) ( {C [D E]} F )] [G H]

For clarity, the top-level brackets are omitted. Here, A, C, D and G take the
default accent (accent 1) and H the sentence-final accent (accent 3), as above,
and F closes the first half of the sentence and therefore takes the acute
(accent 2).

B is the second word of a two-word phrase, but is not the final word of the
first half of the sentence, and is therefore marked with the grave accent. E
is in an intermediate position - it is the final word of a three-word phrase
(CDE) but again does not close the sentence's first half, so none of the
accents described so far applies. In this situation, the circumflex accent is used,
along with various means of disambiguation in the very unusual cases where
the size of the preceding phrase is unclear.
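
The marking rules described so far can be sketched as a short Python
function. This is one possible reading rather than a definitive statement of
T4: the names encode, size and phrase_accent, the nested-tuple trees, and the
ASCII stand-ins (' for acute, ` for grave, ^ for circumflex) are assumptions
for illustration. Two further assumptions are flagged in the comments: a
one-word first half is given the acute, and no attempt is made at the extra
disambiguation mentioned for unclear circumflex cases.

```python
def size(node):
    """Number of words in a constituent (a word or a pair of constituents)."""
    return 1 if isinstance(node, str) else size(node[0]) + size(node[1])


def phrase_accent(node):
    """Accent for the last word of a phrase that closes neither the first
    half nor the sentence."""
    n = size(node)
    if n == 1:
        return ""    # default accent: the word simply branches rightwards
    if n == 2:
        return "`"   # grave: joins only the word immediately preceding it
    return "^"       # circumflex: some intermediate extent (may be ambiguous)


def encode(tree):
    """Encode a binary tree (nested 2-tuples of words) as an accented string."""
    first_half, second_half = tree
    out = []

    def walk(node, accent):
        # 'accent' is the mark owed to the final word of 'node'.
        if isinstance(node, str):
            out.append(node + accent)
            return
        left, right = node
        walk(left, phrase_accent(left))  # a left branch closes only itself
        walk(right, accent)              # a right branch inherits the mark

    # Assumption: a one-word first half still takes the acute.
    walk(first_half, "'")
    walk(second_half, "")  # final accent is unmarked; the full stop implies it
    return " ".join(out) + "."


tree1 = (("A", ("B", "C")), ("D", "E"))
tree2 = ((("A", "B"), (("C", ("D", "E")), "F")), ("G", "H"))
print(encode(tree1))
print(encode(tree2))
```

Each word is marked according to the largest phrase it closes, so a single
pass over the tree is enough and no closing brackets are needed in the output.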

The examples are coded as:
A B C' D E.
A B` C D E^ F' G H.

Clearly this is a compromise between elegance and efficiency. I'm not aware
of any similar system out there - does anyone recognise it? (Surely no
ANADEW?)

best,
Jonathan.