
One major NLF2DWS flaw: density / compactness

From:Sai Emrys <sai@...>
Date:Thursday, July 6, 2006, 7:22
One (the only?) major flaw I've thought of so far in a NLF2DWS
(non-linear fully 2-dimensional writing system) of the sort I want is
that it would not be very dense, as in, it would have relatively few
symbols per square inch compared to linear text.

I'll break this down into a few issues:

1. Unused (blank) space

Linear text is excellent (probably at or near maximum) at filling the
physical space on the page with the maximum number of symbols
(letters) and leaving very little space unused.

A nonlinear graph-type system would of necessity (I think?) have some
blank space; possibly significant amounts, depending on design,
because you would need to make its structure very obvious, and density
can interfere with that.

This may be offset by a sufficiently clever design, e.g. using color
or structural differences to make the various levels "pop"
differently, so that the symbols can still be very dense without
interfering with the perception of the structure.

2. Density of symbols vs. of semantics

I believe that a NL system would be significantly denser - or better
put, more efficient - semantically. That is, if you take a look at any
body of text (like this email), you'll notice that it's got a *lot* of
fat in it that would simply be unnecessary and redundant (or
inapplicable) in a NL system - e.g. (self-)references, explicitly
writing out thought structure, etc. I am not sure how much of a
compensation this would be, though.

This is very hard to prove though, absent a working copy. My
understanding is that within all linear writing systems, and within
all spoken or signed systems respectively, there is very little
variation in information density. If anyone can supply good
cross-linguistic data for this, I'd be quite interested.

(The test is, how long - or how much space, if text - does it take to
communicate a given story? What I've seen shows that it's about the
same between systems, but I've not seen much and the differences would
be fairly important as clues to what can be optimized.)

3. Physical compactness of symbols

Symbols in non-NL systems (ha) are arbitrary; hence, they can be made
quite small because their only requirement is to be recognized, and
recognized as a whole at that (viz cogsci results on letter / word /
sentence level cognition and processing times).

One of my NL desiderata is to have symbols be grammatically accessible
(or symbolic, as it were) at a finer level. This would mean that a
finer level of detail needs to be easily perceptible, and therefore
comprehension of symbols may start to degrade at a physically larger
size than it would otherwise .... but perhaps not.
I think this again would be very heavily dependent on implementation,
and on the entropy level of the text in question. Non-NL text has a
pretty low entropy, which makes it far easier to skim. Which leads to:

4. Maximum entropy

What is the most entropy people can handle before having impaired
cognition, or for that matter, what is the most entropy you can obtain
in any particular mode of language?

Or put a more interesting way: Given that the total time to comprehend
a message is:
(total data / entropy) * (comprehension time per symbol at that entropy),
and that the latter is presumably a static function across all humans
(with some variation for cognitive entrainment)

... what amount of entropy minimizes the total read time?

(I'm using entropy here in the technical sense used in compression
algorithms and some linguistic analysis, i.e. the antonym of
redundancy.)

To use less technical terms:

Text that is very dense - that is, that carries a lot of information
in a small amount of space, rather than having a lot of bulk that you
could easily guess from context (e.g. an advanced neuroscience
textbook vs. a Bush speech) - takes more time to read. If you dilute
it, it takes less time to read any given amount of it, but you've also
taken in less information. So, how dense should the text be so that
you take in a given amount of information in the least *total* amount
of time?
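
The question above can be sketched numerically. This is only an
illustration: the function seconds_per_symbol, its quadratic shape, and
its constants are made-up assumptions, not measured data. The one thing
the sketch relies on is that reading time per symbol grows faster than
linearly as entropy rises, which is what makes an optimum exist at all.

```python
# Hypothetical model: nothing here is measured data.
def seconds_per_symbol(entropy_bits, base=0.05, penalty=0.01):
    """Assumed time to read one symbol carrying `entropy_bits` bits.

    A fixed cost per symbol plus a cost that grows quadratically with
    entropy. Shape and constants are assumptions for illustration only.
    """
    return base + penalty * entropy_bits ** 2


def total_read_time(total_bits, entropy_bits):
    """(total data / entropy) * (time per symbol at that entropy)."""
    symbols_needed = total_bits / entropy_bits
    return symbols_needed * seconds_per_symbol(entropy_bits)


def best_entropy(total_bits, candidates):
    """Return the candidate entropy with the smallest total read time."""
    return min(candidates, key=lambda h: total_read_time(total_bits, h))


# Try 0.1, 0.2, ..., 10.0 bits per symbol for a 10,000-bit message.
candidates = [h / 10 for h in range(1, 101)]
optimum = best_entropy(10_000, candidates)
```

Under this toy model, very dilute text loses because you need too many
symbols, and very dense text loses because each symbol is too slow to
read. A different assumed curve would move the optimum, or remove it.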

As I said earlier, one of my major desiderata for a NL system is that
it be maximally entropic; this would of course aid in making it
visually compact as well.

I think that the NL nature of it, and the addition of other visual
cues, may make measuring its entropy computationally relatively
difficult (because of the much different context cues); the flip side
of course is that I *think* you may be able to get people to handle
more entropy than they would in a non-NL text - that is, perhaps that
function is not so static after all.
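
For comparison, here is a minimal sketch of the crudest entropy
estimate for ordinary linear text: Shannon entropy over single-character
frequencies. The function name char_entropy is just an illustrative
choice. It ignores all context (neighboring letters, words, layout), so
it overstates the entropy of real text, and it has no way at all to
account for the visual and structural cues a NL system would add.

```python
import math
from collections import Counter


def char_entropy(text):
    """Zero-order Shannon entropy of `text`, in bits per character.

    Treats every character as independent of its neighbors, so this is
    an upper-bound style estimate, not the true entropy of a language.
    """
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


uniform = char_entropy("abcd")    # four equally likely symbols
repeated = char_entropy("aaaa")   # fully predictable
```

Four equally frequent symbols give 2 bits per character, and a string
of one repeated symbol gives 0, which matches the intuition that
predictable text carries less information per symbol.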

I wonder whether hitting the maximum (or rather, going above optimum)
entropy is something to be concerned about in this. I think probably
not, because if you do, you could always scale back down by fluffing
it up with some redundancy or more contextual cues (or just more blank
space to use for spacing and artistry).

Note also that the optimum may vary, depending on how much premium you
put on reader comprehension speed vs. encoding speed vs. space used.
This is exactly analogous to data compression.
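
A small sketch of the compression side of that analogy, using Python's
standard zlib module. The sample message is invented for illustration.
The level argument trades encoding effort against output size, while
the reader (the decompressor) recovers exactly the same message either
way.

```python
import zlib

# Deliberately redundant sample text (an invented example).
message = ("the quick brown fox jumps over the lazy dog. " * 200).encode("ascii")

fast = zlib.compress(message, 1)   # level 1: least encoding effort
small = zlib.compress(message, 9)  # level 9: most encoding effort

# Both settings are lossless: decoding gives back the original bytes.
roundtrip_fast = zlib.decompress(fast)
roundtrip_small = zlib.decompress(small)

sizes = {"original": len(message), "level 1": len(fast), "level 9": len(small)}
```

Higher levels usually, though not always, give smaller output at the
cost of more work for the encoder. Which setting is "best" depends on
what you are optimizing for, which is the point of the analogy.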

Related issues (familiar to programmers from database design) would
be the speed of read, insert, delete, move, specific find, and full
iteration operations. (Remember, we are talking about this speed IN
HUMANS, so the answers may well be different than computational ones.
Also remember, some of these may be computer assisted, so that
previous sentence might also be wrong. ;-))

There may be some other aspects to this that I haven't thought of here.

If you can think of some, or if you have any comment / reply /
resolution / data, or more pro/con discussion on these aspects, I'd be
delighted to read it.

If any of that was incomprehensible (or incorrect), please let me know
and I'll try to say it better.

 - Sai