Re: One major NLF2DWS flaw: density / compactness
From: Yahya Abdal-Aziz <yahya@...>
Date: Friday, July 7, 2006, 14:58
Hi Sai,
On Thu, 6 Jul 2006 Sai Emrys wrote:
[much to think about!]
> One (the only?) major flaw I've thought of so far in a NLF2DWS of the
> sort I want is that it would not be very dense, as in, it would have
> relatively few symbols per square inch compared to linear text.
>
> I'll break this down into a few issues:
>
> 1. Unused (blank) space
>
> Linear text is excellent (probably at or near maximum) at filling the
> physical space on the page with the maximum number of symbols
> (letters) and leaving very little space unused.
>
> A nonlinear graph-type system would of necessity (I think?) have some
> blank space; possibly significant amounts, depending on design,
> because you would need to make its structure very obvious, and density
> can interfere with that.
>
> This may be offset by a sufficiently clever design, e.g. using color
> or structural differences to make the various levels "pop"
> differently, so that they can still be very dense and not interfere
> with the perception of that.
>
> 2. Density of symbols vs of semantics.
>
> I believe that a NL system would be significantly denser - or better
> put, more efficient - semantically. That is, if you take a look at any
> body of text (like this email), you'll notice that it's got a *lot* of
> fat in it that would simply be unnecessary and redundant (or
> inapplicable) in a NL system - e.g. (self-)references, explicitly
> writing out thought structure, etc. I am not sure how much of a
> compensation this would be, though.
>
> This is very hard to prove though, absent a working copy. My
> understanding is that within all linear writing systems, and within
> all spoken or signed systems respectively, there is very little
> variation in information density. If anyone can supply good
> cross-linguistic data for this, I'd be quite interested.
>
> (The test is, how long - or how much space, if text - does it take to
> communicate a given story? What I've seen shows that it's about the
> same between systems, but I've not seen much and the differences would
> be fairly important as clues to what can be optimized.)
>
> 3. Physical compactness of symbols
>
> Symbols in non-NL systems (ha) are arbitrary; hence, they can be made
> quite small because their only requirement is to be recognized, and
> recognized as a whole at that (cf. cogsci results on letter / word /
> sentence level cognition and processing times).
[YA] "non-NL"?! Why not just say "linear"?
> One of my NL desiderata is to have symbols be grammatically accessible
> (or symbolic, as it were) at a finer level. This would mean that you
> may need a finer level of detail easily perceptible, and therefore
> comprehension of symbols may degrade at an earlier level (i.e.
> physically larger) than you would have otherwise... but perhaps not.
> I think this again would be very heavily dependent on implementation,
> and on the entropy level of the text in question. Non-NL text has a
> pretty low entropy, which makes it far easier to skim. Which leads
> to...
[YA] See my comments below on reading
small text.
> 4. Maximum entropy
>
> What is the most entropy people can handle before having impaired
> cognition, or for that matter, what is the most entropy you can obtain
> in any particular mode of language?
>
> Or, put a more interesting way: given that the total time to comprehend
> a message is:
> (total information / entropy per symbol) * (time per symbol at that entropy),
> and that the latter is presumably a static function across all humans
> (with some variation for cognitive entrainment)
>
> ... what amount of entropy minimizes the total read time?
>
> (I'm using entropy here in the technical sense used in compression
> algorithms and some linguistic analysis, i.e. the antonym of
> predictability.)
[YA] You've lost me here!
- Aren't "information" and "entropy" antonyms?
- Aren't "predictability" and "unpredictability"
antonyms?
I correlate "information" with "form",
"organisation" and "structure".
I correlate "entropy" with "arbitrariness",
"disorganisation" and "unstructuredness".
"unsurprising" is a synonym for "predictable".
"surprising" is an antonym for "predictable".
"information" is essentially surprising; after
all, no news is no *news*. "information" is,
by its nature, unpredictable and surprising.
I'm sure you're onto something, whatever
you label it; have you ever tried reading
through a massive government publication
for sense?
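Digging out my Shannon: in his technical sense, "entropy"
and "information" turn out to be the *same* quantity - the
average surprisal per symbol - so perhaps we're both right
and only the everyday connotations differ. A minimal
Python sketch of the measure (character frequencies only,
which understates how much context a real reader exploits):

    import math
    from collections import Counter

    def shannon_entropy(text):
        # Average surprisal, in bits per character, of the
        # character distribution of `text`. High entropy =
        # hard to predict; low entropy = lots of redundancy.
        counts = Counter(text)
        total = len(text)
        return -sum((n / total) * math.log2(n / total)
                    for n in counts.values())

    # Redundant prose scores lower than gibberish drawn
    # from a wider alphabet:
    print(shannon_entropy("the cat sat on the mat, the cat sat"))
    print(shannon_entropy("qzxjvwkfypghbmrd"))

On your read-time question, then: total time is roughly
(bits to convey / bits per symbol) * (seconds per symbol),
and the empirical unknown is how steeply that last factor
grows as the symbols get denser.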
> To use less technical terms:
>
> Text that is very dense - that is, that carries a lot of information
> in a small amount of space, rather than having a lot of bulk that you
> could easily guess from context (e.g. an advanced neuroscience
> textbook vs. a Bush speech) - takes more time to read. If you dilute
> it, it takes less time to read any given amount of it, but you've also
> taken in less information. So, how dense should the text be so that
> you take in a given amount of information in the least *total* amount
> of time?
[YA] I think the answer is - it depends very
much on the reader. Why else would Word
come with readability statistics, like the
Flesch-Kincaid readability score, or reading
age? I know, as someone who writes daily for
consumption by ordinary people, that I can't
expect to be understood well enough if I pitch
my writing at a level much beyond six or seven
years of schooling. There's a probability
distribution involved here, probably a normal
bell curve or something quite similar. So, as always
in communication, you will need to design a
workable compromise between efficiency and
efficacy. Some people will never get it; for
some others, you'll be labouring the obvious.
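For reference, the Flesch-Kincaid grade level is just a
linear formula over average sentence length and average
syllables per word. A rough Python sketch - the vowel-run
syllable counter is a crude stand-in for whatever Word
actually uses:

    import re

    def fk_grade(text):
        # Flesch-Kincaid grade level:
        # 0.39*(words/sentence) + 11.8*(syllables/word) - 15.59
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        # Crude heuristic: each run of vowels = one syllable.
        syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                        for w in words)
        return (0.39 * (len(words) / sentences)
                + 11.8 * (syllables / len(words)) - 15.59)

    print(fk_grade("The cat sat on the mat. It was happy."))

Six or seven years of schooling corresponds to a grade
score of about 6 or 7 on this scale.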
> As I said earlier, one of my major desiderata for a NL system is that
> it be maximally entropic; this would of course aid in making it
> visually compact as well.
>
> I think that the NL nature of it, and the addition of other visual
> cues, may make measuring its entropy computationally relatively
> difficult (because of the much different context cues); the flip side
> of course is that I *think* you may be able to get people to handle
> more entropy than they would in a non-NL text - that is, perhaps that
> function is not so static after all.
>
> I wonder whether hitting the maximum (or rather, going above optimum)
> entropy is something to be concerned about in this. I think probably
> not, because if you do, you could always scale back down by fluffing
> it up with some redundancy or more contextual cues (or just more blank
> space to use for spacing and artistry).
>
> Note also that the optimum may vary, depending how much premium you
> put on reader comprehension speed vs. encoding speed vs. space used.
> This is exactly analogous to data compression.
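[YA] You can even run the compression analogy as an
experiment: Python's standard zlib module trades encoding
effort against output size through its compression level,
much as a writer might trade composing time against bulk.
A toy demonstration (the sample text is made-up filler):

    import time
    import zlib

    # Highly redundant, low-entropy sample text.
    data = b"the cat sat on the mat and the dog sat on the log. " * 20000

    for level in (1, 6, 9):   # fastest ... default ... smallest
        t0 = time.perf_counter()
        out = zlib.compress(data, level)
        dt = time.perf_counter() - t0
        print("level %d: %d -> %d bytes in %.1f ms"
              % (level, len(data), len(out), dt * 1000))

Higher levels spend more encoder time to squeeze out more
redundancy; which level is "optimal" depends on which
resource you are short of, exactly as you say.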
>
> Related issues (familiar to programmers from database design) would
> be the speed of read, insert, delete, move, specific find, and full
> iteration operations. (Remember, we are talking about this speed IN
> HUMANS, so the answers may well differ from the computational ones.
> Also remember, some of these may be computer assisted, so that
> previous sentence might also be wrong. ;-))
>
>
> There may be some other aspects to this that I haven't thought of here.
>
> If you can think of some, or if you have any comment / reply /
> resolution / data, or more pro/con discussion on these aspects, I'd be
> delighted to read it.
>
> If any of that was incomprehensible (or incorrect), please let me know
> and I'll try to say it better.
Sai, a few more comments for you to ponder.
I. Wrapping, linking and decomposing:
-------------------------------------------
One of my major beefs with many printed works,
including web pages, is that the text is too small
to read - it has detail at a finer level than I can
resolve visually. I can still cope with 11 pt text in
most common fonts, though in some, I need 12 or
even 14 pt; as a rule, 10 pt is barely legible and
8 pt is nearly illegible.
An unanswered question I have on your ideas for
a fully 2D writing system is this: is it bounded?
By this I mean, the length of a linear representation
of an utterance is unlimited, but we solve
the potential problem that causes by "wrapping"
the line on the page. In this way, we can fit a
sentence of arbitrary length on a page, or even
a number of pages, if necessary. But the size of
a 2D representation of an "utterance" in some
formal languages (eg a system flowchart or
structure diagram) is so great that it won't fit
on a page. We do have several strategies for
breaking such an utterance into a number of
smaller, bounded pieces, each of which does fit
on a single page. For example, we use the
off-page connector in a flowchart, and we use a
structured hierarchical decomposition in a
program structure or dataflow diagram. So I
hope you see where I'm headed? I want to be
able to -
1. read what is written in your F2DWS, without
calling on someone else or using a magnifying
glass;
2. apprehend your main thought readily;
3. explore the details of that thought;
4. *ignore* the details of that thought.
The last two points should give the WS quite a
bit of power and utility. But it will be essential
to develop conventions for wrapping, linking and
decomposing any utterance to any desired degree
of detail.
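For the programmers among us, one crude way to picture
such conventions: let every piece of an utterance carry a
gist that can stand alone, plus optional finer structure,
so a reader (or a rendering tool) can stop unfolding at
any depth. A toy Python sketch, all names purely
illustrative:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        # One piece of a 2D utterance: a gist that stands
        # alone, plus optional finer detail (hypothetical).
        gist: str
        details: list = field(default_factory=list)

    def render(node, depth=0, max_depth=1):
        # Unfold the utterance down to max_depth only;
        # depth 0 shows the main thought by itself.
        print("  " * depth + node.gist)
        if depth < max_depth:
            for child in node.details:
                render(child, depth + 1, max_depth)

    essay = Node("density trades off against visible structure",
                 [Node("linear text packs the page tightly"),
                  Node("2D graphs need blank space",
                       [Node("unless colour makes the levels pop")])])

    render(essay, max_depth=0)   # 2. apprehend the main thought
    render(essay, max_depth=2)   # 3./4. explore, or ignore, detail

Points 3 and 4 then fall out of the same convention:
exploring and ignoring detail are just different depth
limits.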
------------
II. Scan direction, or "What next?"
-----------------------------------------
Here's another point that you may or may not
have considered, and whose solution will depend
on your choices for wrapping, linking and
decomposing just mentioned. Pardon me if you know
all this stuff already!
People have different strategies for scanning
text when reading. I'm talking both physiology
and habits here. When reading a full-size web
page with long lines of text, most of us (more
or less) read each line from beginning to end,
then flip back to start the next line - exactly
like a TV raster pattern, or a set of Zs stacked
on top of each other. When reading lines that
are short enough, for example in newspaper
columns, most of us read down the *middle* of
each column, some stopping once per line to
"fixate" it, some only every few lines or even
only a few times per column. The process of
reading is, visually, very stop-start; it's not the
smooth flow we usually think of it as. Efficient
readers,"speed readers", are those who fixate
the *fewest* possible number of times per page.
Surprisingly, comprehension may not suffer, but
actually increase, when one learns to speed-read.
You don't actually need to focus on each word
to understand it; in a given context, even
non-foveal vision is enough to let you confirm that
the "right" (ie expected) words occur where they
ought. The "context" is a whole set of patterns
in which words occur, that are appropriate to the
way the kind of subject matter is usually treated
in a certain kind of publication. (Shades of using
"patterns" rather than "rules" in grammar!) This
is one reason that reading is more accurately
considered a mental task than a visual task.
Ok, enough background! However you organise
your F2DWS, you should consider (as one aspect
of its usability) that people need clear
expectations of how to scan the writing efficiently. It
wouldn't be much help, for instance, if the writing
could go off an arbitrary distance in an arbitrary
direction while still saying something central to
the utterance.
---------------
III. Trivia:
-------------
1. An F2DWS is necessarily an NLWS.
2. Unless you plan to give up writing on paper or
computer screens, an NLWS is necessarily an F2DWS.
3. Ergo, the abbreviation NLF2DWS is a bit
redundant.
4. I think NLWS is enough to convey the essential
idea.
Regards,
Yahya