
Re: One major NLF2DWS flaw: density / compactness

From:Yahya Abdal-Aziz <yahya@...>
Date:Friday, July 7, 2006, 14:58
Hi Sai,

On Thu, 6 Jul 2006 Sai Emrys wrote:

[much to think about!]


> One (the only?) major flaw I've thought of so far in a NLF2DWS of the
> sort I want is that it would not be very dense, as in, it would have
> relatively few symbols per square inch compared to linear text.
>
> I'll break this down into a few issues:
>
> 1. Unused (blank) space
>
> Linear text is excellent (probably at or near maximum) at filling the
> physical space on the page with the maximum number of symbols
> (letters) and leaving very little space unused.
>
> A nonlinear graph-type system would of necessity (I think?) have some
> blank space; possibly significant amounts, depending on design,
> because you would need to make its structure very obvious, and density
> can interfere with that.
>
> This may be offset by a sufficiently clever design, e.g. using color
> or structural differences to make the various levels "pop"
> differently, so that they can still be very dense and not interfere
> with the perception of that.
>
> 2. Density of symbols vs of semantics
>
> I believe that a NL system would be significantly denser - or better
> put, more efficient - semantically. That is, if you take a look at any
> body of text (like this email), you'll notice that it's got a *lot* of
> fat in it that would simply be unnecessary and redundant (or
> inapplicable) in a NL system - e.g. (self-)references, explicitly
> writing out thought structure, etc. I am not sure how much of a
> compensation this would be, though.
>
> This is very hard to prove though, absent a working copy. My
> understanding is that within all linear writing systems, and within
> all spoken or signed systems respectively, there is very little
> variation in information density. If anyone can supply good
> cross-linguistic data for this, I'd be quite interested.
>
> (The test is, how long - or how much space, if text - does it take to
> communicate a given story? What I've seen shows that it's about the
> same between systems, but I've not seen much and the differences would
> be fairly important as clues to what can be optimized.)
>
> 3. Physical compactness of symbols
>
> Symbols in non-NL systems (ha) are arbitrary; hence, they can be made
> quite small because their only requirement is to be recognized, and
> recognized as a whole at that (viz cogsci results on letter / word /
> sentence level cognition and processing times).
[YA] "non-NL"?! Why not just say "linear"?
> One of my NL desiderata is to have symbols be grammatically accessible
> (or symbolic, as it were) at a finer level. This would mean that you
> may need a finer level of detail easily perceptible, and therefore
> comprehension of symbols may degrade at an earlier level (i.e.
> physically larger) than you would have otherwise .... but perhaps not.
> I think this again would be very heavily dependent on implementation,
> and on the entropy level of the text in question. Non-NL text has a
> pretty low entropy, which makes it far easier to skim. Which leads
> to...
[YA] See my comments below on reading small text.
> 4. Maximum entropy
>
> What is the most entropy people can handle before having impaired
> cognition, or for that matter, what is the most entropy you can obtain
> in any particular mode of language?
>
> Or, put a more interesting way: Given that the total time to comprehend
> a message is:
>
>     (total data / entropy) * (comprehension rate at that entropy),
>
> and that the latter is presumably a static function across all humans
> (with some variation for cognitive entrainment)...
>
> ... what amount of entropy minimizes the total read time?
>
> (I'm using entropy here in the technical sense used in compression
> algorithms and some linguistic analysis, i.e. the antonym of
> predictability.)
[YA] You've lost me here!
- Aren't "information" and "entropy" antonyms?
- Aren't "predictability" and "unpredictability" antonyms?

I correlate "information" with "form", "organisation" and "structure". I correlate "entropy" with "arbitrariness", "disorganisation" and "unstructuredness". "Unsurprising" is a synonym for "predictable"; "surprising" is an antonym for "predictable". "Information" is essentially surprising; after all, no news is no *news*. "Information" is, by its nature, unpredictable and surprising. I'm sure you're onto something, whatever you label it; have you ever tried reading through a massive government publication for sense?
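[For reference, the compression-theoretic sense Sai invokes can be made concrete: Shannon entropy measures the average unpredictability of the next symbol, in bits. A minimal sketch over single-character frequencies (a real compressor conditions on context, which this deliberately ignores):]

```python
from collections import Counter
from math import log2

def shannon_entropy(text: str) -> float:
    """Bits per character, estimated from single-character frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(shannon_entropy("aaaa"))  # 0.0 -- fully predictable, zero surprise
print(shannon_entropy("abab"))  # 1.0 -- one bit of surprise per character
```

[So "high entropy" here means "hard to guess the next symbol", which is Sai's sense: maximally entropic text has no redundancy left to squeeze out.]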
> To use less technical terms:
>
> Text that is very dense - that is, that carries a lot of information
> in a small amount of space, rather than having a lot of bulk that you
> could easily guess from context (e.g. an advanced neuroscience
> textbook vs. a Bush speech) - takes more time to read. If you dilute
> it, it takes less time to read any given amount of it, but you've also
> taken in less information. So, how dense should the text be so that
> you take in a given amount of information in the least *total* amount
> of time?
[YA] I think the answer is - it depends very much on the reader. Why else would Word come with readability statistics, like the Flesch-Kincaid readability score, or reading age? I know, as someone who writes daily for consumption by ordinary people, that I can't expect to be understood well enough if I pitch my writing at a level much beyond six or seven years of schooling. There's a probability distribution involved here, probably a normal bell curve or something quite similar. So, as always in communication, you will need to design a workable compromise between efficiency and efficacy. Some people will never get it; for some others, you'll be labouring the obvious.
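[The Flesch-Kincaid grade level Yahya mentions has a published formula: 0.39 * (words/sentences) + 11.8 * (syllables/word) - 15.59. A rough sketch; the vowel-group syllable counter is a crude heuristic of my own, not a real syllabifier:]

```python
import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count runs of vowels (assumption, not a true syllabifier)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level per the published formula."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / sentences)
            + 11.8 * (syllables / len(words))
            - 15.59)

print(fk_grade("The cat sat on the mat."))  # well below grade 1: very easy text
```

[A score around 6-7, the schooling level Yahya targets, is what most plain-language guidelines also recommend.]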
> As I said earlier, one of my major desiderata for a NL system is that
> it be maximally entropic; this would of course aid in making it
> visually compact as well.
>
> I think that the NL nature of it, and the addition of other visual
> cues, may make measuring its entropy computationally relatively
> difficult (because of the much different context cues); the flip side
> of course is that I *think* you may be able to get people to handle
> more entropy than they would in a non-NL text - that is, perhaps that
> function is not so static after all.
>
> I wonder whether hitting the maximum (or rather, going above optimum)
> entropy is something to be concerned about in this. I think probably
> not, because if you do, you could always scale back down by fluffing
> it up with some redundancy or more contextual cues (or just more blank
> space to use for spacing and artistry).
>
> Note also that the optimum may vary, depending how much premium you
> put on reader comprehension speed vs. encoding speed vs. space used.
> This is exactly analogous to data compression.
>
> Related issues (familiar to programmers from database design) would
> be the speed of read, insert, delete, move, specific find, and full
> iteration operations. (Remember, we are talking about this speed IN
> HUMANS, so the answers may well be different than computational ones.
> Also remember, some of these may be computer assisted, so that
> previous sentence might also be wrong. ;-))
>
> There may be some other aspects to this that I haven't thought of here.
>
> If you can think of some, or if you have any comment / reply /
> resolution / data, or more pro/con discussion on these aspects, I'd be
> delighted to read it.
>
> If any of that was incomprehensible (or incorrect), please let me know
> and I'll try to say it better.
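[Sai's expression for total comprehension time - (total data / entropy) * (comprehension rate at that entropy) - can at least be played with numerically. The rate function below is a pure invention for illustration (nobody has measured it); the point is only that once per-symbol cost grows faster than linearly with entropy, an interior optimum appears rather than "denser is always better":]

```python
def total_read_time(total_bits: float, entropy: float) -> float:
    """Sai's model: (total data / entropy) * (comprehension rate at entropy).

    `rate` is a hypothetical seconds-per-symbol cost that rises with
    entropy; the quadratic growth is an assumption, not data.
    """
    rate = 0.1 * (1 + entropy ** 2)   # hypothetical cost function
    symbols = total_bits / entropy    # denser text -> fewer symbols to read
    return symbols * rate

# Scan candidate entropy levels (bits per symbol) for the minimum total time.
candidates = [h / 10 for h in range(1, 81)]
best = min(candidates, key=lambda h: total_read_time(1000, h))
print(best)  # the optimum sits where the two pressures balance
```

[Under this toy cost function the optimum lands in the middle of the range, which matches Sai's intuition that over-dense text can be "scaled back down" with redundancy.]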
Sai, a few more comments for you to ponder.

I. Wrapping, linking and decomposing:
-------------------------------------------
One of my major beefs with many printed works, including web pages, is that the text is too small to read - it has detail at a finer level than I can resolve visually. I can still cope with 11 pt text in most common fonts, though in some, I need 12 or even 14 pt; as a rule, 10 pt is barely legible and 8 pt is nearly illegible.

An unanswered question I have on your ideas for a fully 2D writing system is this: is it bounded? By this I mean, the length of a linear representation of an utterance is unlimited, but we solve the potential problem that causes by "wrapping" the line on the page. In this way, we can fit a sentence of arbitrary length on a page, or even a number of pages, if necessary. But the size of a 2D representation of an "utterance" in some formal languages (eg a system flowchart or structure diagram) is so great that it won't fit on a page. We do have several strategies for breaking such an utterance into a number of smaller, bounded pieces, each of which does fit on a single page. For example, we use the off-page connector in a flowchart, and we use a structured hierarchical decomposition in a program structure or dataflow diagram.

So I hope you see where I'm headed? I want to be able to -
1. read what is written in your F2DWS, without calling on someone else or using a magnifying glass;
2. apprehend your main thought readily;
3. explore the details of that thought;
4. *ignore* the details of that thought.

The last two points should give the WS quite a bit of power and utility. But it will be essential to develop conventions for wrapping, linking and decomposing any utterance to any desired degree of detail.

------------
II. Scan direction, or "What next?"
-----------------------------------------
Here's another point that you may or may not have considered, and whose solution will depend on your choices for wrapping, linking and decomposing just mentioned. Pardon me if you know all this stuff already!

People have different strategies for scanning text when reading. I'm talking both physiology and habits here. When reading a full-size web page with long lines of text, most of us (more or less) read each line from beginning to end, then flip back to start the next line - exactly like a TV raster pattern, or a set of Zs stacked on top of each other. When reading lines that are short enough, for example in newspaper columns, most of us read down the *middle* of each column, some stopping once per line to "fixate" it, some only every few lines or even only a few times per column.

The process of reading is, visually, very stop-start; it's not the smooth flow we usually think of it as. Efficient readers, "speed readers", are those who fixate the *fewest* possible number of times per page. Surprisingly, comprehension may not suffer, but actually increase, when one learns to speed-read. You don't actually need to focus on each word to understand it; in a given context, even non-foveal vision is enough to let you confirm that the "right" (ie expected) words occur where they ought. The "context" is a whole set of patterns in which words occur, that are appropriate to the way the kind of subject matter is usually treated in a certain kind of publication. (Shades of using "patterns" rather than "rules" in grammar!) This is one reason that reading is more accurately considered a mental task than a visual task.

Ok, enough background! However you organise your F2DWS, you should consider (as one aspect of its usability) that people need clear expectations of how to scan the writing efficiently.
It wouldn't be much help, for instance, if the writing could go off an arbitrary distance in an arbitrary direction while still saying something central to the utterance.

---------------
III. Trivia:
-------------
1. A F2DWS is necessarily a NLWS.
2. Unless you plan to give up writing on paper or computer screens, a NLWS is necessarily a F2DWS.
3. Ergo, the phrase abbreviated NLF2DWS is a bit redundant.
4. I think NLWS is enough to convey the essential idea.

Regards,
Yahya
