Re: affixes
From: | Mark J. Reed <markjreed@...> |
Date: | Saturday, February 19, 2005, 21:01 |
On Sat, Feb 19, 2005 at 02:18:38PM -0500, # 1 wrote:
> I've tried as hard as I could but I'm totally unable to understand how does
> works a stack-based syntax
Well, it's probably easiest to start with stack-based arithmetic; then
the extension to language may make more sense.
Consider an arithmetic expression like this:
1 + 2 × 3 - 4 ÷ 5
Anyone past middle school has learned that there is a standard order of
operations in such expressions. In English, we learn the phrase "My
Dear Aunt Sally" to remember that Multiplication and Division come
before Addition and Subtraction. To change the order, or just to make
it explicit, you can use parentheses (...), which say "do what's
between us first". The above is the same as this:
1 + (2 × 3) - (4 ÷ 5)
So to find out what the answer is, you first multiply 2 by 3, giving 6,
and divide 4 by 5, giving 4/5 or 0.8, and then replace the parenthesized
subexpressions with the results:
1 + 6 - 0.8
Which is then easily added to yield the answer of 6.2.
So "1 + 2 × 3 - 4 ÷ 5" is "Mathematish" for something like "multiply two
by three; then divide four by five; then add one to the first result and
subtract the second result from the sum." As you can see, stating it
in English is more awkward than using the symbolic notation.
Expressions like "2 × 3" are said to be written in "infix" notation.
That's because when you are meant to perform an operation on two numbers
- called a "binary operation" - the symbol that tells you *which*
operation to perform (+, -, ×, ÷) goes *between* the two numbers you are
meant to perform the operation on. So "multiply two by three" is
written "2 × 3". This is the normal way of writing things down in
mathematics, but there are other options: prefix notation ("× 2 3") and
postfix notation ("2 3 ×"). Both prefix and postfix notation have the
advantage that they're never ambiguous, and you never need parentheses.
The long expression I started with becomes this in prefix notation:
- + 1 × 2 3 ÷ 4 5
Read it from left to right. First, we find out that we are going to be
subtracting one thing from another. Then we find out that we're going
to get the first number by adding some other numbers up. The first
number we add is 1. The second number in the addition is going to be the
result of a multiplication, and the two numbers to multiply are given
directly as 2 and 3. So now we can do the multiplication, yielding 6,
and the addition, yielding 7. Have to continue to find out what we're
subtracting from 7, though. The ÷ tells us it will be the result of a
division operation, and the two arguments are again given directly, so
we divide 4 by 5, yielding 0.8, and subtract that from 7, giving 6.2.
In postfix notation, it looks like this:
1 2 3 × + 4 5 ÷ -
And right away you may see a big difference. In evaluating a prefix
expression, we have to remember what operations we're in the middle of
until we finally get numbers to plug in. In evaluating a postfix
expression we never have to remember an operation, but we do have to
remember numbers, which we encounter before - sometimes long before - we
know what we're going to be doing with them.
There is a data structure in computer science called a "stack" that is
perfect for this sort of evaluation. It is a place to store data that
you can only do two things with: give it a piece of data (a number,
in this case), or ask it for the last piece of data you gave it. You
can only get the data back out in the opposite of the order in which
you put it in. So if you give the stack the number 1, then the number
2, and then the number 3, and then ask it for a number back, it will
give you 3 first, then 2 the next time you ask, and finally 1.
The name comes from the spring-loaded push-down stacks you see for
dishes and trays in large cafeterias - the next dish or tray you pull
off the stack is always the last one you put there, because you can only
get to the top one at any given time; the others are all down in the
hole.
So to process a postfix expression with a stack is easy. Whenever you
see a number, you put it in the stack (this is called "pushing", again
by analogy with the dish stack). Whenever you see an operator
(+, -, × or ÷), you pull (or "pop") the last two numbers out of the stack,
perform the operation on them, and push the result back on the stack.
Let's try that on the expression above:
1 2 3 × + 4 5 ÷ -
The stack (I'll call it S, and the "top" will be on the left) starts out
empty:
S:
First we see a 1, which is a number, so we push it on the stack:
S: 1
Then we do the same with the 2:
S: 2 1
And the 3:
S: 3 2 1
Then we see a ×, so we pop off two numbers and multiply them together.
The first pop gives us 3, the second 2; we multiply them together and
push the result back on the stack:
S: 6 1
Then we see a +, so we pop off two numbers and add them together,
pushying the result back on the stack:
S: 7
Then we see numbers and push them:
S: 5 4 7
Then we see ÷, which tells us to pop two numbers off and divide them.
Note that the first number we get back from the stack is the divisor,
not the dividend - we want 4 ÷ 5, not 5 ÷ 4.
S: 0.8 7
Finally, we see -, which tells us to pop two numbers off and find their
difference. Again, note that the first number we get back from the
stack is the subtrahend, not the minuend - we want 7 - 0.8, not 0.8 - 7.
S: 6.2
Which is our final answer.
Fith is designed like postfix notation. Nouns get pushed on the stack,
and operands like postpositions and verbs and adjectives operate on
them.
I don't know Fith, so I'll create an example with English words and just
put them together postfix style. Prefix notation is sometimes called
"Polish notation" because a Polish mathematican first proposed it, and
postfix notation is therefore called "Reverse Polish Notation", or "RPN"
for short. So let's call this syntax variation "RPNglish."
Consider a sentence from part 4 of the McGuffey Reader:
The fat hen is on the box.
In RPNglish, then, you might put it this way:
Hen fat the box the on is.
Let's go through it with the stack:
The word "hen" is a noun, so we push it:
S: hen
The word "fat" is a modifier, so we pull the noun, apply the modifier,
and push the result (a phrase) back on the stack. (The quotes indicate
that even though there are two words, it's just one item on the stack.)
S: "fat hen"
The word "the" is also a modifier, so we do the same thing:
S: "the fat hen"
The word "box" is a noun, so we push it:
S: "box" "the fat hen"
Then we hit another "the"
S: "the box" "the fat hen"
The word "on" is a preposition, although in RPNglish it's really a
postposition; it turns the noun on top of the stack into an adverbial
describing a location:
S: "on the box" "the fat hen"
Finally, we have the verb "is", which takes two arguments off the stack
and creates a single concept which is the linking of the two arguments;
S: "the fat hen is on the box"
Now, one vital quality of such a language is that you have to be able to
tell immediately, upon encountering a word, whether it is an "operand"
(to be pushed on the stack) or an "operator" (to modify what's on the
stack). This was clear in this simple example, but it isn't true of
English in general, because most words have multiple definitions
spanning multiple parts of speech. So a postfix language needs either
to have morphological indications of function, or to disallow
cross-functional homonymy.
-Marcos
Replies