Mon Aug 31 16:36:23 CEST 2009

Stackless Extensional Joy

I'm involved in a discussion with Dominikus about writing a
specification of Joy in terms of rewrite rules.  It is his opinion
(and I believe also Manfred's) that it is necessary to explicitly use
a data stack in the description.


    * The discussion was resolved by making a clear distinction
      between language semantics (map from syntax to some
      representation domain, i.e. unary functions) and purely
      syntactic (meaninless) manipulations.  The original MvT paper[2]
      doesn't make this distinction either (it talks about rewriting
      to specify _semantics_ not to specify legal syntactic operations
      in a formal system without involving semantics.)

      In the rules below, a purely syntactic rewriting system only
      gives you the ability to state that two concatenative forms are
      equivalent.  Then later you can attach a meaning to that (they
      represent the same program).

    * Proper treatment of quotations is important as a language design
      issue (early / late binding), but when you're specifying an
      operational semantics, of course any one would work.  I.e. in
      the discussion it was irrelevant and this put me on a side

    * In my understanding, the discussion needed a clear definition of
      value (irreducable expression) and redex (reducable expression).
      Using a stack to specify this makes things simpler, but is not
      necessary.  However, in the specification I produce, a stack can
      be immediately seen to emerge.  Conclusion: if (purely
      syntactic) reduction rules are of the form

              a .. m A -> n .. z

      With `a' .. `z' values and `A' a non-value, the values on the
      left of `A' can be _interpreted_ as a stack, as can the values
      on the right of '->'.

   In that light, here's what I wrote (the first section about
   quotations is irrelevant).

                             * * *

It is mine that this is not necessary.  What _is_ necessary is a clear
treatment of quotations, respecting the non-isomorphic map from syntax
to meaning.  Thusfar this has been a series of hunches.  I'd like to
see why I really believe this (me being wrong is OK too..)

The problem appears to lie with the fact that the homomorphism S which
attributes meaning (a function) to syntax (a concatenation of
primitive symbols) is not one-to-one.  I.e.:

       S( [ 1 dup ] )  =  S( [ 1 1 ] )

Is this related to ``never reduce under lambda'' ?

I.e. there is only one _expansion_ operator: `i'.  All the others are
combinators that _re-arrange_ and _contract_ things.  It is not
allowed to perform _any_ substitutions _inside_ quotations.  The only
place where this is legal is in the current expansion of the program:
a flat list of tokens and quotations.

Does this help to eliminate the data stack?

I think I'm on the wrong track: to eliminate the data stack it
probably doesn't matter much whether code is intensional or
extensional.  What matters is that it needs to be really well-defined
what a _value_ is.

I.e. what does `swap' do actually do?  Can you explain that _without_
referring to the stack?  Syntactically, the quotation "a b swap" can
be reduced to "b a" but _only_ if both `a' and `b' are values.

Isn't this really just LR(1) parsing?

In [2] manfred alludes to all this..  However, going on about
stackless semantics, he says: ``This is the key for a semantics
without a stack: Joy programs denote unary functions taking one
program as arguments and giving one program as value. ''

This seems to be the ``staged'' interpretation.  A program denotes a
function, and evaluation is finding the simplest syntactic form of
this function.

                             * * *

1. values

It seems that what really matters is to distinguish values from
non-values, not necessarily using a stack.  Applicability of rewrite
rules representing `stack words' depend on leading symbols being

        a b swap == b a

only if `a' and `b' are values.

If reduction order doesn't matter (confluent rewrite rules), then
picking an order that leads to a simple algorithm seems like a
reasonable default approach.

However, my position is still that the stack is purely an artifact of
the way you sequence the reductions.  In MvT's words[2]:

  ``It is clear that such a semantics without a stack is possible and
  that it is merely a rephrasing of the semantics with a stack.
  Purists would probably prefer a system with such a lean ontology in
  which there are essentially just programs operating on other

So I guess I'm wearing the purist hat for a change..

The reason I do is that there might be a benefit in generalizing this
to non-confluent systems that will give you a set of possible

2. left to right order

One way to do the reuctions is from left to right.  Values are skipped
while redexes recombine with the values.  This will then effectively
yield a stack of values as a result.

I.e. in the following, everything on the left of '|' is fully reduced
(the stack) and the rest are possible redexes (the code).

| 2 5 + 3 *
2 | 5 + 3 *
2 5 | + 3 *
    7 | 3 *
    7 3 | *
       21 |

This left-to-right approach guarantees that a rule that needs values
on it's left side will allways be applicable.  The general rewrite
system that starts at any place in the code can't do this: some
symbols that need values on the left side won't have them (yet).

3. other orders?

So, what is the benefit of general rewrite rules, without specifying
an explicit sequencing order?

It seems that Forth's parsing words (i.e. `variable') use the _other_
direction: here symbols can modify the semantics of symbols to the
right.  Also non-confluent rewriting systems that describe
optimizations and not language semantics seem to need a non-local

Another useful point is to use global pattern matching to find
isolated words: if the reduction order doesn't matter, then (in a pure
functional setting) these segments are value-isolated and can be run
in parallel.

                             * * *

So, words are not values, but values are words.


   A value is word that does not produce a reducable expression if it
   is right-appended to a sequence of values.

It's a bit anti-climactic maybe, but I think that's the essence..

Dominicus calls this ``constructor functions'', but it requires a
definition that depends on a data stack: a domain that is used to
chain functions of the semantic space together.

[1] http://www.latrobe.edu.au/philosophy/phimvt/joy/j00ovr.html
[2] http://www.latrobe.edu.au/philosophy/phimvt/joy/j07rrs.html