Mon May 26 14:20:11 CEST 2008

Unrolling Forth

The most difficult task in Staapl has been finding a bridge between
the layered, non-reflective, cross-compiled structure of Coma (lexer,
parser, compiler, interpreter + Scheme code written using PLT's
hierarchical module system) and the more traditional self-hosted
reflective Forth approach.

This ``unrolling'' makes explicit metaprogramming simpler, since it
creates a static code structure instead of one that is updated on the
fly.  Standard Forth is essentially imperative, and is read from top
to bottom[1].

It is not possible to ``unroll'' the reflectivity in ANS Forth
completely [2].  In order to write an ANS standard Forth, some of this
reflectivity needs to be restored.  The problem is mostly parsing
words.  Traditionally, immediate words (macros) have access to the
input stream, which is not possible in the current Staapl approach
since lexing is a separate step.  After lexing, only preprocessor
macros (prefix parsers) have access to the word stream, while ordinary
macros can only generate (and transform) machine code.  As a
consequence, Staapl's Forth cannot be standard[3].
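
To make the difference concrete, here is a minimal sketch in Python
(not Staapl code).  All names in it (reflective_compile, lex,
prefix_parse, expand, the "allot" and "call" instructions) are
hypothetical and only illustrate the idea: in the reflective version
an immediate word pulls its argument out of the input stream itself,
while in the staged version lexing is finished first, only a prefix
parser may look ahead in the word stream, and ordinary macros just
emit code.

```python
# Hypothetical illustration; none of these names come from Staapl.

def reflective_compile(source):
    """Reflective style: a parsing word reads its argument from the input."""
    stream = iter(source.split())
    code = []

    def parse_name():
        # Reaches into the same stream the compiler loop is consuming.
        return next(stream)

    immediate = {
        "variable": lambda: code.append(("allot", parse_name())),
    }
    for word in stream:
        if word in immediate:
            immediate[word]()            # runs at compile time, eats input
        else:
            code.append(("call", word))
    return code


def lex(source):
    """Stage 1: whitespace-separated tokenizing, done before anything else."""
    return source.split()


def prefix_parse(tokens, prefix_parsers):
    """Stage 2: only prefix parsers may look ahead in the word stream."""
    forms, i = [], 0
    while i < len(tokens):
        word = tokens[i]
        arity = prefix_parsers.get(word, 0)   # number of words to grab
        forms.append((word, *tokens[i + 1 : i + 1 + arity]))
        i += 1 + arity
    return forms


def expand(forms, macros):
    """Stage 3: ordinary macros see one finished form and only emit code."""
    code = []
    for head, *args in forms:
        if head in macros:
            code.extend(macros[head](*args))
        else:
            code.append(("call", head))
    return code


def staged_compile(source):
    forms = prefix_parse(lex(source), {"variable": 1})
    return expand(forms, {"variable": lambda name: [("allot", name)]})
```

Both functions produce the same code for a line such as
"variable x x fetch", but only the staged one keeps a fixed set of
layers that can be inspected and transformed separately.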

This makes Staapl more complicated.  Is this complication justified?

Yes. The simplicity of the purely functional Scat model and its name
hygiene are more important than any inconveniences that are a
consequence of Coma being different[4] from standard reflective Forth.
Everything is built on top of Coma/Scat in layers to make
metaprogramming easier, and to integrate better with PLT's bias
towards a static, layered non-reflective approach.

[1] The irony is that in trying to perform this unrolling step, I have
    come to appreciate more the elegance of the standard reflective
    Forth approach.  Forth's compactness comes exactly from leaving the
    intermediate representations out, and maximally exploiting
    reflectivity.  There seems to be much debate even in the Scheme
    community about the trade-offs between interactive ``toplevel''
    program development and the declarative module approach as used in
    PLT Scheme. (The DrScheme "run" button, or "require" vs "load").
    This debate is also quite central to the design of Staapl.

[2] It is always possible to unroll any reflective sequence by
    providing one language layer per statement.  However, that doesn't
    really solve much, so the implied condition is to keep the number
    of layers small.

[3] There is, however, a way around this by using only the Staapl
    machine model, which boils down to writing a standard ANS Forth
    compiler that compiles to Coma code.  This could be extended with
    a run-time simulator to make this cross-compiled Forth behave more
    like a standalone one.

[4] I still find the concept of parsing words useful as a _user
    interface_, so Staapl contains a front end for writing code that
    looks like traditional Forth: a ``prefix parser'' layer that
    emulates Forth's parsing words, and a lexer that provides the
    basic whitespace-separated Forth tokenizing.