Sun Aug 16 12:09:57 CEST 2009
New PF interpreter
The interpreter evaluates the following recursive data structure.
( Note that this represents fully resolved byte code: it doesn't contain
any variable references. )
SEQ = (SUB : SUB) ;; `:' is graph CONS
SUB = PRIM | QUOTE | SEQ
K = HALT | (SUB . K) ;; `.' is linear CONS
SEQ Code sequencing provides the encoding of what to do _now_ and what
to do _next_
SUB A subroutine is a basic unit of work: invoke a primitive function,
load a data object onto the parameter stack, or perform a
K The continuation is a stack constructed at run-time, representing
the rest of the evaluation, finishing with HALT.
The interpreter uses this data structure as follows (in pseudo code,
an ML implementation can be found here).
sub = POP(K)
SEQ (now : next)
-> K = PUSH(now, PUSH(next, K))
-> execute PRIM
-> load data item on parameter stack
SUB is the main code type (the subroutine). It is what you pass to
`run', and it is what can be found as resume points in the
continuation. It includes primitive code and data, as well as
composite code sequences (SEQ).
When a SEQ is encoded, the two SUBs will be pushed onto K in the
correct order. Note that this encoding automatically gives proper
tail call handling: loops can be implemented using recursion
without blowing up the K stack.
* * *
The above is the representation of compiled code.
Concatenative (Joy-like) code can be compiled straightforwardly to
SEQ-based code. This can be illustrated using the standard lisp
CDR-linked pair notation to represent both source code CONS pairs and
SUB pairs. The the symbolic program
(a b c d) = CONS(a, CONS(b, CONS(c, CONS(d, NIL))))
is compiled to an improper list
(A B C . D) = SEQ(A, SEQ(B, SEQ(C, D)))
where each symbol x is mapped to a corresponding X representing a SUB.
Resolution happens using a global vocabulary that maps symbols to
* * *
The code representation as a binary graph has several advantages
related to the similarity between binary graphs and binary trees (PF's
linear pair data structure for data storage and representing
the machine continuation).
- SEQ as a pair of SUBs gives a very direct encoding of "now" vs
"next". This makes it really easy to represent continuations as
a stack of SUBs.
- The continuation K ``looks like'' a SEQ (a particular kind of
SUB), in that it is a linear pair of a SUB and a K.
- The machine can run without nonlinear data allocation:
continuation push/pops are linear.
- One-shot full and partial continuations (segments of the K
stack) are (isomorphic to) linear lists.
- This isomorphism obviously works also the other way: linear
lists constructed at run time can be converted to one-shot full
or partial continuations. (This can be used to emulate one-shot
- Linear one-shot closures/continuations can be _copied_ to make
- They can also be _compiled_ to code (linear pairs represented by
nonlinear SEQ pairs) so they can be reused any number of times.
However, this is not a linear operation.
( Note that the isomorphisms need some work to figure out quoting
issues. Essentially it will boil down to different kinds of
continuation frames. )