Sat Aug 22 08:15:00 CEST 2009
Remarks about the PF interpreter
Refer to entry://20090816-120957
Note it is important to distinguish the byte code graph data structure
from an AST or the untyped s-expression encoding of _symbolic_ code.
The SEQ type can have sharing (shared subroutines) or loops (direct or
mutual recursion), while an AST or an s-expression is really a flat
- symbolic source code like: ``1 dup cons''
here the symbols are translated by a dictionary (vocabulary)
that maps symbol -> SUB. symbolic code is a _tree_ represented
as an _untyped_ s-expression.
- `byte code graph'. Each symbolic source code definition can be
compiled to a single SUB (subroutine) type, which is _directly
linked_ to other SUB types by data sharing. Because it's a
graph, this data structure is not easily printed.
- AST: because the original syntax is so simple, the intermediate
form one would traditionally use in an interpreter = a data
structure with a recursive type that represents the same _tree_
as a source code s-expression, is _not_ present in the current
PF implementation. Compilation goes straight from symbolic code
as non-annotated s-expressions to the `byte code graph'.
One could _implement_ a SEQ(foo,bar) as the following machine
call <address of foo>
jump <address of bar>
Note that the `first' slot in a SEQ will push the return stack
when it points to another SEQ. If foo were a prim, the `call'
instruction could be replaced by inlined machine code
implementing the primitive, or a call to a machine code
In the case 'bar' is a subroutine that's used only once, it
could be inlined to eliminate the jump.
Alternatively, if the machine has a literal push instructiom, an
even simpler representation would be something like:
push <address of bar>
jump <address of foo>