Sat Aug 22 08:15:00 CEST 2009

Remarks about the PF interpreter

Refer to entry://20090816-120957

Note it is important to distinguish the byte code graph data structure
from an AST or the untyped s-expression encoding of _symbolic_ code.

The SEQ type can have sharing (shared subroutines) or loops (direct or
mutual recursion), while an AST or an s-expression is really a flat

    - symbolic source code like:  ``1 dup cons''

      here the symbols are translated by a dictionary (vocabulary)
      that maps symbol -> SUB.  symbolic code is a _tree_ represented
      as an _untyped_ s-expression.

    - `byte code graph'.  Each symbolic source code definition can be
      compiled to a single SUB (subroutine) type, which is _directly
      linked_ to other SUB types by data sharing.  Because it's a
      graph, this data structure is not easily printed.

    - AST: because the original syntax is so simple, the intermediate
      form one would traditionally use in an interpreter = a data
      structure with a recursive type that represents the same _tree_
      as a source code s-expression, is _not_ present in the current
      PF implementation.  Compilation goes straight from symbolic code
      as non-annotated s-expressions to the `byte code graph'.


      One could _implement_ a SEQ(foo,bar) as the following machine

                  call <address of foo>
                  jump <address of bar>

      Note that the `first' slot in a SEQ will push the return stack
      when it points to another SEQ.  If foo were a prim, the `call'
      instruction could be replaced by inlined machine code
      implementing the primitive, or a call to a machine code

      In the case 'bar' is a subroutine that's used only once, it
      could be inlined to eliminate the jump.

      Alternatively, if the machine has a literal push instructiom, an
      even simpler representation would be something like:

                  push <address of bar>
                  jump <address of foo>