Mon Aug 24 17:16:15 CEST 2009
SEQ -- why encode the byte code as a binary tree instead of nested lists?
(From a discussion with Dominikus.)
Why use a binary tree instead of a list to represent byte code?
We start with the assumption that we agree on representing byte code
as a recursive data type (i.e. as opposed to using low-level arrays
and pointers/indices), and that we agree on wanting proper tail call
handling so loops can be expressed as recursive calls without blowing
up the context stack.
To do this correctly, given a concatenative function specified by the
code sequence "a b c ... z", the _last_ procedure `z' needs to be
handled in a special way. Before `z' is invoked, no context needs to
be saved because there is nothing to do afterwards. This is in
contrast with the function calls `a', `b', `c' ... which need the
computation to resume and thus require a context save.
Now, in the case of a compiled language, you have two options. You
either perform this special treatment of the last element in the list
at run time, or you perform it at compile time and represent the code
in a data structure that doesn't need any special case handling by the
This is exactly what SEQ does: it provides a data structure which can
be interpreted without special-casing. The `handling' of tail calls
is done at compile time only, by encoding the program as a binary
tree. Interpreting the binary tree has only one case: the first
branch needs context save, while the latter one doesn't.