Tue Sep 9 12:55:40 CEST 2008

old partial evaluation explanation

( This was removed from the blog and replaced with a post that
stresses the BINDING and QUASIQUOTATION mechanisms used to implement
it. )


So, how does it work?

       PE from (greedy) deterministic pattern matching
                 a typed template language

So, by fixing the algorithm used to implement PE, a language emerges
that is useful for other kinds of code generation. Let's spin that out
a bit.

PE in a concatenative language is quite straightforward: function
composition is associative, which makes evaluation order a parameter
to play with. Compositions of pure functions can be performed at
compile time to generate composite functions that are implemented more
efficiently, while other compositions have to be postponed to run time
because they depend on dynamic data (1).

This is because concatenative syntax abstracts away the composition
method: a function can always be trivially inlined, instead of being
invoked at run time using the run-time composition mechanism: the
machine's function call (2). When inlining multiple functions, there
can be an opportunity for specialization by moving some computations
to compile time. For example, inlining the functions [ 1 ], [ 2 ] and
[ + ] produces the inlined composition [ 1 2 + ], which can be
replaced by the composition [ 3 ]. This is automatic program
specialization by Partial Evaluation in its purest form.
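As an illustration, here is a minimal sketch of this inline-then-fold
step in Python (standing in for Purrr, which it is not): programs are
represented as lists of words, an assumption of the sketch.

```python
# A minimal sketch (not Purrr itself) of partial evaluation by inlining
# in a concatenative language.  Programs are lists of words; integers
# are literals and '+' is a pure primitive.

def inline(*programs):
    """Compose functions at compile time by concatenating their code."""
    return [word for program in programs for word in program]

def specialize(program):
    """Greedy peephole pass: evaluate pure ops whose inputs are literals."""
    out = []
    for word in program:
        if word == '+' and len(out) >= 2 and \
           isinstance(out[-1], int) and isinstance(out[-2], int):
            b, a = out.pop(), out.pop()
            out.append(a + b)          # move the computation to compile time
        else:
            out.append(word)
    return out

composed = inline([1], [2], ['+'])     # -> [1, 2, '+']
print(specialize(composed))            # -> [3]
```

Words whose inputs are not known at compile time simply pass through,
so [ x@ 1 2 + ] still specializes to [ x@ 3 ].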

In Purrr, the Partial Evaluator is not implemented as a separate
component. PE is a consequence of the actions of the machine model,
which is specified in terms of primitive Purrr macros. These macros
implement a map from recently generated code (the top of the
compilation stack) to new code to be generated (placed back on the
compilation stack). The primitives are expressed in a language with
deterministic pattern matching, which allows the specification of the
following compiler functionality:

  * target code generation
  * peephole optimization
  * partial evaluation
  * generic parameterized template instantiation

The first three could be counted as components of pure partial
evaluation. The last one however is not: it is an interface that
connects the concatenative macro language to explicit code generation
tools. It allows the use of templates that have no target semantics
until they are parameterized.
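The machine model behind this can be sketched as follows, again in
Python as a stand-in; the macro names and rules are illustrative, not
the actual Purrr primitives. Each word acts as a macro mapping the top
of a compilation stack to new code, and the rules for '+' cover the
three "pure PE" roles listed above.

```python
# Illustrative sketch (not actual Purrr) of the machine model: each word
# is a macro that maps recently generated code (top of the compilation
# stack) to new code to be generated.

def plus_macro(stack):
    # partial evaluation: fold [ a b + ] when a and b are literals
    if len(stack) >= 2 and isinstance(stack[-1], int) \
                       and isinstance(stack[-2], int):
        return stack[:-2] + [stack[-2] + stack[-1]]
    # peephole optimization: [ 0 + ] is the identity
    if stack and stack[-1] == 0:
        return stack[:-1]
    # target code generation: emit the instruction as-is
    return stack + ['+']

MACROS = {'+': plus_macro}

def compile_word(stack, word):
    # the default macro just pushes the word onto the compilation stack
    return MACROS.get(word, lambda s: s + [word])(stack)

def compile_program(words):
    stack = []
    for word in words:
        stack = compile_word(stack, word)
    return stack

print(compile_program([1, 2, '+']))     # -> [3]
print(compile_program(['x@', 0, '+']))  # -> ['x@']
```

Pattern matching is deterministic and greedy: the first matching rule
in a macro fires, and the rewritten stack is all that remains.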

Why is this useful?

Say you want to implement 'cos' as a function of two arguments like

    cos ( angle scale -- value )

Realizing that a true 'cos' function is never used in the target code,
because the scale can be fixed and is available at compile time, it
can be implemented as a template that generates a lookup table and
code to look up the value. If generic cosine routines become necessary
later, this template macro can be extended to compile a call to
library code when the parameter is not available at compile time. It
is surprising how often this pattern occurs: due to the lack of target
support for specific primitive abstractions, it is often easier to
write something as a template for specialized code. Note that this is
different from programming for non-embedded systems, where this
primitive functionality is usually available.
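A sketch of that template, in Python standing in for a Purrr macro:
when the scale is a compile-time literal it emits a lookup table plus
table-lookup code, otherwise it compiles a call to a generic library
routine. The names 'table', 'lookup' and 'cos_generic', and the table
size, are all hypothetical.

```python
# Hedged sketch of the 'cos' template macro: specialize when the scale
# parameter is known at compile time, fall back to library code when not.
import math

TABLE_SIZE = 16   # assumed table resolution, for illustration only

def cos_macro(stack):
    """Map recent code (top of the compilation stack) to new code."""
    if stack and isinstance(stack[-1], int):          # scale known now
        scale = stack[-1]
        table = tuple(round(math.cos(2 * math.pi * i / TABLE_SIZE) * scale)
                      for i in range(TABLE_SIZE))
        # emit the table data and an indexed-lookup instruction
        return stack[:-1] + [('table', table), 'lookup']
    # scale only known at run time: compile a call to library code
    return stack + ['cos_generic']

print(cos_macro(['angle@', 100]))       # table + 'lookup'
print(cos_macro(['angle@', 'scale@']))  # call to 'cos_generic'
```

The calling code never changes: it always looks like a two-argument
'cos', and the template decides how to instantiate it.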

The advantage of doing it this way is that the code is easier to read:
code expresses semantics more directly, without instantiation
annotations getting in the way. These annotations can be expressed
somewhere else, in the form of 'forth' and 'macro' mode indicators.
The disadvantage is that a lot of code will be pushed towards
implementation as a macro. If this is taken too far, opportunities for
code sharing may be lost. For that reason, moving between macro and
instantiated target code is made really straightforward in Purrr, but
it remains an explicit operation under programmer control.

Explicit code generation in Purrr is useful when

  * partial evaluation becomes too hard to do automatically
  * some on-target primitives are not available
  * algorithms are hard to express in concatenative syntax

So as long as it is possible to express a general algorithm in the
purely functional macro sublanguage, the built-in PE can be used to
specialize the code. The advantage here is that the difference between
compile time and run time can be silently ignored as an implementation
detail. In practice, however, it is sometimes easier to make the code
generation process a bit more explicit. In Purrr it is made very
straightforward to plug in arbitrary Scheme code for parameterized
code generation.
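Such an explicit generator can be as simple as an ordinary
host-language function that computes target code from its parameters.
A sketch, with Python standing in for Scheme; 'delay_loop' and the
words it emits are hypothetical.

```python
# Sketch of explicit, parameterized code generation: a host-language
# function computes the target code directly instead of relying on PE.

def delay_loop(n):
    """Generate concatenative code for an n-cycle 'nop' delay."""
    if n <= 4:
        return ['nop'] * n            # small delays: straight-line code
    return [n, 'for', 'nop', 'next']  # larger delays: a counted loop

print(delay_loop(3))   # -> ['nop', 'nop', 'nop']
print(delay_loop(10))  # -> [10, 'for', 'nop', 'next']
```

The generator is free to make decisions (here, unrolling versus
looping) that would be awkward to express as automatic specialization.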


Stack languages are interesting for writing parameterizable low-level
code because the composition mechanism is so simple:

 * They are very straightforward to implement on the target
   architecture with a small run-time footprint of two stacks.

 * Automatic specialization through partial evaluation is very
   straightforward to implement off-target.

 * Implementing the code generator (including PE) using deterministic
   pattern matching exposes an interface that can be reused for
   plugging in arbitrary parameterized code generators.

In Purrr, the code generator language is Scheme. Within Scheme all of
Purrr and the underlying compiler is exposed: you can decide to
generate (pseudo) assembly code, Purrr code, or interface to external
code generators.


(1) Of course, the Purrr target semantics is not purely functional. It
    contains language primitives that introduce machine state
    (determined by world state) through a side channel into the
    concatenative language. This is no problem for PE, since it merely
    limits PE to the subset of pure functions.  Procedures that depend
    on parts of the machine state other than the (threaded) parameter
    stack simply have to be instantiated, and cannot be performed at
    compile time.

(2) Except when it interferes with the implementation of the run-time
    function composition method, i.e. when it modifies the return
    stack. A more correct statement would be that the subclass of pure
    functions can always be trivially inlined.