Tue Jul 21 12:43:20 CEST 2009

An approach to bootstrapping static semantics

It is still a bit muddy what exactly I mean by ``static semantics'',
but as I see it now, this can be turned into an advantage.

1.  It is important that Staapl be, at the bottom level, a macro
    assembler.  One should not lose _any_ ground to abstraction, but
    improve on the current concept of `assembler'.  Concretely, what
    this needs in order to support higher levels is a way to make
    verification possible by defining concrete machine languages in
    terms of a common operational semantics (i.e. a simulator for each
    supported architecture).

    Bottom Staapl = macro assembler + simulator

    This is enough to provide ``dynamically typed specification by
    compiler'', which when done in an integrated environment (Scheme
    macros with proper name management) already improves quite a bit
    over current standard practice.

2.  Provide a means to raise the _static properties_ of the
    abstractions used, by facilitating the bootstrap of type systems
    or any other kind of static source code property.  Tools that help
    would take the form of verification (approximate, static) and
    testing (dynamic), both based on the low-level operational
    semantics.

    Top Staapl = collection of tools to build higher level semantics
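
To make the two layers concrete, here is a minimal sketch in Python
(not Staapl code).  Everything in it is an assumption made for
illustration: the three-instruction 8-bit stack machine, and the names
lit, add, seq, add_lit, simulate and stack_effect.  The bottom layer
is a macro assembler plus a simulator; the top layer is represented by
one approximate static property (stack depth) that can be compared
against what the simulator does.

```python
# Hypothetical 8-bit stack machine, invented for this sketch.
# An instruction is a tuple: ("push", n), ("add",) or ("drop",).

# --- macro assembler: a macro is a function returning instructions ---

def lit(n):
    return [("push", n & 0xFF)]

def add():
    return [("add",)]

def seq(*fragments):
    """Concatenate code fragments into one program."""
    return [ins for frag in fragments for ins in frag]

def add_lit(a, b):
    """Macro that performs the addition at assembly time."""
    return lit(a + b)

# --- simulator: the operational semantics of the machine ---

def simulate(code, stack=()):
    s = list(stack)
    for ins in code:
        op = ins[0]
        if op == "push":
            s.append(ins[1])
        elif op == "add":
            b = s.pop()
            a = s.pop()
            s.append((a + b) & 0xFF)
        elif op == "drop":
            s.pop()
        else:
            raise ValueError("unknown instruction: %r" % (ins,))
    return s

# --- static property: stack effect, computed without running code ---

EFFECT = {"push": (0, 1), "add": (2, 1), "drop": (1, 0)}  # (takes, gives)

def stack_effect(code):
    """Return (required input depth, net depth change)."""
    depth = 0
    required = 0
    for ins in code:
        takes, gives = EFFECT[ins[0]]
        required = max(required, takes - depth)
        depth += gives - takes
    return required, depth

# Dynamic test: the folded macro must behave like the run-time code.
prog = seq(lit(200), lit(100), add())
assert simulate(add_lit(200, 100)) == simulate(prog)

# The static estimate agrees with the simulated stack depth.
required, net = stack_effect(prog)
assert required == 0 and net == len(simulate(prog))
```

The simulator is the only thing both checks rely on: the test runs it
directly, while the static analysis is a cheaper approximation whose
answers can be validated against it.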

The overall goal of this is to make sure that a programmer can stay
within a certain ``safety level'', but has a straightforward way to
build abstractions that bridge low and high level by dropping out of
the safe zone in a controlled manner.  This is quite a general idea,
and there are many systems that attempt it.  In order to make it fit
in my mind, I make a couple of assumptions:

One assumption is that picking a combinator language as the part that
is able to drop to machine level is a good choice.  This started from
``gut feeling'' but I've found at least two good reasons to date:
1. it avoids all issues that come with improperly closed lexical
scope, which makes meta-programming trivial (the machine can use it),
and 2. as a language in itself it is quite effective for building
reasonably complex yet well-factored systems (the human can use it).
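
The first reason can be illustrated with a small Python sketch.  It is
not Staapl code; the word set (dup, +, *), the definitions in defs and
the names run and inline are all assumptions made for this example.
A program is a flat list of words, so concatenating two programs
composes them, and a meta-program can splice code without renaming
any variables, because there are none.

```python
# A program is a list of words: integers push themselves, strings
# name primitive operations.  Hypothetical word set for this sketch.

def run(prog, stack=()):
    s = list(stack)
    for w in prog:
        if isinstance(w, int):
            s.append(w)
        elif w == "dup":
            s.append(s[-1])
        elif w == "+":
            b = s.pop()
            a = s.pop()
            s.append(a + b)
        elif w == "*":
            b = s.pop()
            a = s.pop()
            s.append(a * b)
        else:
            raise ValueError("unknown word: %r" % (w,))
    return s

def inline(defs, prog):
    """Meta-program: expand defined words by plain splicing."""
    out = []
    for w in prog:
        if w in defs:
            out.extend(inline(defs, defs[w]))
        else:
            out.append(w)
    return out

defs = {"double": ["dup", "+"], "quad": ["double", "double"]}

# Splicing needs no capture-avoiding substitution.
expanded = inline(defs, [3, "quad"])
assert expanded == [3, "dup", "+", "dup", "+"]
assert run(expanded) == [12]

# Concatenation of programs is composition of their effects.
p, q = [3, "dup"], ["*"]
assert run(p + q) == run(q, run(p))
```

In a language with named variables, inline would have to worry about
name clashes between the spliced body and its context; here the list
append is the whole story.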

Another assumption is that leaving the lowest layer _untyped_ is a
good idea: it keeps things simple and allows ``locally contained
hacks'' inside the machinery for higher level constructs, without
having to leave the system.  Note that this might disappear gradually
depending on how well the simulation-based static analysis works out
in the end, but it is my belief that a certain amount of hackery is
going to be essential to get any high-level structure implemented
efficiently.  Containing these hacks is what I see as the main problem
in electronics design and low-level systems programming.