Wed Aug 6 21:20:34 BST 2008

Coma semantics

The semantics of Coma is defined in terms of concrete step by step
deterministic rewrite rules.  This post tries to find the relation to
declarative rewrite rules and place everything in a more formal
setting.  For a more concrete step-by-step exposition of how this is
implemented, see:


The goal of the compilation process is to translate a source
concatenative term s into a target concatenative term t.
Concatenative terms are either concatenations of concatenative terms
or primitive terms.

Let  s = s0 s1 ... | s'   source programs
     t = t0 t1 ... | t'   target programs

Here s, t are terms and s', t' are primitive terms. Denote the set of
s and t as S and T, and the set of s' and t' as S' and T'.

Basicly, pure concatenative translation maps primitives to primitives
S' -> T'.  The complete compilation expands a source concatenation
into into a concatenation of primitives which then one by one can be
translated to a target primitives.

In order to make this a bit more interesting, we allow substitutions
to occur in the language and make the translation more general.
Substitutions replace a subconcatenation with another one.

   Define the translation system {p,r} consisting of a map p : S' -> T
   together with a collection r of rewrite rules on T.

The map p associates primitive source terms with concatenations of
primitive target terms.  The rewrite rules r allow the construction of
a reduction system that translates the concatenation of terms p(s0)
p(s1) ... obtained from the source program s = s0 s1 ... to a term t_n
in normal form.

The alternative approach where substitution rules are defined on S
followed by a direct primitive map S'->T' is equivalent, so we
can concentrate on rewrite rules in T only.

Now, the implementation in Coma is a bit different, and modeled after
standard Forth macros.

   Define the translation system {m} where m : S' -> T -> T maps each
   source primitive to a target code transformer T -> T (1)

In uncurried form, Coma replaces the concatenation operation by a
binary operator T x S' -> T.  Given a source program S = S0 S1 ...
this operator can be used to create a whole program concatenation
operator S0' x S1' x ... -> T by folding the binary operator over S
initalized with the empty program.

The question is, how are {p,r} and {m} related? The rewrite system r
obviously needs an algorithm to implement it and some constraints that
allow the existance of normal forms for all terms t.  I think the
interesting questions are then:

  * What are these constraints and how to derive the representation of
    the reduction algorithm {m} from a higher level declarative
    representation {p,r}?

  * Is it useful to implement it like that, or is a more direct
    pattern matching approach good enough? (2)


(1) Stack semantics in Coma comes from an extra constraint that gives
    the code transformers T -> T some locality by having them operate
    only at the end of the primitive code concatenation, essentially
    viewing the target program string as a stack.  The specification
    language for primitive transfomers in Coma enforces this, by
    building on top of Scat.  However the general principle exposed
    here is applicable to more general point-free languages.

(2) The Coma pattern matching approach {m} is not exactly limited.  A
    central idea in Coma is that the (directly) expressed rewrite
    rules encode 2 kinds of equivalence relations.  The first one is
    partitioning of target programs with the same target semantics.
    For PIC18 the resulting reduction is mainly a reduction of program
    size. This is a fairly straightforward OPTIMIZATION technique.
    The second relation collapses concatenation of pseudo instructions
    that have no individual target semantics into constructs that do.
    This is the PARAMETERIZATION part of Coma.