Sun Sep 14 16:43:08 CEST 2008

Can you sell a language?

( Appeared on the blog on Sun Aug 24 14:39:43 CEST 2008 but removed
  because if its rambling style.  This problem is much deeper than I
  assumed, and I need to take another approach. )

This article is an attempt to clear out some ideas about design
choices in Staapl, from the _cultural_ point of view.  What is culture
and what has pure technical merit?  It turns out this is not so easy
to answer.

I don't remember who said it, but it was at a keynote talk at MWSCAS
2000: ``As engineers, we should be aware about not falling in love
with our approach.''  It's an idea that I very much needed to hear,
and which taps me on the shoulder from time to time.  Ask yourself this
question: how much of your approach is just tradition?

A couple of days ago I ran into this post on LuT:

A nice display of insight.  This made me think a bit about the real
merit of Staapl, aside from the cultural aspects tied to the Scheme
and Forth paradigms (it's cleaner).  See the previous post that tries
to identify the design choices, illustrating possible other approaches
to the problem Staapl solves.

Of the 4 trade-offs illustrated, I would say the first two are
technical and the last two are cultural.  The choice between library
or language is what Frank Atanassow stresses in the LuT comment.  I
have no data to prove it, but it does look like implementing a
compiler in a functional language is simpler because the problem of
compilation is expressing maps between data structures.

Compiler writing, a hobby for data structure junkies:

Within functional programming the typed vs. untyped choice is one of
much debate.  I picked a middle road here, using PLT Scheme, a dynamic
language with static aspirations.  This choice is definitely about
culture, but in my opinion not so terribly important.  The main
players that set aside Scheme in the land of dynamic languages are
hygienic macros and PLT's module system.

Using a stack machine as a machine model is quite a gamble because of
the incredible pressure of the C programming language, which fueled
the rise of RISC machines.  However, going against this culture makes
things simpler.  Maybe it's a false hope, but I have the idea that
once parallel programming becomes _culturally accepted_, machine
architectures will become simpler.  So, this choice is cultural or
better _counter-cultural_ .

Let's try to answer Frank's 5 questions:

   1. What problem does this language solve? How can I make it

As a representation language it contains an easily retargetable
machine model that supports a simple metaprogramming framework in the
form of quasiquotation, combined with a binding form for target code
parttern matching.

   2. How can I show it solves this problem? How can I make it

A subjective statement: partial evaluation for concatenative language
(string rewriting) is simpler to understand than lambda expression
reduction (tree reduction rules).

   3. Is there another solution? Do other languages solve this
      problem? How? What are the advantages of my solution? of their
      solution? What are the disadvantages of my solution? of their

Yes.  Metaprogramming based on (typed) lambda calculus: MetaOcaML.
The disadvantage of my solution is the lack of type system (which does
look like it is straightforward to add) and the forced use of a
point-free programming style.

   4. How can I show that my solution cannot be expressed in some
      other language? That is, what is the unique property of my
      language which is lacking in others which enables a solution?

It can, therefore it is a Scheme library.

   5. What parts of my language are essential to that unique property?

Scheme -> Coma quasiquotation, machine code pattern matching binding
construct, and semantics gueded by partial evaluation transformation

                             * * *

Both Forth and Lisp are firmly rooted in a culture of thought that you
can only appreciate once you went through the "aha-moment" yourself.
This statement reflects some of the smugness present in these
programmer cultures.  A smugness that deters many people new to the
paradigms.  But from my experience (I come from a C/assembly
background) I can testify that after this aha moment, you can't look
at things in the same way as before.

So for those that had the aha for both Forth and Lisp, I don't think
much justification for Staapl is necessary.  You are probably not
interested, because you're writing your own abstraction, right this

The rest of this article is for the other audience I try to reach:
embedded engineers that do not know Forth or Lisp and their relation
to macros and staging.

                             * * *

The problem with abstraction knowledge is that it is difficult to
bring up the motivation to aquire it when you already have a method
that is ``good enough''.  It's certainly not a good thing to make
things too abstract.


But sometimes the smart thing to do is to move beyond the barriers of
the programming system, up the abstraction ladder.

The main point is: for complex code that requires lowlevel control in
such a way that excludes the use of higher level languages directly
(can't trust that compiler!), it might be a good idea to find an
intermediate solution between using a low or a high level language
approach exclusively.

Use a highlevel language to _generate_ code for a lowlevel one.  Don't
trust a generic compiler to generate good code for you, write a
specific one yourself by automating the high -> low translation of the
abstractions you need.  If you can ``manually compile'' a highlevel
solution description by expressing it in a low level programming
language, and perform the same kind of translation multiple times, you
can probably automate this process.

The good news is that writing a special purpose compiler for a
specific abstraction is quite a lot easier than trying to write a
compiler that optimally tries to compile any kind of program.
Additionally, this problem can be simplified by using a set of
primitive compiler components.

So, let's suppose you've decided you need a staging solution.  Why
would you choose Staapl?


I know I can't sell Lisp.  I know I can't sell Forth.  But I know that
if I want to build a framework for code generation (a.k.a. model based
design), I can't ignore the knowledge encoded in the design of the
Forth and Lisp languages.  To use either of both seems like a
limitation: Forth's over-concrete standard semantics doesn't use a
source representation that is abstract enough and Lisp's (or better
Scheme's) machine model is too high level.  The Joy language seems to
be the grain boundary between these two models.  Staapl is really
nothing more than a possible implementation of the bridge between a
simple machine model and code generation viewed in a functional
programming setting.

The Staapl approach, through being built on paradigms from Forth and
Lisp, brings together a large collection of design choices that formed
the Forth and Lisp languages, leading to a fairly optimal encoding
system.  The path is quite straightforward: start with Forth, a
language that is already good at efficiently encoding problems
occuring in embedded design, and and replace its compiler, usually
written in Forth, by a mechanism that provides an impedance match to
languages based on lambda abstraction.  This is done by viewing Forth
macros as functional combinators (the Joy language model).  These
macros provide an API point that separate a multi-target backend from
highlevel metaprogramming and code generation on top of an abstract


In staapl the impedance match between lambda abstractions and
combinators happens in two places.  The Scheme->Scat quasiquoting
mechanism allows the construction of parameterized _concatenative_

 (let ([a 123]
       [b (macro: +)])
   (macro: ',a ,b))          ;; results in (macro: '123 +)

while a pattern matching mechanism for target (stack)machine code
allows the use of lambda abstraction to build _primitive_ code

 (patterns (macro)
   (([qw a] [qw b] +) ([qw (+ a b)])))

In other words: Staapl removes some of the difficulty of having to use
a concatenative language at the higher abstraction layers, while
keeping the simple concatenative model as the basic composition
mechanism for metaprogramming.

The real advantage of Staapl is its structure as a machine model
(primitives + composition mechanism) combined with a programmable code
generator adapted to that model (composition of concatenative macros)
that is accessible from an applicative Scheme language through its
lexical scoping mechanism.

The simple 2-stack machine model allows simple re-targeting to any
machine architecture, with a slight bias towards the low-end ones,
where C might be sub-optimal from a code size perspective.  The overal
algebraic feel of combinator code is an asset for writing code
generators.  In short, stack languages are good for writing lowlevel
primitives.  Staapl embeds this stack model in Scheme, which is based
on the more standard function application model, and as such
interfaces easer to generic domain-specific models, without requiring
a specification in a stack language representation.

To conclude, the merits of Staapl summarized:

  * minimalistic machine model
  * simple staging framework (code = sequence instead of directed graph)
  * language promotes fine granularity (small functions / macros)
  * functional languages make source transformation easier
  * a bridge to lambda-based abstraction (quasiquoting + patterns)
  * once in scheme, use scheme's function+macro composition mechanisms