Staapl

1 Introduction

Staapl is a macro assembler on steroids for PIC18F microcontrollers. It can be used for writing firmware optimized for size. It goes where C can’t go, and where assembler would be too complicated. It is based on Racket Scheme (code generator) and Forth (low-level language and machine model).

Staapl is hosted on Github. To install, install Racket and run the following command.

raco pkg install github://github.com/zwizwa/staapl/master

See the Racket package manager documentation for more information.

1.1 Background Ideas

The Scheme-metaprogrammable PIC18F Forth fits in a larger picture. Some loose ideas:

Staapl is an extension of the Racket programming language, a Scheme dialect with strong support for embedded domain specific languages (EDSL) written in terms of Scheme macros. A Staapl target language is a collection of target code generator macros.
The language for the Microchip PIC18 is a language based on on Forth, acting as a very thin abstraction layer over the PIC18 architecture. It has an optimizing peephole compiler, and a tethering mechanism with interactive console for running-target code updates through the PicKit2 programmer.
Unlike traditional Forth which is highly reflective, Staapl is built using macros with phase separation, which breaks the circular reflection into a directed language tower, creating a static semantics. The basic idea is that the meaning of the language doesn’t change inside one module, in contrast with original Forth’s approach. Most of Staapl is organized around a convenient abstract syntax based on a typical concatenative combinator language like Joy. This syntax can then be used as a base for different interpretations or as a target for concrete syntax frontends.
While the PIC18 language is a stack language, the basic code generator structure could support any kind of "algebraic" language, i.e. point-free languages without variable bindings. Note that point-free only refers to the basic composition mechanism. The core substrate of Staapl is still Scheme, so primitive macros can have local variables if that simplifies programming. This is used effectively in the Staapl Forth for PIC18.
The basic idea is that point-free languages are easily manipulated on the meta-level which makes them an ideal code generation target.
- Such languages have no variables that complicate code generation due to scope issues (hygiene).
- Thanks to the associativity of the operations of concatenation (syntactically) and function composition (semantically), computations are trivially partitioned in compile time and run time.
- Reliance on combinators for program structure (as opposed to explicit recursion and loops) allows for the derivation of program transformation laws that can be used for program derivation or compilation to efficient implementations.

1.2 Status

The phase I goal - a usable system for PIC18, be it with some rough edges - is reached, and the corresponding code is put in incremental improvement mode.
This base will serve as the basis for phase II: further experiments with point-free domain specific languages for DSP/control, static verification, and more elaborate compilation techniques to implement them.
At this time, Staapl is only practical for the PIC18 architecture. While writing other ports should be straightforward, it is a lot of work. I have some code for the 12 and 14 bit PIC cores, and I’ve been looking into porting staapl to PIC24/30/33 (dsPIC). The main hurdle is writing a peephole optimizer for the new architecture, and a more general assembler/disassembler architecture.

1.3 Projects

The PIC18 Forth language has been used in some art and education projects and workshops:

The Sheep sound synthesizer.
The ForthTV B/W composite video generator.
The CATkit and KRIkit boards.

1.4 Links

Some older documentation links below. Since this is a developer-oriented project with a lot of effort put into clean organization, I suggest looking at the source code to pierce through the top-layer of workflow glue. Staapl is structured as a library with multiple entry points, not a monolithic application. I encourage you to build your own workflow on top of it, and integrate it with other tools. (And share that code!).

An overview of the PIC18 macro Forth dialect.
A practical tutorial about interactive tethered Forth development.
A tutorial about writing a low-level Forth DSL using bottom up programming with procedures and macros.
Some background articles on the blog.
The excellent Racket documentation.
The mailing list.

2 Forth Compiler

To illustrate the general idea we’re going to use a concrete application: the Forth compiler for the 8-bit Microchip PIC18 microcontroller architecture. This section uses REPL interaction with some example code written on top of the compiler to demonstrate the code generation process.

(require pic18/demo)

2.1 Forth to Machine Code

The code> form provided by the demonstration module interprets the forms in its body as PIC18 Forth code, compiles them, and prints out the resulting intermediate code with one instruction per line.

> (code> 123)
[qw 123]

The instruction qw tells the target machine to load the number 123 on the run-time parameter stack. It is short for “quote word.” Providing a sequence of numbers in the code> body will generate concatenated machine code, which is executed by the target machine from top to bottom.

> (code> 1 2 3)
[qw 1]
[qw 2]
[qw 3]

Target code is represented in this intermediate form during the first code generation pass to facilitate code transformations. It consists of a mix of pseudo instructions and real PIC18 instructions. The code generator will eventually clean up all occurances of qw before attempting translation to binary machine code. The pic18> form performs this extra step and shows real machine code output.

> (pic18> 123)
[movwf PREINC0 0]
[movlw 123]

The first instruction stores the contents of the working register in the 2nd position on the parameter stack, and the second instruction replaces the contents of the working register with 123. Again, concatenating compiler input produces concatenated output:

> (pic18> 1 2 3)
[movwf PREINC0 0]
[movlw 1]
[movwf PREINC0 0]
[movlw 2]
[movwf PREINC0 0]
[movlw 3]

The intermediate instruction set which contains the qw instruction is useful for implementing partial evaluation rules. When compiling a particular Forth word, the compiler can inspect the code already compiled to determine if it can combine its effect with the effect to be compiled.

> (pic18> +)
[addwf POSTDEC0 0 0]

> (pic18> 1 +)
[addlw 1]

> (pic18> 1 2 +)
[movwf PREINC0 0]
[movlw (1 2 +)]

This illustrates 3 different modes of computation. The first program computes the addition at run-time, taking both input values from the runtime stack and putting back the result. The second program adds the literal value 1 to the top of the stack using a different machine instruction. The third program doesn’t perform any run-time computation at all and simply loads the result of the addition that was computed at comple-time because both inputs to the addition were available.

Note that in this last program the result of the compile-time computation is not shown. Instead it shows a program (1 2 +) that gives the result upon evaluation. The compiler doesn’t need to know the exact value at this point. It only needs to know that the value can be determined later when necessary. This is essential for integration with the assembler, since these expressions might contain symbolic representations of code locations that are not yet associated to numerical addresses.

Using the intermediate form with the qw pseudo-instructions to compile the Forth program 1 2 + shows the key idea: the target code list can be interpreted as a parameter stack, with the top of the stack at the bottom of the code list.

> (code> 1 2)
[qw 1]
[qw 2]

> (code> 1 2 +)
[qw (1 2 +)]

The code stack can be used as the argument passing mechanism for a language of macros that is active at compile time. Machine instructions then become datatypes of this language. The word + names a function that operates at compile time. It inspects the code stack and if it finds one or two qw objects it can use them as input to the addition operation and compile a simpler run-time instruction.

2.2 Scheme as Meta-language

While traditional Forth has its own metaprogramming facilities, combining it with a Scheme-based meta system gives the added advantage of tying into a powerful lexical scoping mechanism. The ability to assign local names to objects comes in handy when dealing with complex data structures, which is sometimes difficult to do in a combinator language like Forth or Joy. At the same time, the absence of lexical binding forms in Forth code make it very suitable to be handled as data, avoiding complications associated to name capture.

This sections introduces the forms macro:, compositions, patterns and tv: which comprise the metaprogramming interface between the Forth and Scheme languages in Staapl.

2.2.1 Code and Composition

A representation of a Forth program can be composed using the macro: form.

(define code1 (macro: 123))
(define code2 (macro: 1 +))

The objects created by macro: are called concatenative macros. These are functions that operate on compilation state objects.

> code1
#state->state

The target code associated to the objects code1 and code2 can be printed using the function state-print-code which extracts a code stack from a state object and prints it, and the function state:stack which produces a state object with an empty code stack to serve as an initial state.

(define (print-code code-rep)
(state-print-code (code-rep (state:stack))))

> (print-code code1)
[qw 123]
> (print-code code2)
[addlw 1]

It is seldom necessary to apply concatenative macros to state objects manually since composition using the macro: form will usually suffice. In practice such application is performed by the framework when generating target code.

The forms residing in the code body of a macro: form have a one-to-one correspondence to concatenative macros. Literals found in the body of a macro: are mapped to compilation state transformers that append qw instructions to the current code list. Identifiers are mapped to function values using the macro form, which fishes them out of the (macro) dictionary.

> (macro +)
#state->state

Note that the objects produced by (macro +) and (macro: +) are different, although they have the same behaviour. The former is a variable reference while the latter creates a new abstraction. This is similar to the distinction between the Scheme expressions + and (lambda (a b) (+ a b)).

Composing macros that are not in the (macro) dictionary is possible using the unquote operation. The macro: form behaves similar to quasiquote. Unquoted objects come from the Scheme lexical environment and are interpreted as macros.

(define code3 (macro: ,code1 ,code2))

> (print-code code3)
[qw (123 1 +)]

This illustrates the advantage of keeping target code in delayed form. The effect of both pieces of code has been combined into a single target operation.

It is possible to unquote Scheme values as literals by wrapping a quote form around the unquote form.

(define value 42)
(define code4 (macro: ',value))

> (print-code code4)
[qw 42]

To define new identifiers in a particular dictionary, we can use the compositions form.

(compositions (macro) macro:
(inc 1 +)
(dec 1 -))

> (print-code (macro: 100 inc))
[qw (100 1 +)]
> (print-code (macro: 100 dec))
[qw (100 1 -)]

The first two sub-forms in a compositions form indicate the target dictionary and body compiler respectively. The rest of the body consists of a list of lists, where the first element of each list is an identifier to which a macro will be associated in the dictionary, and the rest of the list is a code body that’s passed to the body compiler form.

2.2.2 Primitives

The previous section describes how to compose existing code to create new code by concatenation, and how to evaluate code into a form that can be passed to the assembler. This section will describe how to define primitive macros operating on stacks of target machine instructions.

Creating an instance of a machine code instruction is done using the op: form. It is exactly these objects that are produced when concatenative macros are evaluated.

(define ins1 (op: addwf 42 0 0))
(define ins2 (op: addlw 123))
(define ins3 (op: qw 123))

Real opcodes can be passed to the assembler to produce binary output. Pseudo instructions cannot.

> (op-apply ins1 0)
'(9258)
> (op-apply ins2 0)
'(3963)
> (op-apply ins3 0)
asm-pseudo-op: qw

However, pseudo instructions can be used to hold intermediate data during the compilation phase. The following will illustrate the use of a form to define new primitive operations. We’ll create a macro add that will behave like the macro + encountered before.

Creating new primtive macros is done with the patterns form. Its first subform specifies the dictionary to which definitions are associated. The rest of the forms contains lists of pattern and template pairs. The following example defines the add macro as not taking any input from the code stack, but producing an addwf instruction as output.

(patterns (macro)
((add) ([addwf POSTDEC0 0 0])))

The templates in a pattern form are lists of forms that are passed on to the op: form, resulting in lists of instruction objects. Verifying if it works gives:

> (print-code (macro: add))
[addwf POSTDEC0 0 0]

This is an example of a non-optimizing macro which only performs code generation. To add different behaviour for different input patterns, extra clauses can be added.

(patterns (macro)
(([qw a] add) ([addlw a]))
((add) ([addwf POSTDEC0 0 0])))

> (print-code (macro: 123 add))
[addlw 123]
> (print-code (macro: add))
[addwf POSTDEC0 0 0]

When a qw instruction appears in the input, it is deconstructed and its operand is used as the operand of a addlw operation. The patterns form is built on the PLT Scheme match form, deconstructing a stack of instructions according to input patterns, and constructing lists of instructions to be added to the compilation state to replace the matching top of stack.

Upto now our add doesn’t really perform compile-time computation other than selecting a different instruction based on the presence of literal data. Using the tv: form we can add proper computation when there are two literals available. For now, think of tv: as an RPN calculator behaving as in:

> (target-value->number (tv: 1 2 +))
3
> (let ([a 100]
[b 200])
(target-value->number (tv: a b +)))
300

The tv: form will install wrappers to enable computations that can only be performed after the assembler has assigned numerical addresses to code labels. The use of this compile-time calculator then leads to an implementation of the + macro whith 3 evaluation modes:

(patterns (macro)
  (([qw a] [qw b] add) ([qw (tv: a b +)]))
  (([qw a] add)        ([addlw a]))
  ((add)               ([addwf POSTDEC0 0 0])))

> (print-code (macro: 1 2 add))
[qw (1 2 +)]
> (print-code (macro: 123 add))
[addlw 123]
> (print-code (macro: add))
[addwf POSTDEC0 0 0]

2.3 Other RPN Languages

Staapl contains a generic concatenative language parser in staapl/rpn which is used to implement the languages macro:, tv:, scat: and target:. This language syntax can be extended with prefix parsers to implement a Forth-style prefix syntax for defining words.

For example the scat: form creates unary functions that operate on a state? object just like the macro: form. The state object on which they operate contains a single stack which can be accessed as a list with the state->stack function. This is the same stack which serves as the code stack for concatenative macros. The functions in the (scat) dictionary are thin wrappers around Scheme functions implementing the same behaviour but operating on the top of the state’s stack.

> (state->stack ((scat: 1 2) (state:stack)))
'(2 1)
> (state->stack ((scat: 1 2 +) (state:stack)))
'(3)

The form scat> is similar to the form code> in that it creates a concatenative function and applies it to a state with an empty stack before pretty-printing the result. The contents of the parameter stack is printed with the topmost element on the right side.

> (scat> 1 2 3 4)
<4> 1 2 3 4
> (scat> 1 2 3 4 + + +)
<1> 10
> (scat> 123 dup)
<2> 123 123
> (scat> '(a b c))
<1> (a b c)
> (scat> '(a b c) car)
<1> a
> (scat> '(a b c) uncons)
<2> a (b c)

The functions generated by scat: and macro: are compatible. However the macro: form creates functions that operate on tagged values representing target machine instructions.

> (scat> 123 ,(macro: 123))
<2> 123 (#<asm> 123)
> (scat> ,(macro: 123) dup)
<2> (#<asm> 123) (#<asm> 123)

2.4 Representation

Let’s open up the form macro: to see how it is implemented in terms of Scheme expressions. All RPN compilers are built around two forms: the definition compiler form rpn-lambda which builds lambda expressions out of tagged tokens and the dictionary compiler form rpn-parse which transforms token sequences into tagged token sequences according to the RPN language syntax.

To illustrate the implementation we use the pretty-expand function to perform macro expansion and pretty-printing with toplevel Scheme annotations like #%app removed. The function takes an optional second argument to indicate the kind of expansion. The default is full expansion using expand. In addition single expansion using expand-once is used.

The form (macro: 1 2 +) expands to an rpn-lambda form, which nests (folds) the subforms it receives.

(define stx1
  #'(rpn-lambda
     (macro-push 1)
     (macro-push 2)
     (scat-apply (macro +))))

> (pretty-expand stx1 expand-once)
(lambda (p)
  (macro-push 1 p (macro-push 2 p (scat-apply (macro +) p p))))
> (pretty-expand stx1)
(lambda (p)
  (let-values (((p) ((lit '1) p)))
    (let-values (((p) ((lit '2) p)))
      (let-values (((p) (macro/+ p))) p))))

The macro-push compiles a literal value while scat-apply compiles a function application. The macro form performs dictionary lookup by mapping the + identifier to macro/+ which is bound to the concatenative macro that compiles an addition operation. The lit function transforms a number into a concatenative macro that will compile the number. Let’s see them in action individually.

> (pretty-expand #'(macro-push 1 state next))
(let-values (((state) ((lit '1) state))) next)
> (pretty-expand #'(scat-apply macro/+ state next))
(let-values (((state) (macro/+ state))) next)

The rpn-parse form can be used to attach semantics to the core RPN syntax. Let’s illustrate it by creating a new RPN language represented by the program: form.

(define-syntax-rule (program: code ...)
  (rpn-parse (dictionary
              (resolve-name)
              function
              immediate
              quoted-program
              program:
              (rpn-lambda))
             code ...))
(define-syntax-rule (dictionary d) 'd)
(define stx2 #'(program: 123 add (foo bar) '(a b c)))

> (pretty-expand stx2)
'(rpn-lambda
  (immediate 123)
  (function (resolve-name add))
  (quoted-program (program: foo bar))
  (immediate `(a b c)))

The (rpn-lambda) subform in the definition of the program: form represents the initial content of the first dictionary element. The rpn-parse form will interpret the code forms it finds and append tagged forms to the current dictionary entry.

The expansion illustrates four syntactic structures in the RPN language. The 123 form represents a literal value compiled by the immediate form. The identifier add represents a function reference, with the name reference processed by the resolve-name form and the result of that compiled by function form. The (foo bar) form represents a quoted program compiled by the program: form and quoted by the quoted-program form. The '(1 2 3) form is a quote of a data structure which is a literal value handled by the immediate form.

The first subform of rpn-parse, here defined as dictionary, is the form to which rpn-parse expands, passing its dictionary output as subforms. We defined it here as a quote operation which will stop expansion of the dictionary form. By defining dictionary as something that will allow further expansion, and by defining the other forms passed to rpn-parse in the definition of program: we can introduce a semantics.

Suppose the RPN language we want to define uses a list as a state that’s passed from function to function to represent a stack. We can define a simple stack machine using these forms:

(define-syntax-rule (dictionary entry) (begin entry))
(define-syntax-rule (resolve-name n) n)
(define-syntax-rule (function fn state subform)
  (let ((state (fn state)))
    subform))
(define-syntax-rule (immediate expr state subform)
  (let ((state (cons expr state)))
    subform))
(define-syntax-rule (quoted-program . a) (immediate . a))
(define (add lst)
  (cons (+ (car lst)
           (cadr lst))
        (cddr lst)))

> (pretty-expand stx2)
(begin
  (lambda (p)
    (let-values (((p) (cons '123 p)))
      (let-values (((p) (add p)))
        (let-values (((p)
                      (cons
                       (begin
                         (lambda (p)
                           (let-values (((p) (foo p)))
                             (let-values (((p) (bar p))) p))))
                       p)))
          (let-values (((p) (cons '(a b c) p))) p))))))
> ((program: 1 2 3) '())
'(3 2 1)
> ((program: 1 2 add) '())
'(3)

The RPN languages macro:, target:, scat:, live:, tv: have the same syntactic structure with a different semantics attached by means of forms that are passed to rpn-parse in the definition of their program quotation forms. They all use dictionaries with a single element. Forth compilation however uses multiple entries. It uses extensions of rpn-parse to create new dictionary entries (i.e. the : word), representing named functionality.

2.5 Assembler

The assembler performs two tasks: converting symbolic representation of machine code to binary and determining code addresses. The former is relatively trivial. The staapl/asm module contains the special form instruction-set which implements a convenient syntax for creating assembler functions from strings representing bit field layout.

(instruction-set
(foo (a b c) "0101 aaaa bbbb bbbb" "1111 cccc cccc cccc"))
(define foo-op (op: foo 1 2 3))
(define start-address 0)

> foo-op
'(#<asm> 1 2 3)
> (for ((bin-op (in-list (op-apply foo-op start-address))))
(printf "HEX: ~x\n" bin-op))
HEX: 5102
HEX: f003

This defines the foo assembler instruction as a 2-word instruction with 5 bitfields: two constant bitfields and 3 variable ones. Whitespace in the strings are ignored, and can be used for nibble separation. Instructions defined using this form also define a corresponding disassembler function.

Target address allocation is non-trivial due to a circular dependency. Numerical constants can depend on numerical address values, while address values are determined by the size of instruction words, which can depend on the value of numerical constants (i.e. shorter instructions are used for small literal values or short relative jumps). To resolve this dependency, the assembler performs multiple passes using a monotonic relaxation algorithm. On every pass it re-evaluates expressions encoded in a target-value? object until code addresses stabilize. These expressions can be created by the compiler using the tv: form.

2.6 More

[To be documented].

At the core of the Forth code generator is an incremental compiler which constructs a control flow graph as its first output pass. Subsequent optimization passes operate on this structure.

The distribution contains some example Forth code and library routines that implement the target side of host to target tethering.

There is a significant body of code to perform run-time target access and incremental code compilation and upload. This functionality can be presented to the programmer as a Forth console, which can access host and target functionality.

Staapl provides the staaplc command line application which can be used to compile stand-alone Forth programs to binary, for upload with a microcontroller programmer tool.

3 PIC18 Forth

For the Forth language there exist ANSI and ISO standards. In good Forth tradition however, the Staapl Forth for the Microchip PIC18 is non-standard. It shouldn’t be particularly difficult to build a standard layer on top of it, so we will ignore that issue for now, and refer to our particular dialect as “the Forth”.

The Forth is based on a 2–stack machine model. One stack is used to pass data between subroutines while the other is used to keep track of procedure nesting and other control state. It is essential that they are independent, as opposed to languages like C which store parameters and return points on the same stack. For the PIC18, the model is a thin layer atop the concrete machine architecture: an 8-bit flash memory based microcontroller. The implementation actually uses three stacks: byte size parameter and retain stacks, and a 21–bit wide hardware return stack.

A Forth typically enables direct low–level machine resource access while providing a base for constructing high–level abstractions. Traditionally Forth is implemented in machine language. However, since the Staapl Forth doesn’t need to be self–hosting and so has no constraints about compiler complexity, it can be written almost entirely in macro form, instantiated by a meta–system running on a workstation instead of a small target system. Staapl’s Forth is subroutine threaded, with each Forth word corresponding to a piece of inlined machine code or a procedure call. See the introductory section for more information about how this works.

It should be noted that the Forth language layer is written on top of the concatenative macro layer Coma. Essentially it provides two extra features: Forth-style syntax in the form of prefix parsing words (like : and variable) that can manipulate identifiers by modifying the meaning of subsequent words, and target code instantiation which allows the result of macro expansion to be invoked at runtime as a machine code procedure.

In the following we use the form forth> from the demonstration module, which is similar to the forms code> and pic18> in that it compiles and prints code, but without any simplifications. It is different in two respects. It takes an input string which is passed to the Forth lexer, and it prints out a representation of a control flow graph which takes the form of labeled assembly code, which each basic block starting with a label.

> (forth> ": foo 1 2 3 ;")
foo: 0022 0044 6EEC [movwf PREINC0 0]
0023 0046 0E01 [movlw 1]
0024 0048 6EEC [movwf PREINC0 0]
0025 004A 0E02 [movlw 2]
0026 004C 6EEC [movwf PREINC0 0]
0027 004E 0C03 [retlw 3]

In addition to the assembly code, it prints two columns. The first one is the target Flash ROM address of the instruction, and the second is the instruction’s binary encoding.

3.1 Syntax

The Coma language on which the Forth is based uses an s-expression syntax as it is implemented using Scheme macros. A parser converts Forth code into s-expressions. This parser is based on the rpn-parse Scheme form we encountered before, and behaves similarly to Forth parsing words (those that use accept). The rpn-parser parser form and its exptensions translate a linear token stream into an s-expression representation of a Forth-style dictionary data structure. The token stream passed to rpn-parse is produced by a Forth-style word/whitespace lexer which is currently not extensible.

The dictionary is the same as we encountered in the single definition compilers like macro:. However, it has a different representation due to a different header structure. Also in Forth parsing, the dictionary typically contains more than one entry, with each one corresponding to a unique name.

Essentially, the dictionary abstracts two actions. Compilation using rpn-lambda as encountered before, and identifier definition using Scheme’s define form.

Compared to traditional Forth, the thing to note here is that the Staapl Forth is not circular in the sense that input parsing isn’t done by Forth words, but by separate entities called prefix parsers. These parsing words are extensions of the rpn-parse form, and implement behaviour associated to words like : and variabe. The main reason for this non–cyclic organization is to keep the Coma core as simple as possible so it can can support languages other than Forth using different preprocessing steps. A simple, non–reflective core language is easier to meta–program. However, breaking these reflective circles looses some flexibility. Extending the parser (or lexer) requires the use of a different language. The upside is that the language for rpn-parse extensions is a simple high–level rewrite language based on Scheme’s rewrite macro system. In practice defining new prefix parsing words isn’t often necessary: manipulating target names (generating collections of words as a module) is better done in Scheme.

To add some visual to this we can have a look at the intermediate dictionary s-expression rpn-parse creates before it is expanded further, and see how it is expanded into Scheme code. We did a similar thing in the program: form defined in the introduction. The last generated dictionary form is logged for debugging purposes in the forth-dictionary-log parameter.

> (pretty-print (forth-dictionary-log))
#<procedure:void>

This shows a list of two forms, both are macro calls to forth-word. The first form is an empty definition with only a header. In the second form, the forms following the name foo represent a macro call to rpn-lambda which is the heart of the RPN to Scheme compilation. The forth-word form creates the name foo in the proper namespace and will use the wrapper function wrap-word to postprocess the result of the rpn-lambda expansion.

Note that the dictionary data structure looks quite like a typical Forth dictionary, with some header information describing the semantics of the code body, and the body a sequence of instructions. This reflects the typical structure of a Forth parser/compiler. The parser effectively executes source code from left to right. The prefix parser represented by the word : will take the next identifier from the token stream and create a new dictionary entry with the proper header forms that implement its semantics.

3.2 Instantiation

In contrast to standard Forth which supports only direct style low–level macros based on the literal, postpone and immediate words, the Staapl Forth uses a high–level pattern–based syntax for macro definition. As a result, the syntax for Forth words and macros is essentially the same. More specifically, there exists a syntax subset for which compositions can either be implemented as meta–system macros or as target words.

Forth words are defined using the :forth prefix parsing word, while macros are defined using the :macro word. Both words behave as the standard Forth : word. Alternatively the forth and macro words can be used to switch the meaning of the : word to mean :forth or :macro respectively.

From here on we’ll use a more direct interface to the PIC18 compiler. Code represented as

macro

: add + ;

forth

: plusone 1 add ;

will be passed as a single string to the forth-compile form. It accepts a string of Forth code and updates the code registry. We’ll access the registry using the code function:

(define (code)
(code-print)
(code-clear!))

> (code)
plusone:
0028 0050 0F01 [addlw 1]
0029 0052 0012 [return 0]

This is essentially how the compiler works in actual use, except that code will not be prettyprinted, but exported as a binary file and an associated symbolic dictionary.

In the Forth code above the word add was defined as a macro, while plusone was defined as a target word. In the output assembly code it can be seen that all references to add have disappeared. Only its effect remains: the compilation of an addlw instruction.

Using these language constructs, the Forth programmer can decide which compositions are to be present as run–time callable code on the target machine, and which are always inlined. Whether one or the other is better depends on the context. Staapl’s Forth allows one to make these decisions manually, with a convenient way to switch between the two forms. Let’s define add as a target word instead.

forth

: add + ;

: plusone 1 add ;

> (code)
add: 002A 0054 24ED [addwf POSTDEC0 0 0]
002B 0056 0012 [return 0]
plusone:
002C 0058 6EEC [movwf PREINC0 0]
002D 005A 0E01 [movlw 1]
002E 005C D7FB [jsr 1 add]

The rule of thumb is that by default one creates Forth words as this often leads to smaller code and is easier to debug, except when one wants to use syntactic features not available to Forth words (like lexical parameters for macros), or when macros yield smaller and/or faster code because of specialization. The machine’s primitives are by default implemented as macros that can perform inline code specialization.

The Forth language in Staapl is inteded as the lowest level system programming layer. It is essentially an assembler with a powerful macro system. Other languages with a higher level semantics might be built on top of Coma, but the Forth provides the manual low–level access when it is needed.

3.3 Macro vs. Forth

The differences between macros and forth procedures lie in their composition system. Target procedures use the machine instruction pointer and allow manipulation of the hardware return stack, while macros are essentially lexically scoped Scheme procedures.

Target words allow multiple entry points with code fallthrough. This is specified in source code by not terminating a word definition with ; such that execution continues with the following word. Target names behave essentially as assembler labels. Sequential code in Forth source leads to sequential machine code. Macro words support only single entry points, as they are separate entities without an order relation. Macro words always need to be terminated by a ; word.

forth

: plustwo 1 +

: plusone 1 + ;

> (code)
plustwo:
002F 005E 0F01 [addlw 1]
plusone:
0030 0060 0F01 [addlw 1]
0031 0062 0012 [return 0]

Target words allow multiple exit points. Macro words simulate this feature by jumping to the end of the macro expansion for each occurance of the ; word.

Macro words allow the definition of lexical variables using the | word. These variables are taken from the compile–time value stack and behave as constants: occurence of the variable in code pushes the referred value. Target words do not support this feature.

macro

: 3dup | x | x x x ;

forth

: example 123 3dup ;

> (code)
example:
0032 0064 6EEC [movwf PREINC0 0]
0033 0066 0E7B [movlw 123]
0034 0068 6EEC [movwf PREINC0 0]
0035 006A 0E7B [movlw 123]
0036 006C 6EEC [movwf PREINC0 0]
0037 006E 0C7B [retlw 123]

Note that lexical names can be introduced anywhere in a macro body. They are valid until the end of the macro definition.

Lexical variables can be used to create macro closures. The words [ and ] can be used to create an anonymous macro that can be passed around as a compile time value. The word i invokes such a quoted macro.

macro

: make-twotimes | x |

[ x x ] ; \ create a nameless macro

: invoke-add

i + ;

forth

: example

10 make-twotimes \ create the macro closure

invoke-add ; \ pass it to another macro

> (code)
example:
0038 0070 6EEC [movwf PREINC0 0]
0039 0072 0C14 [retlw (10 10 +)]

The target run–time procedure composition mechanism is available to macros. This is useful for writing macros that abstract non–standard control flow. The exit word always means procedure exit, while ; means procedure exit in Forth words, but designates macro exit for macro words.

Another important distinction has to be made regarding arithmetic and logic operations and numeric representation. Once instantiated, these are limited to the target word size. During compile time computations however, infinite precision Scheme bignums are used.

forth

: one 1000000000000

999999999999 - ;

> (code)
one: 003A 0074 6EEC [movwf PREINC0 0]
003B 0076 0C01 [retlw (1000000000000 999999999999 -)]

Compare the opcode with a previous one and note that it really returns the 8-bit value 1.

3.4 Modules

Forth code can be organized in modules. These are implemented as PLT Scheme modules to make it easier to mix Scheme and Forth code in the same project. Scheme modules for PIC18 usually contain macros that expand to a pic18-begin form.

To import a Forth module in a Forth project use the require, planet or staapl words. This example imports the route macro and its supporting target code which is used in the subsequent examples.

require staapl/pic18/route

> (code)
_route/e:
003C 0078 60EF [cpfslt INDF0 0]
003D 007A 6EEF [movwf INDF0 0]
003E 007C 50ED [drop]
_route: 003F 007E 44E8 [rlncf WREG 0 0]
0040 0080 26FD [addwf TOSL 1 0]
0041 0082 0B01 [andlw 1]
0042 0084 22FE [addwfc TOSH 1 0]
0043 0086 50ED [drop]
0044 0088 0012 [return 0]

This is the same as

staapl pic18/route

Modules will only be instantiated once, registering target code in a central repository.

3.5 Byte cells

Standard Forth needs a cell size of at least 16 bits. The Staapl Forth uses 8–bit cells for data and retain stack, and uses a separate wide stack for holding return addresses.

Keeping the cell size equal to the native data word size makes it simpler to directly represent the machine architecture as a stack machine through a collection of macros that map to simple instructions. The small cell size isn’t much of a problem for simple computing tasks. However, target code addresses do not fit in a single cell which leads to some non–standard behaviour. Most notably the execute word now takes two parameters, taking the high byte of the word address on the top of the parameter stack. Another notable point is that loops using for next are limited to 256 iterations.

The fact that a word address does not fit in a variable or a single stack cell promotes the use of byte tokens for representing delayed behaviour, which are implemented in terms of jump tables using the route word.

forth

: one 1 ;

: two 2 ;

: three 3 ;

: four 4 ;

: interpret-byte

route

one . two . three . four ;

> (code)
one: 0045 008A 6EEC [movwf PREINC0 0]
0046 008C 0C01 [retlw 1]
two: 0047 008E 6EEC [movwf PREINC0 0]
0048 0090 0C02 [retlw 2]
three: 0049 0092 6EEC [movwf PREINC0 0]
004A 0094 0C03 [retlw 3]
four: 004B 0096 6EEC [movwf PREINC0 0]
004C 0098 0C04 [retlw 4]
interpret-byte:
004D 009A DFF1 [jsr 0 _route]
.L5: 004E 009C D7F6 [jsr 1 one]
004F 009E D7F7 [jsr 1 two]
0050 00A0 D7F8 [jsr 1 three]
0051 00A2 D7F9 [jsr 1 four]

The . word is the same as exit except that it does not mark the subsequent code as unreachable. The reason this abstraction works is that a procedure call followed by a procedure exit is always compiled as a jump. This means the Staapl Forth is tail-recursion safe.

4 Reference

(require staapl/macro)

syntax
(scat: rpn-form ...)

Most basic RPN form with the following grammar

rpn-form	=	id
	\|	literal
	\|	(rpn-form ...)
	\|	(quote rpn-data-form)
	\|	(unquote scheme-form)

rpn-data-form		=		scheme-data-form
		\|		(unquote scheme-form)

Here literal is any form that is not an identifer or a list, form is any Scheme form and scheme-form a Scheme form expanding to a state transformer function.

syntax
(macro: word ...)

syntax
(op: word ...)

syntax
(tv: word ...)

Creates a composite function similar to scat: except that it creates a target-value? which when forced, evaluates the composite function on a state? with an empty stack. If this evaluation produces a one-element stack this value is returned. Otherwise an exception is raised. Also, identifiers in the subforms word are interpreted as quoted target-value? if they are bound in the lexical environment of the form. Otherwise they come from the scat dictionary.

procedure
(target-value? tv) → boolean?
tv : any/c

A delayed computation that can be forced with target-value-eval or target-value->number. It may depend on the target-word-address of any target-word? it contains.

procedure
(target-word? tw) → boolean?
tw : any/c

A symbolic representation of a target code address. A node in a control flow graph. Assigned a value by the assembler. Can be used in target-value? expressions representing a numerical value.

syntax
(macro id)

syntax
(patterns dictionary (pattern template) ...)

syntax
(compositions dictionary composer (id word ...))

procedure
(state:stack) → state:stack?

Creates a state? object with an empty stack.

procedure
(target-value->number tv) → number?
tv : target-value?

procedure
(op-apply op org) → (listof number?)
op : op?
org : number?

procedure
(state->stack state) → list?
state : state?

procedure
(state-print-code state) → void?
state : state:stack?