Sun Feb 17 19:44:33 CET 2013

Python-style coroutines in C

One of the patterns that came up a lot in a project I worked on
recently is state machines that parse "finitely nested" data

What I mean is that in order to parse data structures with unlimited
nesting, one generally needs a "task", i.e. something with an
execution (nesting) stack, i.e. a recursive decent parser.

However, in practice, many data structures have a finite structure,
i.e. lists of lists of things.  Those do not need a stack, just a
(finite collection of) state variable(s).

( From the human-understanding p.o.v. it surprises me that I've never
  seen this clear enough until recently.  Maybe my bias is too much
  towards arbitrary nesting depths, i.e. programming langauges? )

So what's the problem?

Suppose you're writing a parser for a nested data structure that needs
to be suspended, i.e. the data is not available all at once, but only
in small chunks.

Traditionally, there are 2 ways to do this in C:

  - Use a thread.
  - Use an explicitly coded state machine.

As long as there is a possibility to use a blocking mechanism such as
a pipe, a thread might be the best solution.  However, it is a bit
overkill, since we really do not need a full stack

Granted, there is no such thing as a "full stack" in C : each thread
is always just a state machine because the stacks are always finite.
But the point is about program structure, i.e. what can we do when we
know exactly the nesting dept at compile time?

So, we'd like to use a state machine.

Trouble with state machines is that they are hard to code, because
they often look like "manually inverted, nested for loops".

However, this "inversion" can be done automatically.

This is what most "coroutine" libraries in C are about: clever tricks
to "jump into the body of a function".

So what is the transformation?

- Move all the state variables into an object (i.e. no local
  variables!  either static variables or an explicit context struct)

- Abstract the "blocking points", i.e. the part around the "read"
  functions as a sequence:
  - save state
  - return to "OS"

- The "OS" then performs the read, i.e. stores memory into a buffer,
  or performs an externally blocking operation, and calls the function
  again with a block point.

Something like this:

struct vars {
    /* Continuation */
    void *buf;
    void size;
    void **next;

    /* Program variable context */
    int i, i_max;
    int j, j_max;

void parse(struct vars *x) {
    if (x->next) goto *(x->next);
    x->buf  = &x->i_max;
    x->size = sizeof(x->imax);
    x->next = &&read_1;
    for (x->i = 0; x->i_max; x->i++) {
        for (x->j = 0; x->j_max; x->j++) {
        // ...

The lucky thing is that the storage of the next state as a computed
goto is immediately followed by the label.  This means that no special
tricks are necessary to use the limited C preprocessor to generate the
following "call sequence"

          x->buf  = &buf;
          x->size = sizeof(buf);
          x->next = &&resume_point;

The __COUNTER__ token can be used to generate the labels.

Ok, so I've implemented it and put it online[1].

First comment I get is: "Tricky, powerful & dangerous." :)

Some people really don't like this.  Preferring "stupid" code to make
maintainability easier.  Fair point.  Still a neat hack though, if
used with the appropriate caution.

[1] http://zwizwa.be/git/shaco