Fri Jul 31 17:19:27 CEST 2009
Delimited Continuations for Primitives
The idea is this: bridging C and Scheme can sometimes be tedious due
to the mismatch of data types. As long as the C side doesn't allocate
anything, the type matching can be largely automated. However, it is
often one wants to return a list of atoms. This requires explicit use
of the scheme list structure and its allocation mechanism inside C
primitives. This hinders automatic impedance matching.
C isn't such a bad language to program in as long as all data can be
allocated as local variables on the call stack. Reifying this as a
data object makes it possible to avoid containers altogether, by
representing them by traversal functions: generators for read-only
data, and zippers for read-write data.
This allows a single point of allocation: representing the C stack
(cursor state) as a data structure.
The problems to be solved seem to be:
- marking the stack frame in memory
- storing it somewhere
I just added the exception mechanism. What about turning this into
Ok. The core scheme is now extended with a 2nd way of aborting the
computation: returning a value after returning from a longjmp(), where
this value could be a representation of a copy of the C stack.
The extension library task.c implements that datastructure.
So, what are the problems?
- C code cannot hold on to Scheme references accross suspension.
References made inside the C context are not accessible to the
garbage collector, meaning that the referred objects might no
longer be alive on resume. This is actually a _good thing_ as it
allows decoupling of the memory models.
- Code needs to be instantiated in the same place for pointers to
make sense. This means it's not possible to use C -> scheme
calls. This problem can be solved by allocating resumable stack
segments on the heap in the first place, so they don't need to be
copied. The advantage is that C code that doesn't use
continuations runs fast.
- Return values are always a pair (ctx . rv) because we can't store
the rv in the ctx since there's no transitive mark().
What does this solve?
- Iteration state doesn't need to be manually wrapped in C or Scheme
objects: one size fits all.
I like this. Too bad it has some problems that can't be managed
automatically. The general idea is quite powerful though. I wonder
if re-arranging things so that this does work makes sense.
One change would be to jump to a C-stack on the heap whenever a
reset() is called. How to do that?
Another thing to investigate: how does tinyscheme use the C-stack?
There's an option that makes it more C-like.. (EDIT: no magic here).
An older version of glibc longjmp can be found here. I'm not sure
if there is a portable way to jump into a piece of malloc() memory.
From it looks like that at least for glibc, the stack pointer is at
a fixed offset in the struct.
#define JB_SP 4
Is this the same for all archs? Maybe look at uclibc too.
Apparently there are standard ways of using malloc() memory as stack
space. However the gnu doc says this is deprecated in posix due to
portability problems, and that posix threads should be used instead.
I don't see why.
So, how to use setjmp/longjmp instead? The trick is in assuming that
stack and heap are in the same address space: simply allocate a large
stack frame that will not be used but brings locals into the heap.
(EDIT: this doesn't seem to work so well since positions of local
variables on the stack are essentially unspecified. Even the
existence of the stack itself is never made explicit in the standard!)
The interface atom looks like this: a function that returns a lazy
list return either nil or a cons of an element and a continuation.
The continuation takes a single argument.
I've added some safety checks that make sure a context can only be
instantiated when it has the same base address.