Wed Jun 18 12:39:51 CEST 2008

about deforestation

Maybe his earlier work on lists is better suited to translate to my
problem: folding combinators for array processing.

I'm facing a gap between my understanding of theory, and a particular
practical problem. I should just try to solve one such problem with
higher order combinators to get a better view about concrete problems,
to get out of the muck of abstract confusion..

The problem is really one of datastructures and their iterators. In
FP, there are only nested arrays and some map and shift operators.

So.. Deforestation is about eliminating intermediate nested data
structures in a first order language with recursive pattern matching
for tree deconstruction.

Wadler defines the property 'treeless', which is used to construct
transformation rules to transform a composition of treeless functions
into a treeless function.

Definition: a term is treeless with respect to a set F of function
names if:

  * it is linear (each variable is used only once. this is to make
    sure transformations don't introduce redundant computations, and
    can be relaxed for integers)

  * it only contains functions in F (these are the 'exceptions' that
    will be expanded)

  * every argument of a function application and every selector in a
    case term is a variable. (obviously, otherwise there would be an
    intermediate tree result)

The algorithm that performs deforestation maps a linear term which
contains variables and functions with treeless definitions to a
treeless term and a possibly empty set of treeless definitions.

The core of the algorithm is the standard function 'inlining': replace
each function application with an inlined definition.

If i start to think from that point, shouldn't i get to something
simple? After all, i have no variable names to worry about, and no
destructuring or creation of run-time data structures. That is all
quite different.

So, question: what is the equivalent of deforestation in a
concatenative language with a simply managed array data structure and
a 'map' operator?

Let's write down some things that need to 'actually happen'

  * loop transformation to eliminate intermediate buffers
  * array memory management and reuse (linearity?)
  * dereferencing indirect addressing (on PIC18)

Dereferencing indirect addressing is a particular problem i ran into
writing highly specialized DSP code for microcontroller. It's a fairly
extreme level of templating which makes sense due to very limited
indirect addressing and multiplication on PIC18.

There is a difference beteween translating 'map f l' to (cons (f l1)
...) and making sure the same operation happens in place, or with a
fast reuse array.

There's something in Wadler's paper about
instantiate/unfold/simplify/fold, found in Burstall and Darlington: "A
transformation system for developing recursive programs."