Sat May 8 11:13:15 EDT 2010

Data Parallel Haskell

Simon Peyton Jones on Data Parallel Haskell[1].  See paper[2][3]:
Harnessing the Multicores: Nested Data Parallelism in Haskell, Simon
Peyton Jones, Roman Leshchinskiy, Gabriele Keller, Manuel MT
Chakravarty, Foundations of Software Technology and Theoretical
Computer Science (FSTTCS'08), Bangalore, December 2008.

  - 1000s of processors: you need data parallelism
  - flat data parallelism is not enough
  - nested data parallelism covers a larger set of algorithms

Key problem: handling nested data-parallel algorithms.
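A standard example of such an algorithm from the DPH literature is sparse matrix-vector multiplication.  A sketch using plain lists (list comprehensions standing in for DPH's parallel-array comprehensions):

```haskell
-- Sparse matrix-vector multiply: each row holds only its non-zero
-- entries, so rows have irregular lengths and the inner sums run over
-- differently sized collections -- exactly the nested-parallel case.
type SparseRow = [(Int, Double)]  -- (column index, value) pairs

smvm :: [SparseRow] -> [Double] -> [Double]
smvm rows v = [ sum [ x * (v !! i) | (i, x) <- row ] | row <- rows ]
```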

Key insight: if you've got the lifted version f^ (a vectorized version
of a function f, f^ = map f), you can implement the doubly lifted
version f^^ = map f^ in terms of it.  The basic idea is:

               f^^ = unconcat . f^ . concat

This is flattening.
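A minimal sketch of the idea using ordinary lists in place of DPH's parallel arrays (the names `lifted', `doublyLifted' and this `unconcat' are illustrative, not the actual library API; here `unconcat' takes the original nested list as a shape template):

```haskell
-- the lifted version f^ of a function f is just "map f"
lifted :: (a -> b) -> ([a] -> [b])
lifted f = map f

-- re-nest a flat list according to the shape of a template nested list
unconcat :: [[a]] -> [b] -> [[b]]
unconcat []       _  = []
unconcat (xs:xss) ys = front : unconcat xss rest
  where (front, rest) = splitAt (length xs) ys

-- the doubly lifted version f^^ = map f^, implemented by flattening:
-- concatenate, apply the singly lifted f once, then restore the nesting
doublyLifted :: (a -> b) -> ([[a]] -> [[b]])
doublyLifted f xss = unconcat xss (lifted f (concat xss))
```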

In practice, the `unconcat' and `concat' functions do not generate
intermediate structure; they can be implemented in constant time,
without copying.
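A sketch of why (my own illustration, not the DPH internals): a nested array can be stored as a flat data vector plus a segment descriptor recording the length of each subarray.

```haskell
-- nested array = segment descriptor + contiguous flat data
data Nested a = Nested
  { segLens  :: [Int]  -- length of each segment
  , flatData :: [a]    -- all elements, stored contiguously
  } deriving (Show, Eq)

-- `concat' just forgets the descriptor: no elements are copied
concatN :: Nested a -> [a]
concatN = flatData

-- `unconcat' reattaches a descriptor to a flat result: again no copying
unconcatN :: Nested a -> [b] -> Nested b
unconcatN (Nested lens _) ys = Nested lens ys
```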

The representation of an array needs to depend on the type of its
elements: data families.  This is also where the fast `concat' comes
from, since the _representation_ of a nested array is already
concatenated.
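A sketch of such type-indexed representations (modeled loosely on DPH's PData/PArray types; the details here are illustrative):

```haskell
{-# LANGUAGE TypeFamilies #-}

-- the representation of element data is chosen by the element type
data family PData a

-- Int arrays: a flat vector of ints
newtype instance PData Int = PInt [Int] deriving (Show, Eq)

-- arrays of pairs: a pair of arrays (struct-of-arrays),
-- so no per-element boxing is needed
data instance PData (a, b) = PPair (PData a) (PData b)

-- a nested array pairs segment lengths with flat element data;
-- the fast `concat' simply extracts the already-flat data
data PArray a = PArray [Int] (PData a)

concatP :: PArray a -> PData a
concatP (PArray _ d) = d
```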

For higher-order functions, defunctionalization is used: each
closure's environment can be represented as a tuple, which reduces
arrays of functions to the data case.
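A sketch of defunctionalization (the constructor names are made up): each lambda in the source program becomes a data constructor carrying its free variables -- its environment -- and a single `apply' function interprets them, so an array of closures becomes an array of plain data.

```haskell
-- one constructor per lambda, carrying that lambda's environment
data Fun
  = AddConst Int   -- represents \x -> x + n; environment is n
  | Scale    Int   -- represents \x -> x * k; environment is k
  deriving Show

-- a single interpreter replaces first-class function application
apply :: Fun -> Int -> Int
apply (AddConst n) x = x + n
apply (Scale    k) x = x * k

-- applying an array of closures elementwise is now ordinary zipping
-- over two plain data arrays
applyAll :: [Fun] -> [Int] -> [Int]
applyAll = zipWith apply
```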

[1] http://www.youtube.com/watch?v=NWSZ4c9yqW8
[2] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/index.htm
[3] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/fsttcs2008.pdf