The big lesson from implementing this in C is that the inside of a task really doesn't matter. What matters is the messages and the synchronization. The rendezvous works particularly well in hardware. I still miss a good way to go from parallel to sequential. The core idea is to parameterize a "CPU stripper".
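To illustrate why the task body is beside the point, here is a minimal sketch of a rendezvous in C. It assumes POSIX threads and exactly one sender paired with one receiver. The names `rendezvous_t`, `rv_send`, `rv_recv` and `rv_demo` are made up for this example and are not taken from the implementation discussed above. The sender blocks until the receiver has actually taken the value, so all coordination lives in the message exchange.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical unbuffered channel for ONE sender and ONE receiver.
   Assumption: POSIX threads; this is a sketch, not a general N:M channel. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int  value;
    bool full;   /* true while a value is waiting to be picked up */
} rendezvous_t;

/* Deposit a value and block until the receiver has taken it. */
void rv_send(rendezvous_t *rv, int v)
{
    pthread_mutex_lock(&rv->lock);
    rv->value = v;
    rv->full = true;
    pthread_cond_broadcast(&rv->cond);      /* wake a waiting receiver */
    while (rv->full)                        /* wait for the hand-over */
        pthread_cond_wait(&rv->cond, &rv->lock);
    pthread_mutex_unlock(&rv->lock);
}

/* Block until a value is available, take it, and release the sender. */
int rv_recv(rendezvous_t *rv)
{
    pthread_mutex_lock(&rv->lock);
    while (!rv->full)
        pthread_cond_wait(&rv->cond, &rv->lock);
    int v = rv->value;
    rv->full = false;
    pthread_cond_broadcast(&rv->cond);      /* wake the blocked sender */
    pthread_mutex_unlock(&rv->lock);
    return v;
}

/* The "task": its inside is trivial, only the messages matter. */
static void *producer(void *arg)
{
    rendezvous_t *rv = arg;
    for (int i = 1; i <= 3; i++)
        rv_send(rv, i);
    return NULL;
}

/* Start one producer, receive its three messages, return their sum. */
int rv_demo(void)
{
    rendezvous_t rv = { PTHREAD_MUTEX_INITIALIZER,
                        PTHREAD_COND_INITIALIZER, 0, false };
    pthread_t t;
    int rc = pthread_create(&t, NULL, producer, &rv);
    assert(rc == 0);
    (void)rc;                               /* silence warning under NDEBUG */

    int sum = 0;
    for (int i = 0; i < 3; i++)
        sum += rv_recv(&rv);

    pthread_join(t, NULL);
    return sum;
}
```

Calling `rv_demo()` from a `main` and building with `-pthread` should be enough to try it. Supporting several senders or receivers would need more state than the single `full` flag used here.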