Sat Aug 7 08:54:11 CEST 2010

Models of dataflow

I know of 4 different ways of looking at dataflow:

* Functional dataflow (FRP): nodes are functions of time, functionally

* Channels and Processes (CSP): nodes are programs reading from and
  writing to channels.

* Synchronous state space models (SSM): functions transfer (state,input)
  into (next_state,output).

* The Observer pattern: objects get notified of state changes of other
  objects through a notify() method call.

In my current problem, the big issue seems to be to define the meaning
of an "event" and "state".  In FRP an event seems to be more of an
implementation issue; i.e. how to represent the functions that make up
the meaning.  In CSP, an event is very explict: a write operation that
triggers the "unlocking" of its corresponding read.  Each process can
have local state.  In SSMs there are no events, only data changes, but
there is a concept of "current state".  In the observer there are
explicit states and explicit events, but not necessarily parallelism
as in CSP.  The observer pattern can get messy, as it has very little
high level structure apart from message passing.

My question seems to be: Is an "event" necessarily "operational?",
meaning here: is it related to a state transition (or is it a state

On to the practical: I'm implementing a Ractive Network[1] in an
environment that includes stateful objects that follow the Observer
pattern. The approach I use is invalidation + lazy evaluation (I/LE).

  * input: write causes all nodes that recursively depend on the
    written node to be invalidated.

  * output: read recursively (re-)computes all nodes that have been

The main question for my application is: how much do we gain (and
loose) from using a reactive pattern vs. a more low-level and ad-hoc
observer approach?  As hinted in [2], the main issues are algorithm
complexity and granularity.  If the granularity is large, the
algorithm complexity might not really matter.

So..  Can we take the best of both worlds?  How can this be tied into
an Observer pattern in a correct (and efficient) way?

By introducing strict evaluation: a read transaction initiated after
invalidate caused by a write transaction.

In the I/LE implementation we still have events in the true OO sense:
node invalidation.  The trick is to propagate them correctly from
network inputs to network outputs.  Inside the network we have the
benefits of the I/LE model (dependency management + linear evaluation
complexity), outside we have the benefit of Observer: a clear (strict
not lazy) event semantics.

Some more remarks..

  * About Pure Data.  The Pd design has hot/cold inlets to "be done"
    with synchronization problems.  Using it is not always easy (the
    trigger object) but it does have a very simple meaning: a patch is
    a sequential program.

  * Strict semantics seems to mesh a lot better with OO design.  In
    the I/LE model, it seems best to have every write trigger a read,
    such that there is a direct path from write -> event handler.  The
    2-phase algorithm is still useful to avoid exponential complexity,
    but the lazy semantics is too hard to keep right when it's used in
    an imperative environment, by people used to OO programming.

[1] http://en.wikipedia.org/wiki/Reactive_programming
[2] http://en.wikipedia.org/wiki/Reactive_programming#Similarities_with_Observer_pattern