Fri Jan 24 07:23:24 EST 2020

Pipeliner: dataflow vs. state machine

So it seems quite obvious.  Pipelining is simple: take a data flow
graph and start adding delays.  It's probably best to formulate it as
a problem that can be solved using an SMT solver.
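
To make "start adding delays" concrete, here is a minimal sketch for
the feed-forward case.  It is not an SMT formulation, just a
longest-path pass over the graph.  The function name, the graph
encoding and the per-node latencies are all assumptions made for
illustration.  A solver would presumably become interesting once extra
constraints (register budget, timing) are added on top of this.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def balance_pipeline(edges, latency):
    """Balance a feed-forward dataflow graph (hypothetical helper).

    edges:   list of (src, dst) pairs
    latency: dict node -> pipeline latency of that node in cycles
    Returns (ready, delays) where ready[n] is the cycle at which the
    output of n is valid, and delays[(src, dst)] is the number of extra
    registers needed on that edge so all inputs of dst arrive together.
    """
    preds = {n: [] for n in latency}
    for src, dst in edges:
        preds[dst].append(src)

    ready = {}
    # static_order() yields every node after all of its predecessors.
    for n in TopologicalSorter(preds).static_order():
        start = max((ready[p] for p in preds[n]), default=0)
        ready[n] = start + latency[n]

    delays = {}
    for src, dst in edges:
        start_dst = ready[dst] - latency[dst]
        delays[(src, dst)] = start_dst - ready[src]
    return ready, delays

# Example: a 2-cycle multiplier feeding a 1-cycle adder that also
# takes input "a" directly.  The direct edge needs 2 delay registers.
edges = [("a", "mul"), ("b", "mul"), ("mul", "add"), ("a", "add")]
latency = {"a": 0, "b": 0, "mul": 2, "add": 1}
ready, delays = balance_pipeline(edges, latency)
```

This only works when the graph has no cycles, which is exactly the
caveat about feedback.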

There is one caveat: feedback.

So for state machines it is really not all that obvious.

It might be good to implement a simple machine like that and see how
it goes.  E.g. a pipelined counter, where the adder is pipelined.
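
A behavioral sketch of such a pipelined counter, as a cycle-level
simulation rather than HDL.  The model is an assumption: the adder is
treated as "compute at entry, then delay by N stages", and a new
addition is only issued once the previous result has been written back
to the state register.

```python
def run_pipelined_counter(n_stages, n_cycles):
    """Simulate a counter whose +1 adder is an n_stages-deep pipeline.

    Returns (state, busy): the final counter value, and the number of
    stage-cycles in which some pipeline stage held valid data.
    """
    state = 0
    pipe = [None] * n_stages  # pipe[i] is the value in stage i, None is a bubble
    pending = False           # True while an addition is in flight
    busy = 0
    for _ in range(n_cycles):
        # Advance the pipeline by one stage.
        out = pipe[-1]
        pipe = [None] + pipe[:-1]
        # A result leaving the last stage updates the state register.
        if out is not None:
            state = out
            pending = False
        # The next addition depends on the new state, so it can only
        # be issued when nothing is in flight.
        if not pending:
            pipe[0] = state + 1
            pending = True
        busy += sum(v is not None for v in pipe)
    return state, busy

# Same number of clock cycles, different pipeline depth.
state1, busy1 = run_pipelined_counter(1, 12)
state3, busy3 = run_pipelined_counter(3, 12)
utilization3 = busy3 / (12 * 3)  # fraction of stage-cycles doing work
```

With 3 stages the counter advances once every 3 cycles, and only one
of the 3 stages holds valid data in any given cycle.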

The ground rule is: a pipelined state machine will have a slower state
update rate.

It also seems that utilization will be low if functional units do not
share a datapath.

Conclusion: pipelined state machines need a programmable datapath.

To reiterate: this is an important insight.  It is easy to see when
the two regimes are put next to each other:

- Feed-forward dataflow can be pipelined without affecting throughput:
  all function segments are utilized at max rate, at the expense of
  some I/O delay, which is often not an issue.

- If such a functional network is used as the update function of a
  state machine, the delay is very significant: for an N-stage
  pipeline, the utilization is only 1/N.  In this case it is probably
  more effective to collapse the different pipeline stages into one
  programmable datapath.  The total machine's update rate will still
  be N times slower, but silicon usage might be better due to reuse.
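
A sketch of what "collapsing the stages into one programmable
datapath" could look like: a single shared ALU steps through a small
program, one instruction per cycle, and the state register is written
when the program wraps around.  The 3-step update function, the
instruction encoding and all names are made up for illustration.

```python
# Hypothetical update function: state -> ((state + 1) * 3) & 0xFF,
# written as one instruction per former pipeline stage.
PROGRAM = [
    ("add", 1),
    ("mul", 3),
    ("and", 0xFF),
]

def alu(op, a, b):
    """The single shared functional unit."""
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    if op == "and":
        return a & b
    raise ValueError(op)

def run_microcoded(program, state, n_cycles):
    """Run the datapath for n_cycles, one instruction per cycle.

    The accumulator carries the intermediate value between steps.  The
    state register is only committed when the whole program has
    completed, i.e. once every len(program) cycles.
    """
    acc = state
    pc = 0
    for _ in range(n_cycles):
        op, operand = program[pc]
        acc = alu(op, acc, operand)
        pc += 1
        if pc == len(program):
            state = acc  # commit the new state
            pc = 0
    return state

after_one_update = run_microcoded(PROGRAM, 0, 3)
after_two_updates = run_microcoded(PROGRAM, 0, 6)
```

The update rate is the same as for the 3-stage pipelined version (one
state update per 3 cycles), but the one ALU is busy every cycle
instead of three units each being busy a third of the time.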

A CPU is the natural end stage of a pipelined state machine, where we
create a programmable datapath that is universal.