Sun Jan 27 20:48:25 CET 2013

Duplicate circuit

The `feedback' thing is for later.  A more pressing need is a
`duplicate' function, where a piece of I/O processor is duplicated,
turning each (parameter) input into a vector, or implemented as a list
of parameters (this is a final binding issues, not a core issue).

The core issues are:

- How to syntactically specify duplication.

- How to implement?

The main problem seems to be: how to specify which inputs should be
duplicated, and which not?

At the lowest level, this translates to lifting the primitives.  When
a region can be isolated well enough such that the compiler can insert
a loop context, it creates the freedom to do it as such, and the core
language doesn't need to care, i.e. it becomes a type issue.

If it is a type issue, there is only one place where this needs to be
specified: the summation operator!

Let's see what this looks like.

The sum operator should be a bus operator, i.e. multiple values are
summed individually.

Practically: take out `in' and replace it with `bus', since it seems
to be the same kind of functionality.

   (lambda (freq)
     (bus (phasor freq 0 1)))))

This comes out:

 (lambda (v0 freq)
   (let*-values (((v1) (p_add v0 freq))
                 ((v2) (p_wrap v1 0 1))
                 ((v3) (p_sum v0)))
     (values v2 v3)))
 '((v3 . a) (v0 B a) (freq B a) (v2 B a) (v1 B a))) 

All variables are typed (B a) exept for v3 which is a.

This is correct, but it is too flat.  All "bussed" operations are
mixed with scalar ones.

However... it can be kept as it's not totally bonkers.

An advantage is that just expanding the "vector" primitives into a
sequence of directly bound scalar primitives breaks data dependency
and exposes SIMD parallellism[1].

If I rush over this and implement it in the backend using "lifted
primives", and add the same lifting operation in the Scheme
primitives, how hard will it be to do it properly and expose the loop

I.e. (bus (fn A1 A2)) would be something like 

(bus ((a1 (in A2))
      (b1 (in B2)))
  (fn a1 b1))

which will clearly delineate the "plug point" to insert an iteration

Yeah I don't feel good about using such a naked hack that's hard to
recover..  It can be done syntactically by separating out the free
references and adding a barrier.  However, that could be used as a
shorthand later.  For now it's probably best to start with an explicit
binding form.

Let's write it in terms of map and fold/s 

  (fold/s + (map (lambda (a1 a2) (fn a1 a2)) A1 A2))

Here the "/s" for a symmetric fold t t -> t that doesn't need an
initializer.  There is little need for an a-symmetric fold in

Chances are that folding these two operations into a single fold-map
is better.

But how to represent?

The types are already ok.  What is needed is a little syntactic marker
that a loop is being performed.  The types of the operands inside the
loop are known, so it is known which need to be indexed and which not,
but of course the sub-structure is only visible inside the loop.

(let* ((v1 ...)
       (v2 ...)
       (v3 ...)
       (v4 (map/fold + 
             (let* ((v5 ...)
                    (v6 ...))
                 (values ...))))
       (v7 ...))
    (values ...))

For C/Pd representation, the loop can just be unrolled and the
parameters flattened (individually named).  The Scheme representation
can use proper deep data structures.

So we're back to square one: this needs a local loop structure.  The
thing is that implementing this special case maybe allows later
generalization to a single siso loop structure.  map/fold is sis (no
output, just state).

So maybe we should just do it correctly.  

- Loops are a nesting structure that can be inserted in a normal SSA

- The variables defined inside such a nested structure are local to
  the loop, but they can be bound to external outputs.

- Two syntactic tricks are necessary: some kind of locality to delimit
  the iteration body, and a "blessing" of some variables as composite, i.e.

  (loop (X Y Z) (fn X Y Z a b c))

  -> here the capitals are blessed as streams (will be indexed inside
     the loop) and the lower caps are constants.

- The only thing that can leave such a nested structure is the output
  written during a loop ( stream ), and the final state ( scalar ).
  I.e. a loop returns a number of scalar and stream variables: t t t
  (B t) (B t) ( for summation, this is just the state variable. )

[1] http://gcc.gnu.org/projects/tree-ssa/vectorization.html