[<<][rtl][>>][..]Sat May 4 08:09:59 EDT 2019

So the idea is to embrace the array nature and allow for "full history" in the loops, then optimize out:
- intermediate arrays that are only used element-wise
- triangles that can be translated to accumulators

Do this in two passes: one that generates only the normal form, and one that performs the optimizations. This means the zipfold form has to change: back-references of outputs are allowed, and full references of inputs are allowed as well.

So the language I'm writing is an array language with "triangle feedback" of the outputs. This way there needs to be no distinction between accumulators and outputs, as each output can be used as an accumulator, as long as the index referencing stays within the triangle.

It seems that representing this at the Term level would be straightforward. It needs only arrays and loops that range over an index. The output of a single iteration is a collection of scalars. The input is a collection of arrays:
- Some of these are original inputs, and are full scale.
- Some are outputs computed in the previous iteration.

So this needs a representation of an array, probably as a function. Some cases:
- one loop incrementally computes an array
- another loop can then be run to perform a computation on that whole array

So it can't be just "global" variables; scope really needs to be local. Storage can be reused, though, as long as outputs are not propagated. This isn't a simple problem, but it is good to realize that a more general view (triangles instead of accumulators) is better.

So again:
- at any "step", you can compute one scalar value from a number of other scalar values (n-op)
- the place where this value is allocated, i.e. the hole, is what we are managing
- at each "step", we also know exactly the valid range of the input arrays
- based on context, holes can be re-used in subsequent iterations, but this is an optimization: in a first iteration, allocate everything for single-assignment.
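To make the triangle/accumulator relation concrete, here is a minimal sketch in Python (the function names are hypothetical, not part of the language being designed): the normal form keeps the full output history, and step i may read any of ys[0..i-1] (the triangle); when the only back-reference is to the immediately preceding element, the history collapses into a scalar accumulator.

```python
def prefix_sum_triangle(xs):
    # Normal form: keep full history. Step i may reference
    # ys[0..i-1], i.e. any index inside the triangle.
    ys = []
    for i in range(len(xs)):
        prev = ys[i - 1] if i > 0 else 0   # reference stays within the triangle
        ys.append(prev + xs[i])
    return ys

def prefix_sum_accumulator(xs):
    # Optimized form: only ys[i-1] is ever referenced, so the
    # whole history projects down to a single scalar accumulator.
    acc = 0
    ys = []
    for x in xs:
        acc = acc + x
        ys.append(acc)
    return ys
```

Both compute the same output array; the point is that the accumulator version is derivable from the triangle version by a storage optimization, not a semantic change.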
Then later on, "project" the variables, where each projection eliminates a dimension.

- start with a setup where all loops have a fixed size
- then generalize to computed fixed sizes (e.g. allocate intermediates on the stack)
- then generalize to data-dependent iteration sizes with just an upper bound on the allocation

The idea is still the same as RAI, only RAI was too eager with optimizing out intermediates.

Representing this is going to be a challenge. Term can probably be used because it has parameterized nodes. What we add is:
- loop variable contexts
- context-dependent nodes for value binding
- primitives for dereferencing

So, where to start? Structurally, the first thing that needs to happen is to introduce context nesting. Once that is possible, the rest will become clear.
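A rough sketch of what such a representation might look like, assuming nothing about the actual Term implementation (all names here are hypothetical): loops introduce an index context, bindings are n-ops over indexed references, and an intermediate's loop dimension can be projected away exactly when every reference to it uses the current index, i.e. no back-reference forces history to survive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    name: str
    idx: tuple      # index expressions, e.g. ('i',) or ('i-1',)

@dataclass(frozen=True)
class Bind:
    name: str       # hole being filled (single-assignment)
    op: str         # primitive n-op, e.g. 'add', 'mul'
    args: tuple     # Ref arguments

@dataclass(frozen=True)
class Loop:
    var: str        # loop index variable (context nesting)
    size: int       # fixed size first; generalize later
    body: tuple     # Bind or nested Loop nodes

def uses_of(name, body):
    # Collect every reference to 'name' in a loop body.
    for node in body:
        if isinstance(node, Bind):
            for r in node.args:
                if r.name == name:
                    yield r
        elif isinstance(node, Loop):
            yield from uses_of(name, node.body)

def can_project(name, loop):
    # The loop dimension of an intermediate can be eliminated when
    # every reference uses exactly the current index: element-wise
    # use means only one hole is live per iteration.
    return all(r.idx == (loop.var,) for r in uses_of(name, loop.body))
```

For example, with `t[i] = x[i]*x[i]` and `y[i] = t[i] + y[i-1]`, `t` is only used at the current index and can be projected to a scalar hole, while `y` is back-referenced and must keep (at least part of) its history.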
