Sat Mar 27 17:42:24 CET 2010

Register allocation for image processing

Another one of those non-obvious obvious things: register allocation
can also be used to allocate larger buffers, e.g. the intermediate
buffers in tile-based image processing.
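
A minimal sketch of what that could look like, assuming a
linear-scan style allocator (my choice for illustration, nothing
above commits to it).  Each intermediate tile buffer gets a live
interval (step where it is written, step where it is last read), and
a slot is reused as soon as its previous occupant is dead.  The
function name, the stage names and the step numbers are all made up.

```python
def allocate_buffers(intervals):
    """Assign each buffer a slot, reusing slots of dead buffers.

    intervals: dict mapping name -> (first_write, last_read), inclusive.
    Returns (assignment, slot_count).
    """
    assignment = {}
    active = []    # (last_read, slot) of buffers that are still live
    free = []      # slots whose previous occupant is dead
    next_slot = 0

    # Visit buffers in order of first write, as linear scan does.
    for name, (start, end) in sorted(intervals.items(),
                                     key=lambda kv: kv[1][0]):
        # Expire buffers whose last read is strictly before this write.
        # Strict comparison: no in-place operation is assumed, so a
        # stage never writes into a buffer it is reading in that step.
        still_active = []
        for last_read, slot in active:
            if last_read < start:
                free.append(slot)
            else:
                still_active.append((last_read, slot))
        active = still_active

        if free:
            slot = min(free)
            free.remove(slot)
        else:
            slot = next_slot
            next_slot += 1

        assignment[name] = slot
        active.append((end, slot))

    return assignment, next_slot


# Hypothetical four-stage pipeline; each stage reads the previous one.
intervals = {
    "input": (0, 1),
    "blur":  (1, 2),
    "edges": (2, 3),
    "out":   (3, 4),
}
assignment, slots = allocate_buffers(intervals)
print(assignment, slots)
```

Four logical buffers end up sharing two physical ones, which is the
same ping-pong scheme one would write by hand.  Unlike registers,
there is no fixed slot count here, so nothing is ever spilled; a real
version would also have to deal with buffers of different sizes.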

How does this relate to tiles?

What does this have to do with the diminishing returns of fusion,
e.g. when the instruction cache fills up, or there is too much
register spilling in an inner loop?

( I forgot that these memory access patterns are really complicated!
They got me utterly confused before...  Maybe the point in writing DSP
metaprogramming code is to get _them_ under control in a formalism;
i.e. to create an algebra of access patterns. )