Fri Nov 30 21:07:31 CET 2007


i had some bottom up code (what can be done efficiently) using
8x8->16 unsigned multiplication and 24bit accumulation. this works
well for rectangular windows, but not so much for non-rectangular.
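
for reference, a rough C sketch of what such an unsigned core could look
like. this is not the actual code (that lives in the forth macros); the
name mac_u8xu8 and the 258-tap limit are assumptions made for the
example. 258 is simply the largest count for which 255*255*n still fits
in 24 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical unsigned core: 8x8->16 multiply, 24-bit accumulate.
   the accumulator is modelled with a uint32_t; the assert keeps the
   total inside 24 bits (255 * 255 * 258 < 2^24). */
static uint32_t mac_u8xu8(const uint8_t *s, const uint8_t *w, size_t n)
{
    assert(n <= 258);
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* each product fits in 16 bits: at most 255 * 255 = 65025 */
        uint16_t p = (uint16_t)((uint16_t)s[i] * w[i]);
        acc += p;
    }
    return acc;
}
```

with a rectangular window all w[i] are the same constant, so the result
is just a scaled sum of the signal.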

maybe rectangular is enough since we don't have interfering signals?
anyway, it might be wise to look at how to do a windowed one..

i guess the idea is like this: give the window a known average (DC)
component, so its contribution can be removed afterwards using maybe a
separate accumulation of the signal.

it doesn't look that hard: ** is inner product

[ s(t) + s_0 ] ** [ w(t) + w_0 ]

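expanding gives s(t) ** w(t) plus 3 correction terms:

   s(t) ** w_0  == 0   (if s(t) is zero-mean, s_0 carries its average)
   w(t) ** s_0  == 0   (if w(t) is zero-mean, w_0 carries its DC)
   w_0 ** s_0

so only the last term survives. it requires the average signal s_0 as
the only variable component, which needs to be scaled with the window
DC component (can be 2^...) and a fixed offset.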
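
a sketch of how the correction could work out in C, under assumptions
that are mine and not from the actual code: signal and window are signed
values in -128..127 stored with an offset of 128 (u = s + 128,
v = w + 128), and the name windowed_dot is made up. then

   s ** w = u ** v - 128 * sum(u) - 128 * sum(v) + n * 128 * 128

where only sum(u) varies with the signal; the last two terms are fixed
for a given window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_SHIFT 7  /* assumed offset 2^7 = 128 for signal and window */

/* signed inner product recovered from offset (unsigned) data */
static int32_t windowed_dot(const uint8_t *u, const uint8_t *v, size_t n)
{
    assert(n <= 258);   /* keeps u ** v inside 24 bits */
    uint32_t uv = 0;    /* u ** v: the unsigned core */
    uint32_t su = 0;    /* sum of u: the separate signal accumulation */
    uint32_t sv = 0;    /* sum of v: constant for a fixed window */
    for (size_t i = 0; i < n; i++) {
        uv += (uint32_t)u[i] * v[i];
        su += u[i];
        sv += v[i];
    }
    /* fixed part: n * 128 * 128 - 128 * sum(v), could be precomputed */
    int32_t fixed = (int32_t)((uint32_t)n << (2 * OFFSET_SHIFT))
                  - (int32_t)(sv << OFFSET_SHIFT);
    /* variable part: the signal sum scaled by the offset (a shift) */
    return (int32_t)uv - (int32_t)(su << OFFSET_SHIFT) + fixed;
}
```

because the offset is a power of two, the variable correction is a shift
of the accumulated signal sum, so the inner loop stays purely unsigned.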

so i can basically use the same unsigned core routine for general
complex FIR filters: renamed the macros to mac-u8xu8.f, and added