Thu Jun 21 21:38:45 CEST 2007


the first thing to do is to create a generic unsigned multiplication,
and derive the other muls from that.
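
a minimal sketch of what "derive the other muls" could look like for the signed case, in python rather than forth. the names umul16 and smul16 are made up for illustration, and umul16 is just a stand-in for the generic unsigned routine. the idea: multiply the raw 16 bit patterns as unsigned, then subtract the other operand (shifted up 16 bits) once for each negative input.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def umul16(x, y):
    """unsigned 16x16 -> 32 multiply (stand-in for the generic routine)."""
    return ((x & MASK16) * (y & MASK16)) & MASK32

def smul16(x, y):
    """signed 16x16 -> 32 multiply built on umul16.

    x and y are ints in the range -32768 .. 32767.
    """
    ux, uy = x & MASK16, y & MASK16   # two's complement bit patterns
    p = umul16(ux, uy)
    if x < 0:
        p -= uy << 16                 # correction for x's sign bit
    if y < 0:
        p -= ux << 16                 # correction for y's sign bit
    p &= MASK32
    # reinterpret the 32 bit pattern as a signed number
    return p - (1 << 32) if p & 0x80000000 else p
```

the low 16 bits of the result never see the corrections, so a 16 bit signed multiply can reuse the unsigned low word unchanged.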

let's call 'z' an 8-bit shift (multiplication by 256)

we need to compute (x0 + x1 z) (y0 + y1 z)
all coefficients are 0 - 255

this gives

z^0   x0 y0
z^1   x1 y0
      x0 y1
z^2   x1 y1

the lowest of the 4 result bytes is unaffected by the 3 bottom products (x1 y0, x0 y1, x1 y1)
the second of the 4 is unaffected by the highest-order product (x1 y1)
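
the table above can be sketched as follows, in python rather than forth. the function name umul16_parts and the variable names are made up for illustration. it forms the four 8x8 partial products and adds them up one result byte at a time, so it is visible which products reach which byte.

```python
def umul16_parts(x, y):
    """unsigned 16x16 -> 32 multiply from four 8x8 products.

    returns the 4 result bytes, lowest first.
    """
    x0, x1 = x & 0xFF, (x >> 8) & 0xFF
    y0, y1 = y & 0xFF, (y >> 8) & 0xFF

    p00 = x0 * y0   # z^0
    p10 = x1 * y0   # z^1
    p01 = x0 * y1   # z^1
    p11 = x1 * y1   # z^2

    # byte 0: only the low byte of x0 y0
    r0 = p00 & 0xFF
    # byte 1: high byte of x0 y0 plus the low bytes of both z^1 terms
    t = (p00 >> 8) + (p10 & 0xFF) + (p01 & 0xFF)
    r1 = t & 0xFF
    # byte 2: carry, high bytes of the z^1 terms, low byte of x1 y1
    t = (t >> 8) + (p10 >> 8) + (p01 >> 8) + (p11 & 0xFF)
    r2 = t & 0xFF
    # byte 3: carry plus the high byte of x1 y1
    t = (t >> 8) + (p11 >> 8)
    r3 = t & 0xFF
    return [r0, r1, r2, r3]
```

note that x1 y1 first shows up in byte 2, and byte 0 depends on x0 y0 alone.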

so, i'd like to do this
- fast
- functional

so no temp variables

the variables are presented as

  x0 x1 y0 y1

every number is used twice: each of the four bytes appears in two of the partial products

now the juggling

done: i gave up on not using ram. it's probably possible to use just
the stacks, but it's really inconvenient due to the 'convolutive'
nature of multiplication. what i mean is: multiplication has all-to-all
data dependencies, and is not easily serialized. if it is
serialized, it needs random access (variable names) or at least
relative indexing. forth is not good at that.
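
to illustrate the indexing point, here is a sketch of the serialized version in python, not the forth code itself. the name umul_bytes and the ram buffer acc are made up for illustration. every xs[i] meets every ys[j], and each product lands at position i + j of the buffer, which is exactly the relative indexing that is awkward on a stack.

```python
def umul_bytes(xs, ys):
    """schoolbook multiply of two little-endian byte lists.

    returns len(xs) + len(ys) result bytes, lowest first.
    """
    acc = [0] * (len(xs) + len(ys))      # the ram buffer
    for i, xi in enumerate(xs):
        carry = 0
        for j, yj in enumerate(ys):
            # all-to-all: xi * yj accumulates at position i + j
            t = acc[i + j] + xi * yj + carry
            acc[i + j] = t & 0xFF
            carry = t >> 8
        # this slot has not been written by earlier rows
        acc[i + len(ys)] = carry
    return acc
```

for the 16 bit case, umul_bytes([x0, x1], [y0, y1]) gives the 4 result bytes.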