Thu Jun 26 19:22:43 CEST 2008

Snowcrash encoding

The basic algorithm uses a single LFSR (linear feedback shift
register) to construct a 2D grid by:

  * partitioning X and Y state space into 2 separate segments so the
    same difference equation of degree D can be used for both.

  * shifting each column by a constant number N, and performing an XOR
    between X and Y

Let's call the observations G[i], and the hidden sequences X[i] and
Y[i]. These use addressing A(x,y) = A[x+Ny]. Alternatively, primed
array names represent transposed addressing A(x,y) = A'[xN + y].

The equations from LFSR state dependency and output mixing:

X[D+i]  = f( X[i] ... X[D-1+i] )     0 <= i < (N^2 - D)
Y'[D+i] = f( Y'[i] ... Y'[D-1+i] )   0 <= i < (N^2 - D)
G[i]    = X[i] + Y[i]                0 <= i < (N^2)
                                              3N^2 - 2D

The number of unknowns (X[i] and Y[i]) is 2N^2
( The input data consists of N^2 points. )

With independent equations, this means it is soluble if 
2N^2 <= 3N^2 - 2D or

                  D <= N^2 / 2

                  N >= sqrt(2D)

For D=15 -> N=6

I don't see why this should be 7 as in the current implementation.
Maybe it's used to reduce ambiguity? Something 'feels wrong' about
using a shift of 6 and a detection of 7x7 : the sequences don't simply
wrap around any more, which distroys some symmetry.

 Q: is it possible to solve the quations in least-squares form, by
    adding an error term to each?

 Q: is there a way to test orientation properties by using some
    property of the relationship of the LFSR to its inverse?

If the parameter space is known, it can be proven by enumeration that
ambiguity can always be resolved. Maybe that is really a better
approach, because I don't see a way to do it otherwise.