DSP related applied math ramblings. This is a collection of misc notes about applied math, algorithms and circuits, somewhat related to programs I'm writing and projects I'm working on, but deserving a place of their own. For more http://zwizwa.be/ramblings

Entry: Killing Chipmunks with Parametric Dynamic Wavetable Synthesis
Date: Mon Aug 4 11:55:00 CEST 2003
Type: tex

\begin{abstract}
In this paper we discuss some sound synthesis methods in the dynamic wavetable framework. We will take a closer look at approaches that combat problems regarding aliasing and pitch dependent formant locations, two of the major drawbacks of dynamic wavetable synthesis (DWS). DWS is a hybrid form of wavetable synthesis and additive/granular synthesis. We will limit ourselves to ``holistic'' DWS, which uses (semi) parametric methods to cancel pitch dependent formant location, as an alternative to basic additive or granular synthesis for signals with a clear single pitch.
\end{abstract}

\section{Introduction}

Wavetable synthesis is one of the first digital sound synthesis methods developed [refs]. The method stores a period of a sound wave in a table and plays it back at a variable pitch. The data in the table determines the timbre of the resulting waveform.

A drawback of this form of synthesis is the fact that formant frequencies of the resulting waveform are proportional to the playback frequency. This is frequently called the ``chipmunk effect'' and is usually undesirable. In commercial synthesizers this artifact is reduced by using several wavetables for some pitch ranges and interpolating between them. This is referred to as multitimbral synthesis.

Another drawback of wavetable synthesis is aliasing in the case of polynomial interpolation. This is solved in practice by using a decimating filter. This is also called bandlimited interpolation. This usually requires the storage of a large set of interpolation coefficients with different cutoff frequencies, since it is hard or impossible to do this in a parametric way.

Dynamic wavetable synthesis is a derivative of wavetable synthesis, with the addition that the wavetable is synthesized using some (parametric) method, often updated at a haptic rate of around 20 Hz. This enables the construction of dynamic timbres. However, being a table playback method, the chipmunk effect is also present in this method, and aliasing needs to be taken into account when the table is read out with a step larger than the spacing between table samples.

In this paper we will illustrate some applications of DWS, with special attention to the chipmunk effect and aliasing. The first method deals with explicit additive synthesis using the fast Fourier transform. In short, we synthesize the spectrum of the periodic waveform directly, and introduce pitch by resampling the waveform on playback. When the pitch of the playback is known (e.g. by lookahead) the wavetable can be made explicitly bandlimited. If dilatation and contraction of the spectrum are parametric, we can use the playback pitch information to perform a formant correction before the waveform is transformed to the time domain. This approach has some advantages over synthesizing sinusoidal contributions with standard, direct FFT approaches such as $\text{FFT}^{-1}$ [2], especially for nearly periodic signals.

The second topic in this paper is an illustration of the general framework of scanned synthesis using dynamic wavetables. The idea is to derive a waveform, or timbre, from a (nonlinear) dynamic system.
The holy grail here is to find some way to change the formant spectrum of a wavetable, depending on the desired playback pitch. We will pursue this using circulant neural networks, as a generalization of a linear circular time invariant system.

\section{DWS algorithm}

\subsection{DWS}

The algorithm is very simple. The wave table is a vector function $x[k] \in \mathbb{R}^N$, with a sampling rate $f_c$ which is a fraction of the system sample rate $f_s$. Sound is synthesized by interpolating the $x[k]$ at a desired pitch. To prevent discontinuities when the table is updated, the readout can be made to interpolate between $x[k-1]$ and $x[k]$.

The choice of $N$ depends on the desired frequency resolution. When $N=2048$ and $f_s = 1024 f_c$, with $f_s$ in the order of $40$ kHz, the entire audible spectrum can be represented, together with the interesting haptic frequencies. A choice of sampling subdivision and table length of $1024$ is usually more than enough for practical purposes.

The computational cost in its basic form is relatively low. The playback resampling can be done using simple polynomial interpolation. Using more expensive methods like bandlimited interpolation is also possible; however, for some uses (FDDWS) playback aliasing can be cancelled explicitly, so an expensive decimating filter is not necessary. Usually the cost of synthesizing the wavetable (using convolution methods) can be significantly reduced by using an FFT.

\subsection{FDDWS}

Since working in the frequency domain has a more intuitive feel, we will give the method where the table is transformed using an IFFT before resampling playback a different name: frequency domain dynamic wavetable synthesis (FDDWS).

One of the advantages over direct additive synthesis methods is the elegance of the scheme. We don't have to care about things like window length and spacing, and window transfer function, which are serious drawbacks of standard, ad-hoc frequency domain processing methods. Since elegance is not necessarily an argument in engineering, we claim the real advantage is simplicity of the parametric synthesis phase, which will ease implementation and save computational cost when the expense of the spectrum synthesis far outweighs the cost of the resampling involved.

When employing frequency domain DWS, anti-aliasing is a trivial multiplication before performing an IFFT. The only real problem that needs to be taken into account is to make the spectrum ``stretchable''. When killing the chipmunks is not necessary, since the effect might be desired in some cases, this step can simply be left out. In short, by introducing separate pitch and formant controls, a lot is possible. Combatting chipmunks with FDDWS is thus creative use of time/frequency contraction and dilatation.

Requiring parametric spectrum dilatation or contraction might seem a serious disadvantage. We can also look at it as a trade-off for the added simplicity of the frequency domain processing. Since the transform lengths are fixed we are freed of window juggling. We hope the illustrations in the next sections will clarify this simplicity.

\subsection{Time-Frequency Duality}

All in all, DWS enables us to do pitch relative processing using the entire DSP toolbox in a simplified form, because the data size remains constant. The slight increase in cost and the straightforward trade-offs involving aliasing and formant locations give us a very flexible and intuitive framework as a return on investment.
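As a concrete reference for the table readout described in the DWS subsection above, here is a minimal sketch. It assumes linear interpolation for the playback resampling and a linear crossfade from $x[k-1]$ to $x[k]$ over one output block; the function name, the numpy based implementation and all constants are illustrative assumptions, not taken from any existing implementation.

\begin{verbatim}
# Minimal DWS readout sketch (assumed names; linear interpolation).
# table_prev, table_cur : the wavetables x[k-1] and x[k], length N.
# pitch                 : playback frequency in periods per output sample.
import numpy as np

def dws_scan(table_prev, table_cur, pitch, n_out, phase=0.0):
    N = len(table_cur)
    out = np.empty(n_out)
    for i in range(n_out):
        pos = phase * N                  # fractional table position
        i0 = int(pos) % N
        i1 = (i0 + 1) % N                # wrap around: table holds one period
        frac = pos - int(pos)
        a = table_prev[i0] * (1 - frac) + table_prev[i1] * frac
        b = table_cur[i0]  * (1 - frac) + table_cur[i1]  * frac
        w = i / n_out                    # crossfade x[k-1] -> x[k] over the block
        out[i] = (1 - w) * a + w * b
        phase = (phase + pitch) % 1.0
    return out, phase
\end{verbatim}

Scanning a 440 Hz tone at $f_s = 44100$ Hz would use pitch = 440/44100; successive blocks reuse the returned phase and shift in a new table at the haptic rate.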
When working in the complex field, we have a nice dual representation that allows us to switch between time and frequency representation without too much hassle, and interchange the dual operations of circular convolution and ring modulation.

\section{Parametric Formant Preserving Wavetable Synthesis}

We will illustrate this principle by explicitly synthesizing ``constant-Q'' formants.

\section{Dynamical Systems}

This section deals with the synthesis of tables using a parametric (nonlinear) dynamical system. This is in fact equivalent to Scanned Synthesis, pioneered by Max Mathews et al. [ref]. In most cases we are interested in a periodic table without discontinuities. We could define the table on the complex unit circle, even in continuous form to enable some analytic manipulation using e.g. orthogonal polynomials or fractional linear maps, but we will leave that route for later. There are other ways to get to periodicity.

We take the wavetable to be $x \in \mathbb{R}^N$. The system is defined as an iterated map
\begin{equation}
x[n+1] = f(x[n]).
\label{ifs}
\end{equation}
We will also extend this equation to include an input, in the form of $x[n+1] = f(x[n] + u[n])$ or the more general $x[n+1] = f(x[n], u[n])$. For simplicity the output (the wavetable) is taken to be the state vector of the system.

\subsection{Linear Systems}

When we limit ourselves to linear maps $f = A \in \mathbb{R}^{N \times N}$ we get a ``standard'' linear state space system [1]. % dynwav paper
A way to introduce circular symmetry in $f$ is to make $A$ circulant. Application of $A$ can then be computed in the frequency domain since it amounts to circular convolution. Working in the frequency domain will also enable us to interpret the working of the network to a large extent, since the FFT (denoted further as the matrix operator $F$) of the wavetable $Fx$ has a very intuitive interpretation. The operator $A' = FAF^{-1}$ is diagonal, containing the spectrum of this filter. Requiring circular symmetry thus brings us to FDDWS.

For the linear case (\ref{ifs}) reduces to
\begin{equation}
x[n+1] = A x[n].
\label{lifs}
\end{equation}
With $A$ circulant, $x' = Fx$ and the diagonal decomposition $C = FAF^{-1}$ we have a set of decoupled linear systems
\begin{equation}
x'_i[n+1] = c_i x'_i[n].
\label{dlifs}
\end{equation}
With $c_i$ an eigenvalue of $A$ corresponding to one of the eigenvectors of $A$ (an FFT basis function). Because we defined $A$ to be real valued, each $c_i$ comes paired with its complex conjugate $\bar c_i$. We now have the choice to work with the complex diagonal form directly, or to replace the eigenvalue decomposition $C = FAF^{-1}$ by its real block-diagonal decomposition.

The effect of the system is to produce a wavetable composed of a set of harmonic sinusoids, whose phases and amplitudes evolve as damped complex exponentials. Scanning $x$ will approximately produce a set of near harmonically spaced damped sinusoids. The damping is governed by the magnitude of the $c_i$ and the frequency shift away from harmonicity is determined by the phase of the $c_i$.

What we are interested in is to make the magnitude of the frequency spectrum of $x$ (the magnitude of $x'$) parametrizable, so we can cancel out the formant shift introduced by table readout. This would turn the DWS method into a direct, fully separated pitch and timbre synthesis method. This can be done by changing the time scale of the $x_i$ and $c_i$ whenever the playback speed varies. Doing so will move the aliasing problem to this scaling step.
If the $c_i$ are fully parametric, this does not pose a real problem. Stretching the state vector $x_i'$ will pose no problem either, except for the loss of information. Compressing $x_i'$ does pose an aliasing problem. A way to solve this is to stretch the time domain state vector $x_i$, since frequency domain compression is equivalent to time domain stretching.

All this means the entire algorithm will be very much based on resampling, both for the time and the frequency domain representation of the state vector as for the final playback resampling. This of course turns us back to square one in some sense as to how far it is worth pursuing this approach instead of keeping the pitch and timbre method separated in a source/filter approach, e.g. using bandlimited impulse synthesis together with an IIR filter bank or overlap-add frequency domain filtering. On the other hand, when pursuing this route, we do have a separate pitch and formant control, together with a nice framework for processing waveforms and spectra in a combined time/frequency approach.

\subsection{Nonlinear Systems}

In this section we will use a recurrent neural network with connectivity matrix $A \in \mathbb{R}^{N \times N}$ and squashing function $\sigma(t) \in \mathbb{R}$. We define $f$ as a composition of $x' = Ax$ and $f_i(x) = \sigma(x'_i)$. Like in the previous linear approach, we take $A$ to be circulant. This network is thus equivalent to a filter followed by a waveshaper, in music-dsp speak.

\subsection{Examples}

\subsubsection{Analog waveform modeling}

It is of course debatable whether it is worth using an IFFT for waveform synthesis. However, since the transform is running at a haptic rate, the cost is proportional to $\log N$. Compared to filter based methods for synthesizing analog waveforms, this is not so expensive\footnote{ The filter order required to get a decent bandwidth vs. aliasing trade-off for virtual analog is significant, and at least comparable to the complexity of the IFFT. On the other hand, implementing a virtual analog synth is a lot easier to do with just an increase in sample rate. Increasing the sample rate has a lot more advantages. It simplifies building blocks, allows for a smaller wordlength, allows for low order filters to be used to tame nonlinear elements, allows for sigma-delta modulation to perform bit depth reduction when increased precision is necessary, e.g. for high Q recursive filters, etc...}.

Since we're not too much concerned with killing chipmunks, the only thing we need is an anti-aliased IFFT. Most analog style waveforms have a simple Fourier transform. Again this can lead to new schemes for synthesizing new ``analog sounding'' waveforms and is in fact no different from general parametric spectral synthesis.

\subsubsection{Piano modeling}

Modeling the piano is not a simple task. However, some of the features of the piano, like the inharmonicity and triple strings per note, fit well into our scheme. We will use the simplest form of a circulant linear state space system (FDDWS), with nothing but a spectral envelope and a phase shifter (multiplication with unit norm complex numbers) in the feedback path.

Reading a table three times with frequencies corresponding to $1, 2+\Delta_2, 3+\Delta_3$ to simulate the effect of triple strings already produces a very piano-like sound. We can introduce inharmonicity by modulating the FFT bin corresponding to basis function $e^{\frac{j2\pi n}{N}}$ by $e^{j(an + bn^2)}$.
This will produce a frequency dependent shift, which will shift ($a$) or bend ($b$) the harmonic series upward ($a > 0$ or $b>0$) or downward ($a<0$ or $b<0$) from harmonicity. This nonlinearity can even be made signal dependent, e.g. dependent on the power of the system.

Modeling the attack is almost as important as the strings. We use a second FDDWS system with a Gaussian envelope in the feedback path, reduced by an overall gain. The wavetable is initialized with a random spectral content when a new note is struck.

\subsubsection{Waveform feedback}

Since we know the pitch of the scanning playback, we can scan a waveform, do some processing (e.g. a reverb) and scan the output back into the table. This allows for some more coupling between the real world and our DWS model. Some nice feedback effects can be obtained when both scanning frequencies are not inverses, or a multiple read-out scan is combined with a single read-in scan.

Generating a new waveform by stretching or compressing the previous one and feeding it back into the loop can generate a lot of interesting sounds, even natural sounding drum sounds. By adding a static limiter and a lowpass filter in the feedback loop, stability is ensured and some spectral control is possible. The lowpass filter also smooths the edges. The update equation is $x[k+1] = L \circ D \circ R \{x[k] \}$ where $R$ is a resampling stretch or compress operator, $D$ is a waveform limiter (e.g. a hard clipper) and $L$ is a lowpass filter. By windowing $x[k]$ with a cosine window before scanning it for final output, most of the waveform discontinuities are removed and the frequencies are shifted upward, which gives extra punch.

Driving the same system with a constant input every period, instead of an impulsive excitation, giving the update equation $x[k+1] = L \circ D \circ R \{x[k] + u \}$, with $u$ the constant input, can generate interesting fractal-like waveforms, reminiscent of synced sawtooths.

When the limiter is replaced by a power normalizer and the feedback loop contains both a lowpass and a highpass filter, $x[k+1] = N \circ L \circ D \circ R \{x[k]\}$, with $N$ the power normalization operator, a stable oscillator is constructed. This oscillator can produce a wide variety of chaotic sounds. Due to the normalization, the system is nonlinear but energy conserving, and due to the filters, the frequency content can be localized.

Some more variants are possible. Changing the update function to $x[k+1] = N \circ L \circ D \circ (1 + aR)\{x[k]\}$ gives an extra control $a$ that is a tuning parameter for the amount of folding applied.

\subsubsection{Random remarks}

In waveguide string modeling, there are usually two polarizations that are combined, both with different properties. Using the positive and negative part of the spectrum to express something similar in DWS enables us to use complex FFTs. This is a bit of a moot point, but it enables easier processing in PD, since the data will be in a more natural form, plus we sort of get two signals for the price of one. In other words, if the positive and negative spectrum receive a different phase shift (so both left and right partials differ slightly in frequency after scanning) we get amplitude modulation for free.
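Before moving on to the conclusion, a minimal numerical sketch of the normalized feedback loop $x[k+1] = N \circ L \circ D \circ R \{x[k]\}$ from the waveform feedback example above. The concrete operator choices (hard clipper for $D$, a crude circular one-pole lowpass for $L$, a linear-interpolation stretch for $R$) and all constants are assumptions for illustration, not the ones used in any particular patch.

\begin{verbatim}
# Sketch of the normalized feedback-loop oscillator x[k+1] = N.L.D.R{x[k]}.
import numpy as np

def R(x, stretch=1.03):
    # resampling stretch operator (circular, linear interpolation)
    n = len(x)
    pos = (np.arange(n) * stretch) % n
    i0 = pos.astype(int)
    i1 = (i0 + 1) % n
    f = pos - i0
    return x[i0] * (1 - f) + x[i1] * f

def D(x):
    # static limiter (hard clipper)
    return np.clip(x, -1.0, 1.0)

def L(x, a=0.3):
    # crude circular one-pole lowpass, run twice to settle the wrap-around
    y = np.copy(x)
    for _ in range(2):
        for i in range(len(y)):
            y[i] = (1 - a) * y[i - 1] + a * x[i]
    return y

def N(x):
    # power normalization: keep the loop energy constant
    p = np.sqrt(np.mean(x ** 2))
    return x / p if p > 0 else x

x = 0.1 * np.random.randn(256)        # random initial table as excitation
for k in range(100):                  # haptic-rate updates
    x = N(L(D(R(x))))                 # scan x for audio output between updates
\end{verbatim}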
\section{Conclusion}

Despite the chipmunk effect, which can sometimes be a big drawback, dynamic wavetable synthesis is an interesting approach because it enables us to use fast periodic transforms to build systems whose behaviour is easy to understand from both a time and a frequency perspective. When we can counter the chipmunk effect by adjusting system parameters, it is a very useful technique because it eliminates some of the ad-hoc methods used to compensate for the changing periodicity in transform based techniques. In short, if the formant distribution of a wavetable can be shifted in a straightforward parametric way, we can cancel the chipmunk effect completely. Also, when working in the frequency domain, anti-aliasing is ``free'', in the sense that knowing the readout frequency enables us to explicitly cancel out the frequencies that are bound to fold.

Entry: Entropy
Date: Sun Oct 16 15:25:56 CEST 2005
Type: tex

\def\gauss{e^{t^2 \over 2\sigma^2}}

Entropy is the (logarithm of the) number of states of a system for which some \emph{macroscopic} measure is left invariant. The logarithm is there just so we can add entropies, instead of multiplying the number of possible states. % weighted sum?

For example take 100 coins. If we describe the \emph{macrostate} by the number of heads, states with high entropy are those around 50, for which there are a huge number of possible \emph{microstates} leading to the same measure. States around 0 and 100 are low entropy, since there are only few possible such arrangements. Note that the definition of entropy depends on which measure(s) we are using to describe a macrostate.

Quick, some airplane notes. Maximum entropy distribution for the interval $[0,1]$. Suppose the solution is $p(x) = 1$. The entropy is then given as $H_p = -\int_0^1 1 \log 1 = 0$. We take another distribution $q(x) = 1 + \epsilon q'(x)$ with $\int_0^1 q'(x) = 0$. Using the approximation $\log(1+x) \approx x$, the approximation of the entropy is $H_q \approx - \int_0^1 (1 + \epsilon q'(x)) \epsilon q'(x) = - \int_0^1 \epsilon^2 q'(x)^2$, which is never positive, so $p(x)$ has maximal entropy.

Entry: The number e
Date: Mon Oct 17 23:40:33 CEST 2005
Type: tex

% WWOOOOOW! (hmm.. apparently this was once an orgastic insight ;)

%% About $e$. I wonder if it is at all possible to define $e$ without the
%% concept of limit or differentiation. I don't think so.

The number $e$ is about the prototypical way of combining a huge number of very small updates. Starting from the power series of $e^x$ as definition, we can prove $$e^x = \lim_{n \to \infty} \left(1 + {x \over n} \right)^n,$$ using the binomial expansion $$e^x = \lim_{n \to \infty} \sum_{k=0}^n {x^k \over k!} {n! \over (n-k)!n^k}.$$ This gives a term by term correspondence since $$\lim_{n \to \infty} {n! \over (n-k)!n^k} = 1,$$ because $\lim_{n \to \infty} {n-m \over n} = 1$ for any $m$.

It is interesting to note that in words this says that the number of ways to pick $k$ natural numbers is the same as the number of ways to pick $k$ \emph{different} natural numbers, in the sense that the ratio of both counts tends to one as the set we pick from grows towards the size of the set of natural numbers.

Very important is the fact that we can't arrive at $e$ without using infinities. If it were possible, $e$ would be algebraic, which it is not. $e$ serves as the gatekeeper of the infinitesimal.

%% Then onto commutators and $x$ and $p$.
%% Suppose position and momentum operators are matrices $X$ and $P = FXF^H$,
%% with $F$ denoting the Fourier transform. The commutator is
%% $$[X,P] = XP - PX = XFXF^H - FXF^HX.$$
%% Does this say anything useful? Probably not.

Entry: Dither
Date: Fri May 5 01:00:51 CEST 2006
Type: tex

Reading the ``Dither'' page on Wikipedia, and landed on Triangular PDF, which is equivalent to the roll of two dice. Of course, since the roll of one die is rectangular, and the sum of two is the convolution of the PDFs.

Reminds me of Knuth's musing ``Hooray for probability theory'', which deals with polynomials with positive coefficients and only real zeros. They can be seen as the generating function of the sum of $N$ biased coin tosses, and their coefficients always have a roughly Gaussian shape.

%[1] http://www-cs-faculty.stanford.edu/~knuth/musings.html

Entry: Finite differences
Date: Sat May 20 17:16:02 EDT 2006
Type: tex

Reading ``Finite Difference Equations'' by Levy and Lessman. Interesting book. Opens my eyes to a lot of things related to the continuous and discrete cases. The one that struck me today is $$2^x = (1 + 1)^x = \sum_n \binom{x}{n} = \sum_n {x^{\nf} \over n!}.$$ Here $x^{\nf} = x(x-1)\ldots(x-n+1)$, the falling powers of $x$. This is a notation due to Knuth; the book uses $(x)_n$ which I find less clear.

The expression akin to the Taylor expansion is $$f(x_0 + xh) = \sum_n {x^{\nf} \over n!} \Delta^n f(x_0),$$ where $\Delta^n f(x_0)$ is the $n$ times iterated difference operation $\Delta f(x) = f(x+h) - f(x)$. The sum is finite when $f$ is a polynomial, and is otherwise defined if $f$ is analytic.

The relationships between the difference operator $\Delta$, the shift operator $E=1+\Delta$, and the differentiation $D$ are expressed as $$E=e^{hD}$$ and $$\log(1+\Delta) = \log E = hD,$$ where the exponential and logarithm signify the usual power series of the operators.

Entry: Subspaces
Date: Mon May 22 15:09:42 EDT 2006
Type: tex

Start with Dedekind's law stating that if $R \leq S$ we have $$S \cap (R + T) = R + (S \cap T),$$ where $R \leq S$ denotes that $R$ is a subspace of $S$. In order to see this, it is interesting to transform this law into a less general statement. Using just the dimensions of the subspaces does not work, since a statement like $R \leq S$ is not equivalent to the ordinary meaning when the symbols represent integers, because the relation $\leq$ for subspaces is not a total order.

Reducing the subspaces to those spanned by a finite subset of a basis $\{e_k\}$ allows us to talk about the bit vector $(b_N\ldots b_1)$, which indicates which of the basis vectors are combined to form a subspace, or in other words, it allows us to write any such subspace as $B = \sum_k b_k E_k$, where $E_k$ is the one dimensional space spanned by $e_k$.

%[1] http://en.wikipedia.org/wiki/Modular_lattice

Using this representation, the more specific version of Dedekind's law in terms of the bit vectors $s$, $r$ and $t$ becomes $$s \bitwiseand (r \bitwiseor t) = r \bitwiseor (s \bitwiseand t),$$ where the condition $r \leq s$ means $r \bitwiseand s = r$ and $r \bitwiseor s = s$. This reads as: $t$ augmented with $r$, filtered by $s$, is equal to $t$ filtered by $s$, augmented with $r$. Obviously this only works if $r$ does not introduce bits that are not in $s$, which is the same as saying that $r$ is ``contained'' in $s$.

Entry: Poles with multiplicity > 1
Date: Sat Jun 17 18:46:16 EDT 2006

In practice (from real-world data) multiple poles don't happen since their probability is zero. If they are there, they are structural properties (i.e. the
system is constructed to exhibit poles with multiplicity.) However, for discrete finite structures they are important since their probability is no longer zero due to the finite number of elements.

[1] entry://20090617-200427

Entry: Combining Splitting Fields
Date: Tue Jun 20 12:41:48 EDT 2006
Type: tex

The splitting field of a polynomial in $\GF(2)$ has to be a field extension of $\GF(2)$. Now, does this field extension have to be $\GF(2^m)$? Well, let's see if there's a counterexample. Let's factor $(x^2+x+1)(x^3+x+1)$. The first factor splits over the field $\{p, p+1, 1, 0\}$ and the second factor splits over the field $\{q,q^2,q+1,q^2+q,q^2+q+1,q^2+1,1,0\}$. How to combine both of them?

Starting from the multiplicative groups $\{1,p,p^2\}$ ($\ZZ_3$) and $\{1, q, \ldots, q^6\}$ ($\ZZ_7$) we get a group $\ZZ_{3} \times \ZZ_{7}$. However, this group is not the multiplicative group of the field we're looking for, since it is not closed under addition. For example $pq^2+1$ can not be factored as a product $p^nq^m$, but can be obtained from a linear combination of these products. The additive closure of the group gives the extension field we're looking for.

In general, a field extension of a finite field $\F$ is always a vector space spanned by polynomials over $\F$, ensuring this additive closure. If $\F$ already is a polynomial field, the field is then the additive closure of the products of polynomials in different formal variables, each modulo an irreducible polynomial. In our example this gives a proper extension of $(2^2)^{3} = 2^6 = 64$ elements $a_0 + a_1 q + a_2 q^2$ with $a_i \in \{1,p,p+1,0\}$. It has multiplicative group $\ZZ_9 \times \ZZ_7$, which has $\ZZ_3 \times \ZZ_7$ as a subgroup.

Question. What is the exact isomorphism between successive extensions like that, and the normal representation of $\GF(2^m)$ as polynomials modulo an irreducible polynomial? In other words, how to map these polynomials in two variables to a representation in one variable?

Entry: Dither, PWM, Sigma-Delta, Bresenham
Date: Mon Oct 9 14:31:06 CEST 2006

What is dither[3] used for? It makes a digital system behave more like an analog system, in that it can be used to make it exhibit _graceful degradation_ instead of sudden failure. Dither can increase resolution on average, but adds noise to the immediate values. For phenomena where the average is more important than the immediate value, this can be an interesting trick.

A simple example of this is a first order difference equation in finite precision. Suppose I'm using 16-bit values to represent oscillator periods. In order to make logarithmic increments, I could use a difference equation y[k+1] = (1+a)y[k] where a is a small constant. However, if ay is below hex 0.00008000 the value of y remains constant. In this case, average behaviour is important, so using 0.0000rrrr as rounding term, where rrrr is a 16-bit uniformly distributed random number, gives an approximation of the correct behaviour.

Random number generation (RNG) for this application is most efficiently done using a finite field primitive polynomial. However, care needs to be taken to check the statistical properties in case the same RNG is used to drive several dithering filters, although I can't see any pathological cases at first glance. If clear patterns in the output are not really a problem, a simpler solution is to just use a triangle wave as dithering function, which gives effectively a PWM output.
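A minimal numeric illustration of the example above: a 16-bit period value y updated as y[k+1] = (1+a)y[k] where the increment is far below one integer step. With plain truncation y never moves; with a uniformly distributed 16-bit dither term added before truncation it follows the exponential on average. The fixed-point layout, the constant a = 3/65536 and the use of Python's random module instead of an LFSR are assumptions for illustration.

    # Dithered fixed-point increment: y[k+1] = (1+a) y[k] with a*y < 1 LSB.
    import random

    A_Q16 = 3                # a = 3/65536, so the ideal increment a*y is tiny
    y_trunc = 1000
    y_dith  = 1000

    for k in range(50000):
        y_trunc += (y_trunc * A_Q16) >> 16                           # truncation: increment stays 0
        y_dith  += (y_dith  * A_Q16 + random.getrandbits(16)) >> 16  # dithered rounding

    print(y_trunc)   # still 1000
    print(y_dith)    # close to 1000 * (1 + 3/65536)**50000, roughly 9900, on average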
What I'm really interested in is a dithering scheme where the ripple is smaller after integration of the output. This requires a higher frequency quantization step. For example, instead of using a 01 ramp, a 81 ramp can be used.

What exactly is the relation to S-D modulation and the Bresenham[2] line drawing algorithm? Let's make a list.

* uniform dither: Dither is from a uniform noise source.
* PWM: Dither from a square wave. Same average as uniform dither, but clear pattern. Large ripple after integration.
* nPWM: Like PWM, but using a triangle wave modulated to Nyquist frequency. Smaller ripple.
* S-D: Minimizes ripple after integration.
* Bresenham: Uses higher state resolution to generate a proper average. Ordinary rounding quantization.

The reason PWM is used over S-D in motor control (as I guess from PWM modulators being part of uC hardware, but S-D not) could be switching efficiency. Larger ripple is tolerated if higher switching efficiency (lower switch frequency) is attained. See Don Lancaster's magic sinewaves[1] for explicit minimization of switching events for sine wave generation as an alternative to PWM. When low ripple amplitude is more important than efficiency, for example in audio applications, S-D modulation can be used.

Question: can PWM hardware on uC be used to generate S-D signals?

[1] http://www.tinaja.com/glib/msinprop.pdf
[2] http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
[3] http://en.wikipedia.org/wiki/Dither

Entry: LED lights on a circle
Date: Thu Jan 11 16:47:43 CET 2007

I ran into an interesting geometrical or topological problem. Working with N-simplexes to drive LED lights using a minimal number of electrical nodes. The problem I have is how to associate an N-simplex with a circle in a natural way. For a tetrahedron this is trivial, but what with higher dimensional things? It's an interesting problem, because it seems to make absolutely no sense to do so. There should be no reason why a higher dimensional polytope should be naturally projectable on a 2-sphere. But I can't refute it just like that.

To rephrase the problem in a more down-to-earth wording: I want to create a 2-sphere covered with lights in a way that the lights are evenly distributed, and I have a certain pattern to access them. At the same time, I'd like to do this with as few connections as possible. Note that simplexes are easier to work with as they follow the binomial expansion for vertices, edges, triangle faces, tetrahedral volumes, etc...

Trying the simple thing of projecting a 5-vertex simplex on a sphere. Compared to the 4-vertex simplex (tetrahedron), there is one pair of crossing lines.

Entry: Splines
Date: Sat May 5 21:56:32 CEST 2007
Type: tex

Need to figure out polynomials and splines. I'm limiting myself to uniform splines atm. Let's start with the Bernstein polynomials, which come from the binomial expansion of $$ 1 = [x + (1 - x)]^n = \sum_{k=0}^n \binom{n}{k}x^k(1-x)^{n-k} = \sum_{k=0}^n b_{k,n}(x) . $$ A polynomial written in terms of the basis $b_{k,n}$ is a Bezier curve, and its coefficients are called control points.

Entry: Chaotic Oscillators
Date: Wed Nov 21 01:08:37 CET 2007

making chaos with a mixer usually involves an EQ, feedback and gain. the nonlinear element is a saturation. i've never seen this particular circuit realized as a special purpose chaotic oscillator. lots of comparator based stuff, but no saturation.. the simplest i can find is this one:

http://www.ecse.rpi.edu/~khaled/papers/iscas00.pdf

which uses 3 integrators, a comparator and a summing amp.
i'd like to get the parts count down to 4 amps, so i can use a quad. this means the summing amp needs to be incorporated somewhere else.

i'd like to try the following. on the plane i made a (faulty) circuit with an svf with positive feedback (time constant t = 1 for simplicity):

  x'' = -x + a x'

if a > 0 this circuit is unstable. for a < 0 this is a standard biquad. now, if the integrators can be biased to some voltage, this bias can be derived from a schmitt trigger acting on one of the state variables x or x', switching between two unstable points. the schmitt trigger is stateful, so chaos is possible on two planes: R^2 x {1,-1}

( what i did wrong is to apply the bias to the (+) input, which can't be correct since the capacitor voltages are relative to this point, so changing bias would also change the state variables. )

so.. what about using the saturation of the opamp? by measuring the voltage at the noninverting inputs, saturation can be detected. in the phase plane of an svf, there are 4 points where saturation can occur. using the general timer principle:

* detect a certain condition (one of the state variables saturated)
* discharge one of the state variables

see: http://www.scholarpedia.org/article/Chaotic_Spiking_Oscillators

switched 2nd order: a classic unstable biquad with state variables x and y, where x is decremented with the output of a schmitt trigger before it's fed into the y integrator and the integrator chain input summer. see:

http://www.cs.rmit.edu.au/~jiankun/Sample_Publication/IECON.pdf

which basically contains the circuit i just (re)invented..

EDIT: some refs from johan suykens:

M.E. Yalcin, J.A.K. Suykens, J.P.L. Vandewalle, Cellular Neural Networks, Multi-Scroll Chaos and Synchronization, World Scientific Series on Nonlinear Science, Series A - Vol. 50, Singapore, 2005

Entry: Chaos and Clipping
Date: Sun Dec 16 18:03:49 CET 2007

It's not so hard to understand the mechanisms of chaos in 3D switched systems. In fact, they are quite simple to construct:

* In one (of two) regimes, there is one (2D) mode performing an unstable oscillation in a plane, together with a (1D) mode that decays toward that plane.
* Put the switching barrier (plane) such that whenever a point crosses that plane (due to the unstable 2D oscillation), it is attracted fast towards the second regime's unstable plane by the decaying mode, and will end up close to the unstable fixed point of the unstable oscillatory mode.

Maybe linear but distorted (clipped) systems are easier to understand as switched systems too?

Entry: The Hough Transform
Date: Thu Jun 26 15:18:38 CEST 2008

Scan an image and collect histogram information for the parameters that define the lines in an image. The parametric equation for a line is the set of points

  {p0 + a p0' | a in |R}

where p0' is the vector p0 rotated by +90 degrees, or with p0 = r0(cos t0, sin t0) the set

  { (r0 cos t0 - a sin t0, r0 sin t0 + a cos t0) | a in |R}

eliminating a in these 2 scalar equations gives the equation of points (x,y) in terms of the parameters (r0,t0)

  LINE = { (x,y) | x cos t0 + y sin t0 = r0 }

Turning this around, given a point (x0,y0) all the lines that pass through it are parameterized by the equation

  BUNDLE = { (r,t) | x0 cos t + y0 sin t = r }

Given this, how to find lines? We let the points vote for bundles, then find intersections (hot spots) in bundle parameter space. For each point in an image, record its brightness, and use that as a vote for all the lines through that point.
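A minimal sketch of this voting scheme, assuming the input has already been reduced to a binary image and using a rectangular (r,t) accumulator; the function name, bin counts and the numpy calls are illustrative assumptions, not an existing implementation.

    # Hough voting sketch: every white pixel (x0,y0) votes for its bundle
    # { (r,t) | r = x0 cos t + y0 sin t }, one r-bin per angle bin.
    import numpy as np

    def hough(img, n_theta=180, n_r=200):
        h, w = img.shape
        r_max = np.hypot(h, w)                    # |r| can never exceed this
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        acc = np.zeros((n_r, n_theta))
        ys, xs = np.nonzero(img)                  # white points only
        for x0, y0 in zip(xs, ys):
            r = x0 * np.cos(thetas) + y0 * np.sin(thetas)
            r_bin = ((r + r_max) / (2 * r_max) * (n_r - 1)).astype(int)
            acc[r_bin, np.arange(n_theta)] += 1   # one vote per angle bin
        return acc, thetas

    # Peaks in acc correspond to lines x cos t0 + y sin t0 = r0.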
This can be done by dividing the (r,t) bundle parameter space into a rectangular grid, and letting each point (x0,y0) vote for the points that are in the bundle it defines. If the number of angle bins is of the same order as the number of input point bins, this algorithm casts N votes per point (x0,y0), giving cubic complexity O(N^3).

Simplifications:

* Since this is a voting algorithm, knowing where NOT to look can reduce the search space.
* Thresholding: don't let dark points vote. If a point doesn't lie on an edge, it doesn't need to update the array either. If possible, reduce the input to a binary image.

Subdivision: As long as the number of angle bins is kept constant, the total amount of work stays the same when the image is subdivided. The quality of the estimates probably goes down, as there are fewer discriminating votes. In our case (radial lens distortion) subdivision is probably necessary.

Algorithm:

  For all points:
    if white then update accus r(t) = x0 cos t + y0 sin t

Implementation:

Loop order: working per scanline y allows the expression (y sin t) to be constant; it thus needs to be computed only once, after the accumulated intensity for that scanline is known. Since the same trick is not possible for the x, this only halves the number of updates, so is probably not worth it: it's easier to transform the (x,y) into amplitude + phase, which also halves the update time.

Memory access: the input can be streamed, but the state update goes all over the place. Since the input, if binary, takes very little memory, it might be more interesting to perform random access on the input image, and stream the accumulators. This requires assembling, for each point in the 2D histogram, a list of points that contribute to it. The equation for this list of points is not a line, but a small line bundle (otherwise the Bresenham algorithm might have been used). This does however raise the interesting question: can an approximate algorithm be used that does scan lines like that? Can the image be preprocessed so lines get a bit 'fatter'?

Summarized:

* Staying with the naive implementation, the iteration goes column wise over the input points, and for each point a sinusoid is added to the histogram bins.
* Going the experimental route: the algorithm might be reversed so that each (r,t) parameter can be directly computed using a Bresenham based function evaluation kernel. This makes sense if the image data fits in cache, or can be made to use prefetching (lines are quite predictable access patterns). Probably a numerical gradient can be used to do this iteratively: this would allow for non-histogram based incremental approximation. ( However, the regions around the optimal points in the picture here: http://en.wikipedia.org/wiki/Hough_transform are not really smooth. In fact, their shape is quite odd. Is that an artifact of chopping things in bins? )

Note that the reversed approach in the last item is essentially the Radon transform http://en.wikipedia.org/wiki/Radon_transform

Entry: Discretized integral transforms: domain / codomain loop order
Date: Thu Jun 26 18:32:28 CEST 2008

Moving from the Hough transform -> Radon transform is an example of a more general pattern of re-arranging control around data dependencies in discretized integral transforms, basically changing the order of loops. The central idea is this: for finite versions of an integral transform, there exists a related finite histogram based version with inverted control.
Instead of computing the integral for each sampled codomain point from all the domain points, one computes the contribution of each domain point to all the codomain points. Basically, in the discrete version, the domain and codomain loops are transposed. For example, a generic integral transform

             _    _
   (y0,y1) = /    /   F(x0,x1,y0,y1) dx0 dx1
            x0   x1

can be rewritten, after discretization, as 4 nested loops over y0,y1,x0,x1 (outer->inner): for each y0,y1 the result is computed directly. However, it can also be computed as an update of a y0,y1 grid by re-ordering the loops as x0,x1,y0,y1.

For the Hough transform there is only one integral, so there are 3 loops to reorder to get to a discretized Radon transform.

Is this useful? The only scenario I can think of is the case where the histogram version is faster to compute than the direct version, but you need some estimate before you use the direct version in an iterative algorithm. Maybe this only makes sense for line integrals, not for full 2D->2D maps.

Entry: Snowcrash encoding
Date: Thu Jun 26 19:22:43 CEST 2008

The basic algorithm uses a single LFSR (linear feedback shift register) to construct a 2D grid by:

* partitioning X and Y state space into 2 separate segments so the same difference equation of degree D can be used for both.
* shifting each column by a constant number N, and performing an XOR between X and Y.

Let's call the observations G[i], and the hidden sequences X[i] and Y[i]. These use addressing A(x,y) = A[x+Ny]. Alternatively, primed array names represent transposed addressing A(x,y) = A'[xN + y]. The equations from LFSR state dependency and output mixing:

  X[D+i]  = f( X[i] ... X[D-1+i] )     0 <= i < (N^2 - D)
  Y'[D+i] = f( Y'[i] ... Y'[D-1+i] )   0 <= i < (N^2 - D)
  G[i]    = X[i] + Y[i]                0 <= i < (N^2)
                                       ---------
                                       3N^2 - 2D

The number of unknowns (X[i] and Y[i]) is 2N^2. ( The input data consists of N^2 points. ) With independent equations, this means it is soluble if

  2N^2 <= 3N^2 - 2D

or

  D <= N^2 / 2
  N >= sqrt(2D)

For D=15 -> N=6. I don't see why this should be 7 as in the current implementation. Maybe it's used to reduce ambiguity? Something 'feels wrong' about using a shift of 6 and a detection of 7x7: the sequences don't simply wrap around any more, which destroys some symmetry.

Q: is it possible to solve the equations in least-squares form, by adding an error term to each?

Q: is there a way to test orientation properties by using some property of the relationship of the LFSR to its inverse?

If the parameter space is known, it can be proven by enumeration that ambiguity can always be resolved. Maybe that is really a better approach, because I don't see a way to do it otherwise.

Entry: reverse LFSR
Date: Fri Jun 27 13:31:59 CEST 2008

What happens if you reverse an LFSR sequence? This is most easily seen by expressing the LFSR as f(x[i] ...) = 0. The only thing that happens is the reversal of the coefficients. So

  100101 -> 101001

Now, the question is, are there LFSR equations that are symmetric in this sense? This would also mean the sequence itself is symmetric around every point, which doesn't look like it's possible.

Next step: is it possible to relate an LFSR to its inverse? After all, all Galois fields of order 2^N are isomorphic. It might be interesting to cast the equation into exponential form to work with multiplicative generators g^i. Are all LFSR sequences of a certain order related by mere re-arranging, like taking every 3rd bit?

Hmm... getting confused. Let's view this in a transformed domain.
The LFSR sequence can be read off from the coefficients of the sequence of polynomials generated by repeated multiplication by x, the 'shift' operator. (This turns the modulo operation into a simple XOR.) In matrix/vector terms, given an LFSR polynomial a of order N, pick an initial point in the state space e = (e1,...,eN). Take the update equation of the LFSR, and call its matrix g. This is the multiplicative generator for the sequence.

Q: Another thing to think about: to compute an eigenvalue decomposition in |C one needs complex eigenvalues (the splitting field). However, there exists a 'real form' of this which has blocks of 2D rotations instead of complex diagonal forms. Is this possible for LFSR?

Q: how many different sequences are there for a given Galois field? (number of automorphisms)

Entry: LFSR resolving linear orientation ambiguity
Date: Fri Jun 27 15:28:37 CEST 2008

Q: What happens if the LFSR in one direction is XORed with its reverse? Does that still give a proper sequence? That way it would be possible to have contributions from 2 directions, which can then be used to resolve ambiguity.

Equations: ( with A'[i] = A[N-i-1] )

  L[D+i]  = f( L[i] ... L[D-1+i] )     0 <= i < (N - D)
  R'[D+i] = f( R'[i] ... R'[D-1+i] )   0 <= i < (N - D)
  X[i]    = L[i] + R[i]                0 <= i < N
                                       ---------
                                       3N - 2D

Number of unknowns: 2N

  D <= N/2

The only problem then is to figure out whether the systems are independent. Does this sum of sequence + reverse somehow reduce to something of lower degree? For one thing, it is reversal-symmetric around the initial state, which can be taken not to lie in the surface of interest. Is there anything else destroyed?

In terms of the unit element e, and the generators g and g^-1, the sequence becomes a cosine instead of an exponential:

  e g^i + e g^{-i} = e (g^i + g^{-i})

This can no longer be written as an LFSR, because it has hidden state (an LFSR has a naked state). It is a more general form of linear system. I don't see any reason why this can't be generalized to the 2D case.

Q: Are there points where this XOR operation itself leads to ambiguities? In other words, are there XORed segments that have multiple parents? If so, then it is probably possible to find a small upper bound on the number of observations to take to eliminate this ambiguity. ( Since a large upper bound is probably easy to find: take one that reduces the search space for collisions to a small amount, i.e. one case. ) Note that if there are such ambiguities then a simple exhaustive search for soluble sets of equations in the non-symmetric 2D case might be a better approach.

Entry: symmetric 2D LFSR
Date: Sat Jun 28 12:52:36 CEST 2008

For a 2D LFSR grid, the problem to solve is rotation ambiguity. Instead of using an LFSR generator for each direction, use 2 like in the 1D case, but tile them such that the structure is rotation-symmetric. I.e. for N=3 the 4 LFSRs are wrapped as:

  X+       Y+       X-       Y-

  1 2 3    3 6 9    9 8 7    7 4 1
  4 5 6    2 5 8    6 5 4    8 5 2
  7 8 9    1 4 7    3 2 1    9 6 3

Analogous to before, this gives 5N^2-4D equations and 4N^2 unknowns, leading to the solvability condition:

  N^2 >= 4D

This looks like a nice idea, but using a 6x6 estimator for D=15 gives underdetermined equations.. Apparently, there is some information lost in the XOR. Intuition: do something like this in |R and you might get poor conditioning, but it is hard to end up with exact linear dependence. In a finite space it is a lot easier to end up with linear dependence because there is less room.
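As a small numerical check of the reverse-LFSR remark a couple of entries back (reversing the output sequence corresponds to reversing the recurrence coefficients), here is a sketch using an assumed degree-4 example polynomial x^4 + x + 1; the tap representation and the helper names are illustrative only.

    # Fibonacci LFSR: s[i+D] = XOR of s[i+t] for t in taps (degree D = len(state)).
    def lfsr(taps, state, n):
        s = list(state)
        D = len(state)
        while len(s) < n:
            s.append(sum(s[-D + t] for t in taps) % 2)   # XOR over GF(2)
        return s

    def satisfies(seq, taps, D):
        # does seq obey s[i+D] = XOR of s[i+t] for t in taps, everywhere?
        return all(seq[i + D] == sum(seq[i + t] for t in taps) % 2
                   for i in range(len(seq) - D))

    seq = lfsr(taps=[0, 1], state=[1, 0, 0, 0], n=30)   # x^4 + x + 1
    rev = seq[::-1]
    print(satisfies(seq, [0, 1], 4))   # True : original recurrence
    print(satisfies(rev, [0, 1], 4))   # False: reversal breaks it
    print(satisfies(rev, [0, 3], 4))   # True : reversed taps, x^4 + x^3 + 1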
Entry: Lens correction
Date: Sat Jun 28 13:15:43 CEST 2008

Grid estimation with lens correction, roadmap:

- Lens parameter estimation: cubic distortion + optic center -> try this with a manual approach first.
- General flow: Sobel edge -> histogram for direction -> projection on 2 orthogonal directions -> grid phase + frequency.
- Optionally: search the LFSR 2D space for sets of equations that are not soluble.

Entry: 2D direction estimation (confused)
Date: Mon Jun 30 14:20:24 CEST 2008

I first thought this would be a Hough transform in r,t with the radial direction integrated out. However, that doesn't work because of the way accumulation works: each t is voted for in the same way, but with different r. I can't see how a bundle-accumulator would work for estimating direction. How can a point vote for a direction? Doesn't make sense.

What about a hierarchical Hough transform? We know there are only 2 directions, and they are orthogonal, so basically, there is only one. This looks like a successive binary estimation problem. First choice: between t=0 and t=pi/4. But, doing it this way: how to pick one of the two angles? The result will be 2 accumulators for an r spectrum. We're not interested in r, but it will look like a spectrum with multiple (blunt) peaks. The decision rule could be something like high-pass content (local variation).

Entry: simplified HT for direction estimation
Date: Tue Jul 1 11:12:56 CEST 2008

Essentially, a Hough transform H(r,t) of an image I(x,y) can be seen as a likelihood function for the parameters r and t. Now observe a typical 'sinogram', like the one here:

http://en.wikipedia.org/wiki/Hough_transform

Given the Hough transform of a single line, take an r-spectrum for a certain angle: it will have less variance when the angle is close to the correct one. (The integral of the r-spectrum sums to the number of line-points that voted.) It would be interesting to be able to ignore the fact that there are multiple lines, and have them all contribute to a single parameter. In fact, the first harmonic in the DFT of the r-spectrum will give an estimate for the phase and offset. Let's try this.

Entry: Generalized Pitch Detection
Date: Thu Jul 3 16:26:11 CEST 2008

I'm not sure if I can capture my intuition about this, but let's try anyway: there exists a class of problems which exhibit somewhat periodic cost functions: let's call them Generalized Pitch Detection problems (GPD). The difficulty with such problems is to add bias to the estimator so it will go for the desired pitch, and not one of its (sub-)harmonics. The grid estimation problem in Snowcrash is a GPD problem.

Entry: LFSR: roundup
Date: Sat Jul 5 14:45:13 CEST 2008

After a couple of days of tinkering in the Scheme lab, thinking about algorithm complexity and being distracted by HOP for prototyping (which works nicely btw..) it's time to get something done. Tasks:

(A) lens distortion estimation from running example.
(B) grid parameter estimation

Approach: do both based on the Hough transform. First one: local tx, second one: global tx with transformed coordinates. I do need to move from Scheme to C code because I'm running into trouble with execution time. HOFs are nice, but parametricity is slow..

Entry: Hough over 0->pi/2
Date: Fri Jul 11 17:28:55 CEST 2008

( exploration log )

Got the Sobel + HT working in PF; drawing some nice pictures. Thinking about letting theta + pi/2 vote for theta. What should r be? Hmm.. This gives two mixed r spectra: for a certain theta, the grid might shift, so this doesn't seem like a good idea.
The pictures help a lot to build some intuition though.. Looking at it, it doesn't seem too difficult to estimate scale, phase and angle right from the image, using the right biasing (filtering). Try some multires?

With r,t going over their full range, there are 4 angles, pi/2 apart. How to change the transform such that a symmetric diagram over 0->pi/2 is possible? The full diagram is redundant, since for each (r,t) there is a (-r,t+pi). The first reduction is to either halve r or halve t. Let's halve t. I'm using only one quadrant for x,y. Maybe change that too? Yes. Gives a prettier picture: the whole (r,t) plane is used.

What about this one: compute the sum of squares over r. Since we know the integral is constant, the integral of squares will give some distinction between highly concentrated regions and spread-out regions. Is this entropy? No, entropy is -\sum p_i ln p_i. The p_i don't need to be normalized -> it offsets the entropy by ln \sum p_i. Both entropy and energy are superlinear nonlinearities, so will emphasize outliers. Which one is the more natural measure? Interpretation of the r spectrum as a PDF is quite straightforward, but interpreting it as an energy, not so.. Maybe the potential energy of a heap of sand? Potential energy is proportional to height, and its integral from 0->top is quadratic in top. Plotting the energy seems to give quite a clear distinction. However, this still requires the construction of a histogram.

Instead of using energy or entropy, another way to introduce bias is to assume the r period R is known; it can be used to construct a filter: simply summing (cos 2 pi r / R, sin 2 pi r / R). So, instead of voting for the bundle

  (r,t) | r = x cos t + y sin t,  t \in [0,pi]

we vote for the phasor:

  (e^{i 2 pi r/R} , t)

The advantage is that the r dimension can be reduced by immediately accumulating the phasors (non-periodic contributions will interfere destructively). This basically demodulates the grid carrier.

My intuition says it's possible to get away with fixing one parameter to get a simpler/faster kernel function that can then be used to perform incremental refinement of estimates of that parameter. This also might benefit from representing the image as binary data, reducing memory accesses. (Done: I don't see dramatic speedup, maybe because at that time it was still compute bound due to sin/cos evaluation?)

( Also, it might help to switch to the Radon transform, to compute the line integral directly. Hmm.. that's not the same: the Radon transform would help if we're looking for lines, but here there is some benefit to computing a whole r-spectrum, which can then be reduced to a single phasor. )

Entry: Toroidal Hough Transform
Date: Sat Jul 12 13:49:32 CEST 2008

For grid estimation, we're not really interested in individual maxima on the r-spectrum. Reducing this spectrum to a couple of values can be done by adding bias toward a period R. Then, instead of using a voting algorithm for bundles:

  (r,t) | r = x cos t + y sin t,  t \in [0,pi]

we vote for the phasor:

  (e^{i 2 pi r/R} , t)

The advantage is that these votes can be accumulated immediately, so memory access becomes simpler, and a histogram interval discretization step is not necessary. The core computation is the evaluation of cos/sin; the rest seems memory-bound.

Given R and t, the grid period, compute the complex phasor p:

  p(R,t) = \sum_n e^{i 2 pi r_n(t) / R}
  r_n(t) = (x,y) ? x cos t + y sin t : 0

(i.e. only set pixels (x,y) contribute). The inner loop is the sum over the exponentials of r_n(t) for linearly increasing x with y fixed, so this can be updated incrementally. However, the image will be mostly sparse, so this will probably not bring much speedup (points are not equidistant, so multiple rotation update matrices are necessary).

( Tests indicate the 2 approaches, straight evaluation and incremental update, are about the same speed. Maybe it's memory bound? )

Entry: Inner loop over angle/tangent
Date: Sat Jul 12 14:21:05 CEST 2008

( I'm not sure if this is useful: it only works when the angle/tangent loop is the inner one. )

Since we're not using any resolution-reducing histogram, and floats/doubles have a large dynamic range, can the parameter space be turned into something more convenient so the kernel method for evaluating p(R,t) becomes simpler?

Goal: make a kernel method that evaluates p(R,t) in a different parameter space. Let's try to replace it by (o,m): offset and tangent of angle, and use the computation of p(R,m) to compute p(R,t). The bundle equation for (x,m) becomes:

  (o,m) | o = y - m x

Voting for

  e^{i o/O(m)} = e^{ i (y - m x) / O(m) }

becomes fairly trivial: the exponential can be accumulated since it's linear in both m and (x,y). The polar coord version is only linear in (x,y), so works only for a fixed t.

Entry: Accelerating the inner accumulator loop
Date: Sat Jul 12 16:47:21 CEST 2008

Bring the conditional out of the loop. RLE will transform the sparse vector into an array of offsets that can be used to perform the incremental accumulation without a conditional in the inner loop. This representation needs to be computed only once, and can be reused for every p(R,t) evaluation.

  e^{i (x cos t + y sin t) 2 pi / R}

( Moving one step further, instead of storing offsets that will have to be transformed to phase increments once per inner loop, what about storing the phase increments directly? This doesn't make sense for t=constant however, since each increment is used only once. It does make sense when multiple R are evaluated per t. )

The inner loop with y=cte looks like:

  \sum_X e^{i (a x + b)} = e^{i b} \sum_X e^{i a x}

Where:

  X = {x | image(x,y) = 1}   (range of sum)
  a = 2 pi cos t / R
  b = 2 pi y sin t / R

This makes the inner loop accumulation the accumulation of sin/cos of scaled versions of the RLE x values. Here a is the 'speed' at which the points in X rotate around. This depends on t: the speed is a projection of the expected speed R under angle t. Doesn't look like there is a shortcut there, because of the absence of structure in the set of points X.

( With RLE, the running time is proportional to the number of white points in the input image. Maybe some preprocessing step is desired to eliminate blurry gradients? )

If sincosf (math.h) can be used, this might go a bit faster.

Entry: implementing RLE
Date: Sun Jul 13 10:37:17 CEST 2008

Each row incrementally adds a multiplier e^{i 2 pi sin t / R}. Instead of performing a conditional on the input RLE data stream, this might be encoded as a single bit.

Entry: Toroid plots
Date: Sun Jul 13 18:02:10 CEST 2008

Got a reasonably fast kernel routine and used it to generate a theta/period plot. Gives results as expected: the main period peak is quite clear, and with an initial estimate it can probably be computed iteratively.

Entry: AR - ARMA
Date: Wed Oct 1 17:10:57 CEST 2008

How to fit an ARMA model? It's been a while.

http://en.wikipedia.org/wiki/Box-Jenkins

There are three primary stages in building a Box-Jenkins time series model.
1. Model identification

   * Detecting stationarity: can be done by inspecting the autocorrelation. Slow decay (a flat power spectrum without isolated peaks) can indicate non-stationarity.
   * Detecting seasonality: if there is significant periodicity this can be removed (modeled separately), or included in the model order estimation. (The idea being that periodicity comes from external inputs, something which the ARMA model doesn't accommodate.)
   * MA order (q) selection from the autocorrelation plot. For an exact model, this becomes zero after lag = q.
   * AR order (p) selection from the partial autocorrelation plot. (CHECK THIS). This could be automated using information-based criteria such as FPE (Final Prediction Error) and AIC (Akaike Information Criterion).

2. Model estimation

   Once a suitable model order is found, use an NL-LS or ML method to estimate the model parameters.

3. Model validation

   The error term is assumed to follow the assumptions for a stationary univariate process.

For ARMAX the approach is similar?

So, what's the difference between using NL-LS or ML methods? Linear least squares corresponds to maximum likelihood if the errors have a normal distribution. It looks like this is no longer the case for NL-LS.

Entry: motor control
Date: Mon Oct 6 01:18:51 CEST 2008

Something I forgot about modeling motors is that for AC induction motors, the torque-speed curve is quite nonlinear. First, an asynchronous AC motor is basically a transformer where the load of the secondary is the rotary dynamics attached to the shaft. This load can be modeled as a torque -> angular velocity transducer.

  T -> LOAD -> w_L

The controller that closes this loop maps w_L to T through the motor's torque-speed T/w characteristic, which has two inputs: voltage V and synchronous velocity w_s.

       V,w_s
         |
         V
  w_L -> M -> T

The nonlinearity between w_L/w_s and T is motor-dependent. This was helpful:

http://www.sea.siemens.com/step/templates/lesson.mason?ac_drives:2:1:1
http://en.wikipedia.org/wiki/Direct_Torque_Control

Torque can be estimated from the motor current+voltage. This allows building a controller with simple measurement sensors, but doesn't allow zero-speed control. ( Explain the flux-linkage estimation. ) With accurate speed/position measurement better control is possible:

http://en.wikipedia.org/wiki/Vector_control_(motor)
http://en.wikipedia.org/wiki/Variable-frequency_drive

"AC motor characteristics require the applied voltage to be proportionally adjusted whenever the frequency is changed in order to deliver the rated torque. For example, if a motor is designed to operate at 460 volts at 60 Hz, the applied voltage must be reduced to 230 volts when the frequency is reduced to 30 Hz. Thus the ratio of volts per hertz must be regulated to a constant value (460/60 = 7.67 V/Hz in this case)."

(The idea behind this being that sending DC through a coil is a bad idea: when the frequency is reduced, the reactive part of the impedance goes down, which makes the resistive part more significant, resulting in more heat dissipation.)

Entry: FFT window design for sinusoidal synthesis
Date: Mon Oct 6 13:30:26 CEST 2008

(Mostly a note to self. FIXME: explain this a bit better.)

In 2000 I tried to design windowing functions for spectral synthesis: synthesizing a sum of sinusoids by constructing a DFT spectrum through updates of a limited number of frequency components per sinusoid. I approached this as an FIR filter design problem.
However, I came to realize that approaching this as a filter design problem, focusing only on stationary signals, is not sufficient. The real problem is the amplitude modulation effects that happen when mixing subsequent stationary signals, as a result of phase discontinuities. The only way to tackle this properly seems to be to increase the spectral update rate.

It would be interesting to see whether this can be solved in the DYNWAV model: it already has a variable frequency mechanism (wavetable playback) which could be used to capture "average" frequency change without amplitude modulation effects. However, synthesizing multiple pitches still gives the same problem.

Entry: I/O delays in Digital Control
Date: Wed Nov 19 16:40:31 CET 2008

I got a bit confused by conversations about latencies of digital control systems, but it is quite straightforward: for a synchronously sampled control system with input and output updated at t:0,1,... the input data sampled at t=0 can only influence actuation at t=1 (when process delay Z=1), and as such the effect of this actuation can only be measured at t=2.

Entry: Dual Numbers and Automatic Differentiation
Date: Sun Feb 22 15:08:07 CET 2009

Extending the real numbers with e satisfying e^2 = 0 produces the dual numbers, equipped with extended elementary operations. Besides being nilpotent, e behaves as a real number. A dual number has two components a and b, and is denoted as

  a + b e

Given an analytic function f(x) over the real numbers, the extended function over the dual numbers has the following remarkable property:

  f(x + e) = f(x) + f'(x) e

This can be derived starting from the power series expansion of f around the point x, and lifting all operations to those of the dual numbers. The entity e effectively selects the first derivative from the power series.

Practical significance: given an algorithm for computing f(x), an algorithm for computing both f(x) and f'(x) can be automatically derived by extending all elementary operations to their equivalent on the dual numbers.

The "magic" is that, in general, single-expression function derivatives _look_ much more complicated than the original function expression. This complication is artificial: it can be alleviated by introducing proper decomposition. This is exactly what AD does: complexity doesn't really rise, but the amount of data that is tracked in the computation is increased: there are more local dependencies in the data flow, but the global size doesn't explode. Instead of using dual numbers, the same mechanism can equivalently be explained as explicitly tracking both function value and derivative, and updating them using the chain rule.

[1] http://en.wikipedia.org/wiki/Automatic_differentiation

Entry: Graceful degradation of integer values
Date: Wed May 6 13:52:21 CEST 2009

Transmitting binary digit-encoded values is problematic in the sense that not all digits have the same noise sensitivity in terms of the encoded integer value. I.e.

  binary 1100 = decimal 12
  binary 1101 = decimal 13  (diff = 1)
  binary 0100 = decimal 4   (diff = 8)

Is there a way to encode a number such that each flipped bit will cause the same (or a similar) error in the integer value, without using too many extra bits?

One way to do this with lots of redundancy is to use a PWM representation. I.e. for 4 bit values, this gives a 16 bit representation and a lot of redundancy. However, can this principle be used in a way that isn't too costly?

  0   000
  1   100 010 001
  2   011 101 110
  3   111

Using bigger numbers the redundancy explodes quite fast.
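As a minimal sketch of how this redundant representation degrades gracefully (decoding is just counting ones; the function name and the little demo are made up for illustration):

#include <stdio.h>

/* Decode a "vote" code word: the value is the number of 1 bits.
   Any single bit flip moves the decoded value by exactly 1. */
static int decode_votes(unsigned w, int nbits) {
    int ones = 0;
    for (int i = 0; i < nbits; i++) ones += (w >> i) & 1;
    return ones;
}

int main(void) {
    unsigned w = 0x5;                                     /* 101 encodes the value 2 */
    printf("value      = %d\n", decode_votes(w, 3));      /* 2 */
    printf("flip bit 2 = %d\n", decode_votes(w ^ 4, 3));  /* 001 -> 1 */
    printf("flip bit 1 = %d\n", decode_votes(w ^ 2, 3));  /* 111 -> 3 */
    return 0;
}

Contrast this with the plain binary case above, where flipping the top bit of 1100 jumps from 12 to 4.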
Is there a way to "flatten the bulge" in the redundancy curve? This is related quite a bit to the concept of entropy. Patterns where the numbers of ones and zeros are equally distributed are more common because there are more variants.

What about using 100 010 001 as representations of 1, 2, 3? That way bit flips will only jump between "bands" and no longer all over the place.

  0  {000}          {0}
  1  {100 010 001}  {1,2,3}
  2  {011 101 110}  {4,5,6}
  3  {111}          {7}

This produces a lattice structure. Now, is there a way to do this recursively? Convert each number to transitions.

  100  10
  010  11
  001  01

The same structure is found in the other section since it's simply the inverse. Hm.. I don't see it. What I see is that numbers near the middle bands are all equal in some way. It's as if this says that only black and white are really important (they are outliers), and the rest is not so special. It's the binomial expansion of (1 + x)^n:

  1
  1 1
  1 2 1
  1 3 3 1
  1 4 6 4 1

Can this be used in control systems? If the errors are small, we don't really care much about their exact value, just their sign and approximate size. If the errors are large, we really want to know them well.

Suppose we have n wires carrying a single signal. If they're all 1 or all 0 that's significant. If they are equally distributed that's not significant. So, this is quite a natural representation of "normalness". Flipping one bit won't do much, but the ensemble has a meaning that can be interpreted as a number.

Now, can we count with these numbers? Or at least treat them as values to which we can apply functions? Let's define this better.

- A node is a binary function of time.
- A number is a set of nodes.
- The sum of two numbers is the union of their sets.
- The weight of a number is the cardinality of its set.
- A "reduction" translates a number of high weight m to one of low weight n, m > n.

Computation seems to be hidden in the reduction operation. Let's define one for cardinality 2->1. This should be a transition of classes, probably random to eliminate arbitrary choices. The "outer" classes will remain while the inner class picks a random choice.

  2 ----> 1
  00     | 0
  01 10  | 0 1
  11     | 1

Similar for the next in line.

  3 ----------> 2
  000          | 00
  100 010 001  | 01 10
  011 101 110  | 01 10
  111          | 11

Now, can reduction be implemented with an electronic device that has some kind of balanced internal noise source?

Now.. 3->1 can be implemented as a deterministic majority gate. But it seems that the big idea in the proposed computer is its indeterminate nature. So, with completely fair n -> (n-1) transitions, what is the probability of an n->1 transition?

Wait. The simplest gate is really to just drop one bit randomly. (A computer full of Kill Gates?) Maybe random killing isn't the trick, maybe random permuting is? Then killing is simply not using an output, with the random transposition gate as a primitive.

Enough insanity.. There might be something hidden here, but it is really dependent on cheap connections. It probably needs a 3D structure to be implemented. Random transpositions don't seem so difficult.. It's basically a beam splitter.

Entry: Counting votes.
Date: Mon May 11 09:54:21 CEST 2009

This is a rehash of the previous post.

For bit vectors b of dimension n, the function bits : 2^n -> [0..n] introduces a partition[1] of 2^n. The equivalence classes have an order relation, determined by the bits function.
The central idea in using this partition to represent numbers in terms of bit vectors is the distribution of bits(v) when v is drawn from a uniform distribution. It is a binomial distribution[2]. Such a distribution naturally encodes "agreement" (most bits zero or one) as an exceptionally ordered state, and "disagreement" as the natural disordered state.

This non-linearity is quite a good match to the non-linear behaviour of neural networks when modeled as an inner product followed by a limiting nonlinearity, without ever using any explicit computations. Simply concatenating a number of such vectors gives a resulting vector that represents anything on the scale of no -> disagreement -> yes. The size of an individual bit vector determines its weight in the vote:

  v_out = [ v_1 | ... | v_n ]

Let's call a single bit vector v \in {0,1}^n a "vote". Each vote has a weight n which is its dimensionality. Now, to make this manageable, the only real "computation" that has to happen is to reduce the weight of votes so they can be combined with other votes to form new ones.

The essential observation is that the distribution of a vote with weight n is qualitatively similar to the distribution of a vote with weight n-1. In other words, simply discarding one element of the bit vector on average has no qualitative effect. Let's call this "weight loss". The same reasoning goes for randomly flipping a bit. On average this has little effect, giving rise to good noise immunity.

This gives some form of computation (it can do anything feedforward neural networks do) in a statistical way. The next step is to make weight loss programmable. A vote should be able to inhibit another vote. The problem here is that it is a physical operation. Inhibition means to cut wires. Is it possible to do a non-physical inhibition? Yes: use an AND/OR gate to introduce asymmetry between yes and no. One vote could be combined with another vote (gate it).

So, what then is a nerve cell? It's where multiple wires come together and a random part of them is discarded. Forgetting is the essence of computation. ;)

So, given a (huge) set of binary nodes, there are only two operations:

- reduction: taking a subset of signals to create a node
- introduction of asymmetry (inhibition through AND|OR) = gating

[1] http://en.wikipedia.org/wiki/Partition_of_a_set
[2] http://en.wikipedia.org/wiki/Binomial_distribution

Entry: Computer Modern is Too Thin
Date: Mon May 11 18:11:07 CEST 2009

Not really so math-y but I have no other place for it.. Something that has been bugging me ever since I used LaTeX for the first time (1998?) is that Computer Modern is a very thin font. Looking around the web, I've found several people complaining about this.

On paper this problem isn't as pronounced because resolution is high enough, and black is black. But on screen, thin lines turn grey, and that is really annoying. It's easier to read black blobs than grey ones. On blurry low-quality paper, ink doesn't turn grey either.

So, how to fix this? The simplest way to fatten up a font is to use erosion (replace each pixel with the minimal pixel value in a 3x3 square). Alternatively, blur followed by increased contrast will also work.

So, why doesn't ink on paper turn grey when it spills out a bit? The reason is probably that what is diffused is pigment (amount of absorption or negative color) and not light intensity. Even with the pigment diluted, black is still very black.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int min(int a, int b) { return (a < b) ?
a : b; }

int main(int argc, char **argv) {
    int w=0, h=0, d=0;
    size_t size;
    unsigned char *buf;
    unsigned char *obuf;
    int l, c;
again:
    if (3 != fscanf(stdin, "P5\n%d %d\n%d\n", &w, &h, &d)) exit(0);
    printf("P5\n%d %d\n%d\n", w, h, d);
    size = w*h;
    buf = malloc(size);
    obuf = malloc(size);
    fread(buf, 1, size, stdin);
    memcpy(obuf, buf, size);  /* keep the 1-pixel border as-is */
    /* 3x3 erosion: replace each pixel with the minimum of its neighbourhood */
    for (l = 1; l < h-1; l++) {
        int line = l * w;
        for (c = 1; c < w-1; c++) {
            int l1 = buf[(line-w) + (c-1)];
            int l2 = buf[(line-w) + (c)];
            int l3 = buf[(line-w) + (c+1)];
            int l4 = buf[(line)   + (c-1)];
            int l5 = buf[(line)   + (c)];
            int l6 = buf[(line)   + (c+1)];
            int l7 = buf[(line+w) + (c-1)];
            int l8 = buf[(line+w) + (c)];
            int l9 = buf[(line+w) + (c+1)];
            obuf[line + c] =
                min(min(min(min(l1,l2), min(l3,l4)),
                        min(min(l5,l6), min(l7,l8))), l9);
        }
    }
    fwrite(obuf, 1, size, stdout);
    fflush(stdout);
    free(buf);
    free(obuf);
    goto again;
    return 0;
}

Let this operate on images generated by "pdftoppm -gray -r 300" before passing them to a DJVU converter. Don't do this on lower rez.

Entry: Throwing away the right information
Date: Tue Jun 2 14:10:46 CEST 2009

While the computation mechanism explained in the previous posts might be interesting as a mathematical toy, I'm not sure whether building something like that is practical. It requires a huge amount of connections. A human brain contains about 10^11 cells with on average 7x10^3 connections. In log10 that's 11 and 3.84, a ratio of about 2.9.

The thing is though, the architecture has failure built in. So manufacturing failure should probably not throw things off too much. It can be built with low yield, which might drastically reduce fabrication costs, to a point where it might be feasible in a home lab with something other than silicon.

Entry: Analog vs. Digital
Date: Tue Jun 2 14:24:28 CEST 2009

My original idea started with bringing back uniform error sensitivity to digital systems: in a digital system, it is possible to get error-free data transfer with an arbitrarily high probability. However, when an error does occur, it is usually fatal. This is in stark contrast with analog communication: errors will degrade the signal, but are far from fatal. The information is of a different kind: there is always some error but "we can live with it", and a little more error makes a little more annoyance, but no fatalities are introduced abruptly.

Now, is there something in between these things? Is it possible to use the signal re-generation property of digital systems with the graceful degradation observed in an analog system? In other words: contain errors locally, but make sure that noise that gets promoted to signal does not have a global effect.

This seems to be different from error detecting/correcting codes: these work well up to a certain noise level where they completely fail. This is more about representation of data. About what a _number_ really is. Voting has this property, but it also is extremely wasteful for representing "don't care".

TODO: make this a bit more formal and formulate the computation properties as continuous statistics of a discrete (limit->inf) system.

Entry: Counting votes.
Date: Sun Jun 7 15:06:40 CEST 2009

Representing a non-negative integer as a number of 1 bits in a bit vector allows the use of concatenation as addition. Let's investigate the effect of dropping 1 bit at random from a length n bit vector. More specifically, given the expected value of the number of 1 bits in a vector, calculate this value after dropping one bit at random.
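A quick sketch of that calculation: if the vector happens to contain k ones among its n bits, the dropped bit is a 1 with probability k/n, so the expected number of ones after the drop is

  k - k/n = k (n-1)/n

Taking the expectation over k then gives E(k') = E(k) (n-1)/n: the absolute count shrinks by the factor (n-1)/n, while the expected fraction of ones E(k)/n is unchanged, which is the "weight loss has no qualitative effect" property used in the previous posts.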
Entry: Sigma-Delta analog synth
Date: Tue Jun 9 11:13:03 CEST 2009

I think Sigma-Delta[1] modulation is a great idea. The idea behind this method is what triggered the previous posts. It has been part of my lingering background noise for a long time, influencing a lot of the ideas that went into the Sheep synth. This modulation scheme has an aesthetic appeal I can't really describe.

So, to get to know the modulation form a bit better, I'd like to build a modular analog synth using this representation for signals, by patching up a couple of PIC and discrete logic chips. However, this is probably more suited to FPGA implementation. Let's see what operations are supported:

* switch/mix: interleave two streams
* distorted amp: perform AND/OR on an unbiased signal
* inversion: XOR with a stream of known average
* differential: pulse->edge: amplitude to frequency conversion.

Some real-world hacks on top of this:

* A->D: PIC's 12-bit AD

Modulation

Instead of using higher order filters as in an SD modulator tailored to accommodate a low-noise band of arbitrary shape, we use an n-bit wrap-around accumulator as a discrete integrator. To convert an n-bit signal to this representation, simply accumulate it. This yields a "digital phasor" which rotates around 2^n states over time. Observing the carry bit of the accumulation as a binary signal yields a signal with the same average as the original signal.

The reason this encoding can be used as a representation of audio is that the error is small for low frequencies (where it matters). Almost all encoding errors are present in the higher frequency bands. Using higher order integrating filters will give better results if the objective is good quality audio. See [1] for more information. However, for us the simplicity of the representation is more important, since we want to use it as a computation substrate. Its connection to frequency modulation is of key importance.

Demodulation

Converting from a bit stream to a signal stream can be done using accumulation and subsampling. Some information is lost in the modulated form (high frequency spectral content) and when converting back to a stream of numbers, we'll lose some more.

Averages: 0, 1 and 1/2

There are 3 special signals to distinguish:

  signal    average
  0000...   0
  0101...   1/2
  1111...   1

The all-zero signal corresponds to a phasor that doesn't change. The alternating 0/1 signal corresponds to a phasor in a binary oscillation, which corresponds to an input signal of 2^(n-1). The latter however is a limit case and cannot be generated by the phasor mechanism. The maximum is a signal with a 0 bit once every 2^n 1 bits, reflecting the maximal input signal amplitude of 2^n-1. To simplify interfacing to the analog world, let's call 0101.. the reference signal (zero). This allows 0000.. to be the clip min and 1111.. to be the clip max amplitude.

Statistics

In order to calculate with SD modulated signals and prove some of their properties we need to clearly define how such binary sequences relate to sequences of real numbers. It is remarkable that magnitude is encoded in a single event stream. How to fish out the magnitude info? In first approximation we forget about time altogether and interpret a binary sequence as draws from a random variable.
This allows us to work with the expected value defined as:

  E(a) = \sum p(a=x) x

With x ranging over the values {0,1} this simply reduces to

  E(a) = p(a=1)

This mathematical model connects to the messy physical world by observing that a finite average of a sequence of a_i samples approximates E(a). Anything we say about the statistical property E(a) in some sense carries over to what the estimator produces. In practice this means that the sequences can represent time-varying low frequency signals by cutting off the integration at a particular low frequency.

Attenuation

The logical AND of two signals behaves as attenuation. A 0 in one signal can cancel the transmission of a 1 in the other. As long as this doesn't lead to cases where a disproportionate number of 1s is canceled due to correlation of the two bitstreams, the effect on the expected value is attenuation by an amount proportional to the expected value of the other signal.

Representing AND as real number multiplication on the set {0,1} allows the expression of the expected value of the product

  E(ab) = \sum p(a_x, b_y) a_x b_y

for events a_x and b_y ranging over {0,1}. Of these 4 terms only p(a=1,b=1) produces a nonzero contribution. This can be simplified to E(a) E(b) if the random variables are independent[3], meaning p(a_x, b_y) = p(a_x) p(b_y).

Note that "time slots" erased in a signal can be filled up with those from another one, which might be created through an AND operation in a similar manner. This then provides a means for implementing addition without the need for recoding. This naturally leads to considering the interpolation of two signals a and b, by a third signal x:

  OR(AND(x,a), AND(~x,b))

This fills up all available time slots by multiplexing the signals a and b proportionally to E(x). A special case of this happens when b = ~a, leading to the XOR (up to polarity) of the two signals:

  OR(AND(x,a), AND(~x,~a))

This interpolates between a signal and its polarity reversal.

Note that multiplication with factors >1 is not possible locally. The signal would need to be demodulated into a form where extra energy can be added, which would then result in this being spread out over time. However, it might be possible to construct some kind of buffered mechanism that merges two streams, ensuring conservation of energy.

Decorrelation

Signals coming from an SD modulator are very repetitive, which means that they can become correlated, breaking down multiplication. In order for this to work some form of decorrelation must be added. Every modulator could be equipped with a dither module, which would also improve the SNR.

An analogy (Thu Oct 19 22:21:22 CEST 2006)

Think of an SD-modulator as a bus stop where on each timestep (say 10 minutes) a bus either leaves or not. A bus only leaves if it's full, and the number of people that arrive in a single time step is never greater than the capacity of the bus. This way, the bus stop never needs to accommodate more than one bus worth of people. Once I got this picture in my head I couldn't get rid of it any more. It clearly illustrates the conservation properties of a bus stop, which implies that the SDM preserves averages and low frequency signals.

References

I've read some papers by Philips people about the SACD.. They included some operations that can be performed on an SACD stream without decoding it. What I remember is the use of interleaving to mix two streams. For other operations they always used (de / re) modulation around an ordinary n-bit representation.
I found this[2] on Chuck Moore's site, which is related but talks about analog domain VCOs.

[1] http://en.wikipedia.org/wiki/Delta-sigma_modulation
[2] http://colorforth.com/immunity.htm
[3] http://en.wikipedia.org/wiki/Statistical_independence

Entry: Symmetries of boolean functions
Date: Tue Jun 9 14:55:19 CEST 2009

Inspired by the idea from the Sheep synth that only XOR and AND are interesting ways of combining oscillators, here is some investigation of why that is.

There are 2^4 boolean functions that map 2 inputs to one. There are only a couple of interesting ones after you remove 0/1 morphism. In the following, the squares indicate their Karnaugh maps[1].

        | A
        | 0 1
    ----+------
    B 0 | . .
      1 | . .

Simply looking at the Karnaugh maps already suggests a number of equivalence classes:

  Select class: 1/3

    NAND   OR     =>     <=
    1 1    0 1    1 0    1 1
    1 0    1 1    1 1    0 1

    AND    NOR    N=>    N<=
    0 0    1 0    0 1    0 0
    0 1    0 0    0 0    1 0

  Transform class: 2/2 diag

    XOR    <=>
    0 1    1 0
    1 0    0 1

  Drop class: 2/2 non-diag

    A      B      !A     !B
    0 1    0 0    1 0    1 1
    0 1    1 1    1 0    0 0

  Ignore class: 4/0

    0      1
    0 0    1 1
    0 0    1 1

Each class is a set of boolean functions which is closed under inversion of one of the 2 inputs, or the output of the function. The classes can be constructed from a single representative f. With i_o, i_a, and i_b inversion or identity, the functions g(a,b) = i_o(f(i_a(a), i_b(b))) make up the class. The maximum number of elements is 2^(n+1), which occurs when all combinations of inversion and identity yield a distinct function.

Let's define the "interesting" functions to be the ones that belong to large equivalence classes. In this sense only AND and its symmetries are interesting. Note that this includes => (implication). Putting it as "only implication is interesting" makes it sound a bit more profound.

Instead of looking at inversions, it's also possible to look at input permutations. For the 2->1 case this seems to be a closed operation in all the classes (mirror around the diagonal). In fact, in the 3->1 case we need to include permutation symmetry.

So what about 3->1 functions? Is there a similar way to classify them as interesting/boring using the same classification (inversion + permutation)? There are 2^(2^3) = 256 such boolean functions. They can be visualized as 2x2x2 Karnaugh cubes or folded open to one side as a flat map:

        | AB       :
        | 00 01    : 11 10
    ----+----------:-------
    C 0 | .  .     : .  .
      1 | .  .     : .  .
                   : fold

For 3->1 functions one just needs to look at the morphisms of the 2x2x2 binary cube. With what we find from the 2->1 case, there are some things we can expect:

1. single select (3AND, 3OR, ...)
2. embeddings of the 2->1 functions
3. irreducible structures

Here 3. is expected to be an analogue of non-separable convolution kernels: increasing dimensionality introduces new structures that can't be written simply in terms of lower-dimensional ones.

Looking at the 2x2x2 binary cube and trying to fill it with a pattern that can be oriented in many different ways makes me think of an L-shape like this:

  x x . .
  x . . .

It can be placed in one of 8 corners, with each taking 3 possible orientations, leading to 24 symmetries. However, this one is "flat": it is independent of one of the inputs. Rotating one plane of the 2x2x2 cube 90 degrees with respect to another one (like you would do with a Rubik's cube) gives this configuration:

  . x x .
  x . . .

This is no longer flat. It is like an L that maximally occupies space in the cube. This turns out to be the function (class) that's used in Wolfram's rule 110[2].
These two functions are in distinct classes (their difference is _essential_, as the former doesn't use one of the inputs, while the latter does). However, they have the same number of symmetries: 8 x 3 x 2 = 48. This follows from 8 different positions (the corners of the cube), 3 different orientations to fix one of the other 3 points at, and the inversion of the pattern. This might not be a coincidence. Are they maybe lines and planes of a projective plane, thereby being dual?

For 3-input, binary, 1D CAs, Wolfram identifies 88[3] fundamentally different rules. Why are there so many? For CAs the order and polarity matter a great deal. Due to recursive application, there is little more than left-right symmetry (256->128) + some symmetry in degenerate cases.

Anyways. How to look for such "interesting" functions automatically? Let's look at the interpretation of functions in this class. After all, this is simple logic, so there is natural-language intuition to connect to.

     _   __   _   _
    AB   AB   AB  AB
  C  .    x    x   .
  C  x    .    .   .

  (A & B & C) | (~B & ~C) | ~(B | C)

Hmm not much there.. What practical predictions to make? Not more than hunches really. I suspect that a representative of this function will lead to an interesting modulation scheme for a 3-oscillator synth. This can probably be generalized to higher dimensional functions. Does giving up permutation give special status to the individual functions? This might also be usable in sound synthesis, where one of the modulators has a different function than the other sounds.

[1] http://en.wikipedia.org/wiki/Karnaugh_map
[2] http://mathworld.wolfram.com/Rule110.html
[3] http://www.wolframscience.com/nksonline/page-57-text?firstview=1

Entry: Music and almost integers
Date: Tue Jun 9 20:38:33 CEST 2009

An almost integer[1] is a number that is very close to an integer, but isn't an integer. These can be found in great number by applying trigonometric functions to integers, ending up close to integers.

  (cos 22) -> -0.9999608263946371 != -1

This particular one is related to the approximation of pi as the ratio 22/7. Similar things can happen with exponentials of ratios ending up close to other simpler ratios.

This phenomenon is very much related to tonal music, which is built on rational approximations to irrational numbers. Simultaneously sounding pure tones produce harmony, which can be described as regular beat patterns. This happens when frequency ratios are ratios of small integers. This is then tied into melody by employing the ratios that make sense for simultaneously sounding tones to space out frequencies of tones intended to be played sequentially, creating scales. Melody is evolution of tone frequency over time, where repetition and proximity play an important role. This makes melody and harmony combine nicely, yielding tonal music.

However, this picture only works approximately. Somehow the human brain likes this kind of non-exactness. Some intervals just aren't as pure as others, and this can be used as an expressive device. I.e. a perfect fifth[2] has a ratio of 3/2. In an equal tempered scale this is approximated by 2^(7/12). Applying this ratio 12 times produces the circle of fifths[3] and ends up exactly 7 octaves higher. However, in just intonation this isn't quite so:

  (define (pow n x)
    (if (zero?
        n)
        1
        (* x (pow (sub1 n) x))))

  (pow 12 3/2) -> 531441/4096 = 129.746337890625 != 2^7 = 128

This is solved by making some of the fifths that construct the scale less exact, favouring the fourth[4] (4/3), major third[5] (5/4) and minor third[6] (6/5) to build harmonically sounding chords.

[1] http://mathworld.wolfram.com/AlmostInteger.html
[2] http://en.wikipedia.org/wiki/Perfect_fifth
[3] http://en.wikipedia.org/wiki/Circle_of_fifths
[4] http://en.wikipedia.org/wiki/Perfect_fourth
[5] http://en.wikipedia.org/wiki/Major_third
[6] http://en.wikipedia.org/wiki/Minor_third

Entry: Filters in SD space.
Date: Wed Jun 10 12:46:49 CEST 2009

In [1] it was shown that Sigma-Delta modulated signals can be attenuated and mixed using local operations as long as the energy never increases, and as long as the signals remain independent. This post elaborates on the subject, bringing back the concept of time.

Filters

To implement filters I see two possibilities: continuous time filters (integrators) and discrete time filters (based on time delay). Time-delay filters are possible with the current representation as long as independence of time-shifted signals is maintained. However, this might not be so straightforward to accomplish. Following my hunch I think it's better to go for the idea of directly modeling integration as an operation on the SD bitstream.

Is it possible to re-introduce the concept of time? Note that until now we talked only about expected values of random variables, and ignored time completely, using some hand-waving that "short time averaging" turns the sequence back into an analog signal. With a concept of time it is probably possible to introduce the concept of integration, or at least that of a bandlimited integration (1-pole low pass filtering == an opamp), operating directly on an SD bitstream.

Integration

This operation is about re-introducing time in the binary sequence. Integration can no longer be expressed as the expected value of a stream, as that is a single number. Some clever way of turning the bit sequence back into a sequence of numbers (or a continuous function!) should lead to a way to express the integration operation. It seems to me that leaving the discrete domain as soon as possible is the way to approach the problem. The signal s(t) is hidden in the bitstream as follows:

  a(t) = \sum a_i p(t-i) = s(t) + n(t)

Where p(t) is a square pulse with height one, duration 1. What is the effect of integrating s(t) on a_i?

It might be better to approach this problem from the other side. What can we do operating on the bitstream a_i to produce something that behaves like integration in the low band? Is there a way to do this without recoding? The key problem is that in order to be able to generate a bitstream, a local representation of the accumulated state is necessary, to be able to use this number to space out pulses. At first sight it does look like there is no way around a demodulation/process/modulation scheme.

Additional Questions

* It looks like noise is going to be the essential element to translate between the discrete (bitstream) and the continuous (statistics of the bitstream).

* Is a square wave, edge triggered representation actually useful? Instead of sending out the carry bit as a binary signal, sending the carry bit as a state change generates a square wave frequency-encoded signal. This signal represents a full-scale square wave in the SD domain.

* Can we represent complex multiplication as a means to introduce sine and cosine?
  The answer seems to be yes as long as the magnitude is <1. Quantify this.

* Since decorrelation is so important, can't we just use (incompressible) information transmitted alongside the signal as an extra digital channel?

* Since we're digital and thus error-less, can we maybe construct differential equations based on differentiation instead of integration? Does this make sense at all?

* Maybe the only question to ask is: is it possible to implement differential equations using an S-D integrator and negative feedback? Is such a thing stable? Both in the digital domain (non-linear limit cycles) and the statistical domain.

[1] entry://20090609-111303

Entry: SD and edge-triggered representation
Date: Wed Jun 10 15:35:37 CEST 2009

What is nice about the SD form is the ease with which it can be integrated into mixed analog/digital electronics. In fact, it can be attached directly to a set of speakers, or any analog circuit with a low-pass characteristic. However, I wonder if moving to an edge-only representation will make things simpler. This will give a pure FM representation of a signal: value directly proportional to square wave frequency. It eliminates an element from the data representation: the width of a pulse. This is essentially an arbitrary component and can be easily reconstructed later.

The important question is: does it really make things simpler? Let's re-interpret the elementary operations explored in [1] in terms of square wave signals.

FM signals are trivially generated from an accumulator as the accumulator's MSB. Converting an FM signal to analog can be done in many ways [2], but the simplest ways seem to be to convert the edge signal back to a pulse signal, either unclocked using a one-shot, or clocked using a bit delay.

Since a signal's value is directly proportional to the density of transitions, addition is represented by the XOR operation. The advantage here is that addition is "dumb": there is no gating required as in the pulsed case. This seems simpler, but just as with the pulse representation, this might lead to loss of information due to aliasing. However, the form of aliasing is different and consists of glitches. Proper "de-glitching" needs some clocked logic, and can only be done exactly if there is room in the output signal (if signals don't change during the de-glitching pulse width). This brings us back to the complexity of dealing with bit slots as in the SD case.

Attenuation could be implemented statistically as "selectively forgetting" to toggle the switch. There isn't much difference here.

So, it looks like most operations can again be implemented using simple logic gates, conditional on the uncorrelated nature of the input signals. I'm not really convinced about this since it seems to be harder to qualify: the extra work that needs to be done in a clocked/pulsed representation seems to bring at least some order that is harder to reconstruct in the FM case. The FM case however is probably simpler to interface to special purpose (non-clocked) analog circuits.

Conclusion: the FM representation is in some respects simpler, but harder to control using the statistical time-multiplexing tricks described in [1]. Since conversion between the two formats is rather straightforward in clocked logic, it seems that sticking with SD is best. Then FM can be used when it is convenient, i.e. when interfacing to an unclocked or asynchronously clocked circuit.
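To make the relation between the two representations concrete, here is a minimal sketch (an 8-bit first-order accumulator driven by a constant input; the variable names and the little demo are made up):

#include <stdio.h>
#include <stdint.h>

/* The carry bit gives the Sigma-Delta pulse stream, the accumulator MSB
   gives the FM-style square wave. */
int main(void) {
    uint8_t acc = 0;
    uint8_t in = 64;                          /* constant input, 1/4 of full scale */
    char sd[17] = {0}, fm[17] = {0};
    for (int i = 0; i < 16; i++) {
        uint16_t sum = (uint16_t)acc + in;
        sd[i] = (sum >> 8) ? '1' : '0';       /* carry: one pulse per overflow      */
        acc = (uint8_t)sum;                   /* wrap-around ("digital phasor")     */
        fm[i] = (acc >> 7) ? '1' : '0';       /* MSB: square wave at ~(in/256)*fclk */
    }
    printf("SD carry: %s\n", sd);             /* average pulse density = 64/256     */
    printf("FM (MSB): %s\n", fm);
    return 0;
}

For in = 64 the carry stream has a pulse density of 1/4 and the MSB is a square wave at 1/4 of the clock rate, both encoding the same average.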
[1] entry://20090609-111303
[2] http://en.wikipedia.org/wiki/Demodulation#FM_radio

Entry: Using finite fields for music
Date: Wed Jun 10 18:57:46 CEST 2009

One of those ideas that disappeared into the background.. Let's try again.

A finite field[1] or Galois field is a field[2] with p^n elements, where p is a prime number. There is only one GF(p^n) up to isomorphism. The field GF(p^n) is the field of polynomials over GF(p), modulo an irreducible polynomial of degree n.

For music applications, it is probably possible to look at Galois fields in the same way as one would use the complex number field to represent sum-of-sines. Multiplicative cycles can be used as different rhythmic/harmonic components, and each element of the field could be associated to an instrument, or a combination of instruments.

For music we need the numbers 2, 3, 5 and 7 in copious quantities. Larger prime numbers are less interesting from a harmony/rhythm point of view. Given that we're looking for a certain combination of small prime numbers in the factorization of the multiplicative group, we can start typing things into a calculator and see if we end up at prime powers.

  2^a 3^b 5^c 7^d = p^n - 1

A more systematic way is probably in order, i.e. generating[5] all powers of primes below a certain number and investigating the factors of the multiplicative group order. However, to illustrate the idea let's just pick a few starting from the factorization. The first candidate is GF(211), which fits nicely in 8 bits.

  2 . 3 . 5 . 7 = 211 - 1

Another one that fits in 8 bits is GF(241).

  2^4 . 3 . 5 = 241 - 1

The question is: how to make this interesting? We can generate periodic sequences of numbers using formulas like:

  x_n = \sum a_i g_i^n

where g_i are generators of cyclic subgroups, and a_i are arbitrary field elements. The first question is: given a particular representation of a GF, how many such sequences exist?

Suppose we associate a certain configuration of drum sounds to each element in the field. How many patterns can be generated? Are they at all interesting? What about maps from the elements to a smaller number of drums? Maybe combinations of drums can be associated to terms in a polynomial. Is there an interesting way to combine a non-prime field with cycles this way?

Maybe the interesting stuff lies in the symmetries of a field[7]? I.e. start with a simple rhythm. Find a (small) field to embed it in, then find variations of this by looking at the field's automorphisms.

How can a GF be visualized? The structure of the additive group is quite straightforward: for GF(p^n) it is a product of n cycles of length p. How can the multiplicative group be shown on top of this? There's a visualization of GF(3^2) here[3]. Visualizing the group as actions on a set seems like a good idea. Looks like a nice application for graph visualization[4].

Intuitively it seems that groups with small p and n in the same order of magnitude will contain a more interesting structure. Instead of a single flat space (p^1) the structure is a vector space where each dimension has a separate entity. It allows the representation of instruments/notes (q) that can sound together in chords (n). I.e. 5 tones, 3 note-chords gives 5^3=15 with a multiplicative group of 14 = 2 . 7 elements. More concisely:

  order  factorization  add  mul
  ------------------------------
  15     5^3                 2 . 7
  21     7^3                 2 . 2 . 5

As a side note, maybe Galois Theory[7] is interesting to study in its own right, as it is connected to polynomials, which are, next to matrices, the bread and butter of numerical math.
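Coming back to the systematic search mentioned above, here is a rough brute-force sketch (it only scans prime moduli p, so GF(p) rather than general prime powers; everything here is illustrative):

#include <stdio.h>

/* Scan small primes p and report those where p-1 factors into 2,3,5,7 only,
   i.e. where the multiplicative group of GF(p) has a "musical" order. */
static int is_prime(int n) {
    if (n < 2) return 0;
    for (int d = 2; d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}

int main(void) {
    for (int p = 3; p <= 256; p++) {
        if (!is_prime(p)) continue;
        int m = p - 1;
        while (m % 2 == 0) m /= 2;
        while (m % 3 == 0) m /= 3;
        while (m % 5 == 0) m /= 5;
        while (m % 7 == 0) m /= 7;
        if (m == 1) printf("GF(%d): multiplicative group order %d\n", p, p - 1);
    }
    return 0;
}

Among others this reports GF(211) with group order 210 = 2 . 3 . 5 . 7 and GF(241) with group order 240 = 2^4 . 3 . 5, matching the factorizations above.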
[1] http://en.wikipedia.org/wiki/Finite_field
[2] http://en.wikipedia.org/wiki/Field_(mathematics)
[3] http://finitegeometry.org/sc/9/3x3.html
[4] http://en.wikipedia.org/wiki/Graph_drawing
[5] http://www.research.att.com/~njas/sequences/A000961
[6] http://en.wikipedia.org/wiki/Cyclic_group
[7] http://en.wikipedia.org/wiki/Galois_theory

Entry: Binary Sound Synthesis
Date: Wed Jun 17 19:22:43 CEST 2009
Type: tex

This is an account of a journey into ``Binary Sound Synthesis'' (BSS), which started early 2005 as a test application\footnote{The original synthesizer ran early 2005 on a PIC12F628, an 8 bit microcontroller with 6 i/o pins which I had configured to run at 1 MIPS. The current one has been running since mid 2006 on a PIC18F1220, an 8 bit micro with 18 pins and a slightly more powerful ISA. The original synth is a synchronous one running at $8$kHz. At each sampling point the output state of $3$ oscillators is determined. The control updates run at $200$Hz, while the note updates run at $8$Hz, which is $1/4$ note at $120$bpm. The current one uses 3 asynchronous hardware timers, thus a higher maximal rate of $2$MHz. In a nutshell, the synth contains 2 main algorithms: cross modulation (XOR mixer) and a formant synth (AND/OR mixer), plus an LFSR pseudo-random sequence generator. It uses 3 oscillators that can be synchronized. Most of the interesting parts are in the controller that reconfigures the oscillators.} for a Forth compiler for the 8--bit Microchip PIC architectures. By BSS is meant the process of generating audible sound from digital square--wave signals using an absolute minimum of logic or code.

This is old stuff. The digital approach dates from the era of early 8--bit game machines, and as such a lot of the algorithms are no longer used. Currently there are few reasons not to use floating or fixed point math with PCM outputs. The point of this paper is partly to archive and document old techniques, and to shine an idiosyncratic light on the matter. I find this a fascinating subject, probably because it is so different from the standard real and complex number based signal processing. In BSS the main focus is on the time component, since it is the timing between switching a speaker on and off which becomes the only means of controlling the sound.

\section{Cycles}

The simplest sounds we can produce are based on periodic square waveforms. An oscillator can be implemented using a frequency divider or pulse counter, which generates one output event for each $k$ input events. Multiple such oscillators can then be connected to a hardware timer interrupt. Mathematically, this can be related to addition in the ring of integers modulo $k$. We'll denote this algebraic structure as $\ZZ_k$, the cyclic group of order $k$. This section talks about such groups, as they will appear as multiplicative groups of Galois Fields.

\subsection{Counters and Cyclic Groups}

Cyclic groups are groups, necessarily abelian, that can be generated by a single element. In our canonical representation, this element will be $1$ in $\ZZ_k$. If $k+1$ is a power of a prime, it is the multiplicative group of the Galois Field $\GF(k+1)$, which we will encounter later. All finite abelian groups are direct products of cyclic groups of prime power order, but not all abelian groups are cyclic. The prototypical example being the Klein $4$--group $\ZZ_2 \times \ZZ_2$.

The exact relation between a group and a counter is as follows. We use additive notation because all the groups are abelian.
To each counter we associate a group $\G$, a generator $g\in\G$, and a state $s\in\G$. On each input event, the state $s$ will be replaced with $s+g$. Whenever the state reaches $s=0$, the unit element of the group, an output event is generated. The division factor is the order of $g$, which is the smallest integer $o$ for which $\underbrace{g+g+\ldots+g}_{o \text{ times}} = 0$. For example in $\ZZ_{6}$ where $\{0,6,12,\ldots\}$ denote the same element, the generator $1$ has order $6$, the generator $2$ has order $3$ and the generator $3$ has order $2$.

\subsection{Composite Numbers}

Groups with prime order are not so interesting to us; they generate only one kind of cycle. What interests us most is the combination of timers. Let's have a look at the example of highly degenerate groups, which have a number of elements equal to a composite number like the number of seconds in a week $604800 = 2^7 3^3 5^2 7$.

Note that from the size of an arbitrary group we can't necessarily deduce its structure. However, requiring that the group is cyclic is enough, since there's only one of a given size. This makes cyclic groups look a lot like positive integers. A cyclic group can be constructed explicitly as a product of cyclic groups of prime power order, ensuring the component groups have orders that are mutually coprime. This is again analogous to how positive integers behave. A point of difference, however, is that groups of prime power order are not a product of smaller cycles, but they do contain all smaller prime power cyclic groups as subgroups.

The number above gives the group $$\ZZ_{2^7} \times \ZZ_{3^3} \times \ZZ_{5^2} \times \ZZ_{7},$$ which consists of $4$--tuples with elements from the respective groups. This group is essentially the same as $\ZZ_{2^7 3^3 5^2 7}$. I'll denote the group order as $$\#\G=\prod_{n=1}^{N} p_n ^ {m_n},$$ and $\G$ as the abstract group, where $\ZZ_{\#\G}$ and $\times_{n=1}^{N} \ZZ_{p_n^{m_n}}$ are two isomorphic representations.

In our representation of $\G$, the cyclic subgroup of order $p_n^{m_n}$ in $\G$ can be generated from the element $$g_n = {\#\G \over p_n^{m_n}},$$ giving a homomorphism from $\G$ to $\ZZ_{p_n^{m_n}}$. Combining $N$ such homomorphisms allows the construction of an isomorphism $$f : \times_n \ZZ_{p_n^{m_n}} \to \ZZ_{\#\G} : (x_1, \ldots, x_N) \mapsto \sum_n x_n g_n.$$ This leads us to the more concrete observation that an oscillator based on $\G$ can be implemented either as a single counter $\ZZ_{\#\G}$, using a single generator, or a ``wired or'' of $N$ counters $\ZZ_{p_n^{m_n}}$, each with its own generator, where the whole acts as a single frequency divider which only generates an event if all counters have $s_n=0$.

\subsection{Musical Scale}

This makes one wonder if it's not possible to construct a group that is very degenerate, in such a way that it produces a relatively well spread out ``frequency spectrum'' which maps well to the (logarithmic) frequency resolution of human hearing. Keeping the number of large prime factors small, it might even be possible to create some kind of musical scale with inherent ``smoothness''. Probably, if there are enough small primes, large prime numbers are not really necessary since frequency resolution for large numbers is less of a problem. For example $d = 604800 = 2^7 3^3 5^2 7$, which is the number of seconds in a week.
Moreover, $d+1$ is prime, so it has an associated Galois field, but this is no longer a polynomial field. In order to investigate the usefulness of this, we could plot out the possible frequencies that can be generated using this scheme on a logarithmic scale.

On the other hand, using fields might be overkill. Using just cyclic groups (counters) might be better here. Given a degenerate cyclic group $\G$, how many different (cyclic) subgroups does it have? All subgroups of cyclic groups are cyclic, so there is a one to one correspondence between all possible combinations of subgroups of $\G$ and the combinations of prime factors of $\#\G$. This leads us to the number of possible subgroups $$\prod_{n=1}^{N} (m_n+1),$$ since the subgroups of $\ZZ_{p^m}$ are $\{\ZZ_{1}, \ZZ_{p}, \ldots, \ZZ_{p^m}\}$, which totals $m+1$.

The subgroups, or numbers representing the orders, can be arranged in an $N$ dimensional cuboid, addressed with coordinates $(x_1,\ldots,x_N)$ where each coordinate is limited to $0 \leq x_n \leq m_n$. The order can be computed as $$\#\G(X) = \#\G(x_1, \ldots, x_N) = \prod^N p_n^{x_n},$$ which corresponds to the subgroups $\G(X) = \times^N \ZZ_{p_n^{x_n}}$ of $\G$. The thing we are interested in is the distribution of $\#\G(X)^{-1}$, since it represents the frequencies we can generate by dividing a master clock, which would be the CPU clock for example. All frequencies that can be generated in this way have a fairly simple harmonic relationship, making it possible to generate smooth scales and chords.

\subsection{Playing with Subgroups of $\ZZ_{p^m}$}

Given a divider $\ZZ_{2^m}$, obtaining one for $\ZZ_{2^{m'}}$ with $m' < m$ exponential
% what about matrices over GF(2)? what's to find there?
% most theorems in linear algebra / projective geometry
% exclude GF(2) as a base field, because it behaves so differently.
%% \subsection{Matrix Representation}
%% Over $\GF(2)$ polynomials can be represented by (infinite) matrices,
%% while polynomial fields can be represented by (finite) matrices.

\subsection{Application : Linear Noise Generators}

The cheapest way to generate noise for the application at hand is to use exactly the impulse response of linear systems described by difference equations as explained above. If the period of the bit sequence is large enough, the human auditory pattern detection will not be able to recover the redundancy.

Following the same algebraic trick as exposed above, suppose the difference equation is expressed by $p(x)y(x) = u(x)$ with $u(x) = 1$, corresponding to the impulse sequence $(1000\ldots)$. What we need to do is find out which period is related to $p(x)$. In other words, we have to find the $s_p(x)$ with the lowest possible degree such that $${1 \over p(x)} = {s_p(x) \over x^P + 1} = s_p(x) \sum_n x^{Pn} ,$$ where $P$ is the period of the bit sequence corresponding to $p(x)^{-1}$. This is equivalent to the congruence relation $$x^P = 1 \mod p(x).$$ To make the smallest $P$ satisfying this equation as large as possible, it is necessary to choose $p(x)$ to be a \emph{primitive} polynomial, which means it is irreducible over $\GF(2)[x]$, and $x^n\mod p(x)$ generates all nonzero elements in the field modulo $p(x)$, also denoted as $\GF(2^n)$ with $n$ the degree of $p$.

%% In the quotient field $\GF(2)[x] / (p(x))$, the
%% element $x + (p(x))$ generates the entire multiplicative group. Here
%% $(p(x))$ denotes the ideal in $\GF(2)[x]$ generated by $p(x)$.
%% The multiplication is commutative, so it is always possible to construct
%% such a generator. Whether this generator is mapped to $x$ depends on
%% the polynomial $p(x)$.

The direct implementation of the difference equation is called a \emph{Linear feedback shift register} or LFSR. The next output is computed as the sum of a certain number of taps. I originally used a $16$bit LFSR with taps $(15,14,12,3)$, but the parallel variant is simpler to implement given that a parallel XOR operation is available. This parallel variant is related more directly to the Galois field $\GF(2^n)$. More specifically, the output of the generator is taken to be any coefficient of the polynomial sequence generated by $x$, or $\sum_k s_{k,n}x^k = s(x)_n = x^n \mod p(x)$, where the output sequence could be $s_{0,n}$.

To illustrate the difference between an irreducible and a primitive polynomial, take the irreducible polynomials $x^8+x^4+x^3+x^2+1$ and $x^8+x^4+x^3+x+1$. Both are irreducible, but only the former is primitive. In the field of polynomials modulo the latter one, $x+1$ is a generator, but not $x$. Note that these fields are isomorphic, or essentially equal, but their representations are different.

Implementing the parallel scheme for the former polynomial on a digital computer is straightforward. Multiplication of the state polynomial $s(x)$ by $x$ corresponds to a left shift of the register containing the coefficient bits. The bit that's shifted off (the coefficient of $x^8$) is replaced by adding the other terms $x^4+x^3+x^2+1$ to the current state. In other words, when a bit carries over, \verb+1Dh+ $=$ \verb+00011101b+ is XORed with the state.

%% \subsection{Using a multiplier}
%% When a multiplier is present, a lot more interesting things can be
%% done. One of them is the use of the linear congruential method for
%% random number generation.
%% \section{Resurrection}
%% Digging on the web.

\section{Other Discrete Structures}

%\subsection{Sampling and Formants}
% \subsection{Dyadic Numbers}

\subsection{Phase and Frequency Modulation}

What happens when we modulate the generators of an additive subgroup, or the period of counters? Let's call the first \emph{phase} modulation and the second \emph{frequency} modulation. Using an oscillator with period $M$ and generator $1$ to generate the generator of a second oscillator $F$ gives the following function $$F_n = \sum_{i=0}^n i = {n(n+1) \over 2}\mod F.$$

\subsection{Cellular Automata}

Another interesting way to generate signals is by using cellular automata. Using the ideas above, we could represent a finite row of nonzero cells as a polynomial, and the update function as another polynomial, giving rise to an update function in the ring $\GF(2)[x]$ or any finite field $\GF(2)[x] / (p(x))$.

\subsection{The NTT and the Fields $\GF(2^{2^n}+1)$}

For $0 \leq k \leq 4$ the Fermat numbers $F_k = 2^{2^k}+1$ are prime, which means there exists a prime field $\F_k = \GF(F_k)$. This field is isomorphic to $\mathbb{Z} / F_k \mathbb{Z}$, the integers modulo $F_k$. Implementing this representation requires the implementation of a modulo operation. This can be quite expensive, especially on cheap hardware lacking a fast hardware divider.
However, it is possible to embed a representation of $\F_k$ in the ring $\R_{k+1}$, where we take $\R_k$ to be the ring represented by $\mathbb{Z} / M_{k} \mathbb{Z}$, the integers modulo $M_{k} = 2^{2^k}-1$, where the $\{M_k : k \in \mathbb{N}\}$ form a subset of the set of Mersenne numbers $\{2^n-1 : n \in \mathbb{N}\}$.

The implementation of this ring relies on the reduction modulo $M_k$, which is a cheap operation if $K=2^k$ is (a multiple of) the machine wordsize. Suppose we want to reduce a number $[c:a]$ of $2K$ bits modulo $M_k$, represented by the concatenation of two $K$ bit words $c$ and $a$. This gives \begin{align*} [c:a] &= 2^Kc+a \\ &= (2^K-1)c+c+a \\ &= c+a \end{align*} with operations modulo $M_k$. The consequence of this is that both addition and multiplication in $\R_{k}$ can be performed by the default finite wordlength unsigned integer operations, followed by this very simple reduction step. This reduction is just addition due to the availability of $c$ as a separately addressable entity, namely the carry bit for unsigned addition and the high word for unsigned multiplication.

So how to embed $\F_k$ in $\R_{k+1}$? The map $$f_k : \R_{k+1} \to (2^{2^k} - 1)\R_{k+1} : x \mapsto x 2^{2^k - 1} (2^{2^k}-1) = x 1_k$$ will do just that. Here $$(2^{2^k} - 1)\R_{k+1} = \{(2^{2^k} - 1)m : m \in \R_{k+1}\},$$ which we will call $I_k$, is the ideal generated by $2^{2^k} - 1$ in the ring $\R_{k+1}$. The element $$1_k = 2^{2^k - 1}(2^{2^k}-1)$$ is the representative in $\R_{k+1}$ of the multiplicative unit of $\F_k$. It can be used to generate the additive group, which gives us a clear relation between integers and the representation in $\R_{k+1}$ of their associated elements in $\F_k$.

So how can we see this is a representation of $\F_k$? It is easy to see that $f_k$ is a group homomorphism for the additive group of $\R_{k+1}$ since $f_k(a+b) = (a+b)1_k = a1_k + b1_k = f_k(a) + f_k(b)$. To see that $f_k$ is a ring homomorphism, it remains to show that $f_k$ is a group homomorphism for the multiplicative group. We need $f_k(a) f_k(b) = a b (1_k)^2 = a b 1_k = f_k(ab)$, which is valid if $1_k^2 = 1_k$. That this is so can easily be verified.

%% This is so since
%% \begin{align*}
%% 1_k^2
%% &= (2^{2^k - 1} (2^{2^k}-1))^2 \\
%% &= (2^{2^{k+1} - 2}) (2^{2^{k+1}} - 2^{2^k + 1} + 1) \\
%% &= (2^{2^{k+1} - 2}) (- 2^{2^k + 1} + 2) \\
%% &= 2^{2^k} (2^{2^{k} - 2}) 2 (-2^{2^k} + 1) \\
%% &= (2^{2^{k} - 1}) (-2^{2^{k+1}} + 2^{2^k}) \\
%% &= (2^{2^{k} - 1}) (-1 + 2^{2^k}) \\
%% &= 1_k
%% \end{align*}
%% with all operations in $\R_{k+1}$. Note there are no divisions here,
%% only multiplication and modulo reduction using $2^{2^{k+1}} = 1$.

The kernel $\{x : f_k(x)=0\}$ of this ring homomorphism is $\{xF_k : x\in\R_{k+1}\}$, which is a maximal ideal of $\R_{k+1}$, which means that $\R_{k+1} / \{xF_k\}$ is a field. This structure is carried into the ideal $M_k\R_{k+1}$ by $f$. To find this representation by construction see \verb+mersenne.tex+, which uses arguments to identify $1_k$ in the additive subgroup generated by $M_k$, by requiring the property $1_k^j = 1_k$ for $0 \leq j \leq F_k$.

\subsection{Shape of $1_k$}

Returning to the embeddings of $\F_{k}$ in $\R_{k+1}$, the interesting thing is what $1_k$ looks like exactly. \begin{align*} 1_0 &= \0\1 \\ 1_1 &= \0\1\1\0 \\ 1_2 &= \0\1\1\1.\1\0\0\0 \\ 1_3 &= \0\1\1\1.\1\1\1\1.\1\0\0\0.\0\0\0\0 \\ 1_4 &= \0\1\1\1.\1\1\1\1.\1\1\1\1.\1\1\1\1.\1\0\0\0.\0\0\0\0.\0\0\0\0.\0\0\0\0 \end{align*} Here the dots just separate nibbles. This looks like a nice square wave!
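As a quick machine check of the $1_k^2 = 1_k$ claim, here is a small C sketch for $k=2$ (so $K=4$ and $\R_3$ is the integers modulo $255$); the helper names are made up, and the reduction uses only the fold-the-carry step described above:

\begin{verbatim}
#include <stdio.h>
#include <stdint.h>

/* Reduce a 16 bit value modulo 2^8 - 1 = 255 using the [c:a] = c + a trick. */
static uint8_t reduce255(uint16_t x) {
    uint16_t r = (x >> 8) + (x & 0xFF);   /* fold high word into low word    */
    r = (r >> 8) + (r & 0xFF);            /* fold a possible carry once more */
    return (r == 255) ? 0 : (uint8_t)r;   /* 255 and 0 are the same element  */
}

static uint8_t mul255(uint8_t a, uint8_t b) {
    return reduce255((uint16_t)a * b);
}

int main(void) {
    uint8_t one_k = 0x78;   /* 1_2 = 0111.1000 = 2^3 (2^4 - 1) = 120 */
    printf("1_k^2 = 0x%02x, expect 0x78\n", mul255(one_k, one_k));
    return 0;
}
\end{verbatim}

The same folding reduction is what makes addition and multiplication in $\R_k$ cheap on hardware without a divider.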
Another interesting thing to see is that in $\R_{k+1}$, there is only one silence, which is all zero, since it is the same element as all one. What do the other numbers in the ideal $M_k \R_{k+1}$ look like? The redundancy is $M_k : 1$, which gets large quite fast. Are the waveforms special in some way so we can use them as sounds?

\subsection{Resonant Filter}

Using a two state output, there is really no straightforward way to add two signals to perform additive synthesis. Composing a signal consisting of different frequencies by adding them together modulo 2 behaves more like cross modulation than addition. The audible spectrum will contain not only the sum, but also the modulation components. However, it is possible to approximate $2$ tone polyphony by synthesizing waveforms that have two clearly distinct harmonic components.

The approach is based on the following observation. The sequence $$\1\0\1\0\1\0\0\0\0\0\0\0\1\0\1\0\1\0\0\0\0\0\0\0 \ldots$$ resembles the response of a resonant filter with period $2$ and time constant $6$, to an impulse signal with period $12$, while the sequence $$\1\0\1\0\1\0\1\1\1\1\1\1\0\1\0\1\0\1\0\0\0\0\0\0 \ldots$$ resembles the response of the same filter to a square wave with period $24$. It is even so that this particular scenario can be expressed in terms of generating functions, when the resonant filter is represented by a (finite) polynomial, stretching the analogy even further.

This system has $3$ degrees of freedom: the periodicity, the resonance frequency and the time constant. This can be implemented as the product (AND) of a pulse width modulated low frequency oscillator, and a high frequency oscillator which has its phase reset for each $0 \to 1$ transition of the low frequency signal. It is probably possible to combine two of those to get two formants, which is enough to encode very basic vowel spectra. This requires 3 timers, which fits perfectly in the available hardware.

% distorted walky talky ??

\subsection{Chaotic Oscillator}

An emulation of the qualitative behaviour of a chaotic oscillator can be obtained by driving the Resonant Filter system described above with a base period that is irregular.

\subsection{Pulse Width Modulation}

Now, this is sort of cheating, since the purpose of PWM is really efficient switched digital to analog conversion. Using it, we give up the ``true'' binary output, and add a layer that will allow us to vary the voltage (on average) between more than 2 levels. The \verb+18f1220+ chips contain a hardware PWM module with $8$ bit resolution ($10$ bit for the enhanced version). A fixed period oscillator is used as the timing source to drive a pin high and low with a programmable duty cycle.

I do wonder whether it's possible or desirable to do some computations in $\GF(257)$. I don't see much however, other than convolution using the NTT.

\def\sd{\Sigma\Delta}
\subsection{$\sd$ Modulation}

Note that if switching efficiency isn't all important, $\sd$ modulation gives much better performance for 1--bit audio digital to analog conversion. It generates higher frequency waveforms, but better low frequency resolution. It can also be used as a cheap way to do synthesis, since it behaves approximately as a reverse frequency to voltage converter: output pulse frequency is on average proportional to the input value. With multibit digital input, it is very easy to implement: the binary $\sd$ signal is the carry bit of a simple input accumulator.
When the accumulator overflows, one ``packet'' of energy is transferred to the output. \subsection{Gray Code} Gray codes are (cyclic) sequences of $N$ binary digits, where each number in the sequence differs only by one bit from the previous. They can be understood as \emph{Hamiltonian paths} on a $N$--dimensional hypercube, where one moves along edges of the cube. However, the most commonly used gray code is one with a particular kind of symmetry. It is called the \emph{binary reflected} grey code or BRGC, which is a recursive scheme that traverses both $N-1$ dimensional halves of the hypercube in the opposite direction. This rule is applied recursively and terminates at the trivial $1$ dimensional case. If $\bar{s}$ is the reversal of the sequence $s$, $s : t$ is the concatenation of sequences $s$ and $t$, and $b(s)$ prepends each bit vector in the sequence $s$ with bit $b$, the BRGC sequence recursion relation is $$s_{n+1} = 0(s_n) : 1(\bar{s}_n),$$ where $s_0$ is the length $1$ sequence containing the empty vector. % the wikipedia page about this is unclear, since a 4D, 16 cycle coil is a BRGC.. maybe it's the ``constrained'' part? %% Gray codes are related to the \emph{snake in a box} problem, ore more %% specifically, the \emph{coil in a box} problem. % interesting: single track codes \section{Control and Note Sequencing} This is a whole other ballpark. With \emph{control} is meant the evolution of the configuration of the synth engine at a slower rate. This is called \emph{haptic} rate and is related to our ability to experience mechanic vibrations. The next rate down is \emph{note} rate, and roughly corresponds to the frequency of events that can be experienced visually. This cascade of time scales is called the \emph{frequency hierarchy}. The lower frequency parts are much less limited by the speed and simplicity of the hardware, since more time can be used to do actual computations. This makes it rather pointless to try to exhaust the possibilities. However, here are some general ideas. \subsection{Transient Stack} In a binary synth there is no direct way to mix sounds. One way of bringing discrete events to the forground is by temporarily interrupting stable background sounds. The typical example would be to synthesize a sequence with \emph{bass}, \emph{bassdrum}, \emph{snare} and \emph{hihat}, which are ordered here left to right with time extent decreasing. Mixing the 4 will boil down to playing the shortest sound until it is done. \subsection{Wavetable} Since there's a wavetable player, it is possible to do some control rate timbre synthesis by generating or updating wavetables. There are a lot of possibilities here. Maybe the most interesting parts are transform domain methods like the \emph{Walsh--Hadamard} transform. \subsection{Chirps} A simple form of frequency modulation using a linear ramp or a sawtooth signal is easily implemented. This can be used to generate drum-like sounds. \section{Design and Implementation} \subsection{Frequency Hierarchy} Human perception already seems to have a built in time hierarchy roughly ordered from high to low as \emph{hearing}, \emph{feeling}, \emph{seeing}, \emph{moving}, \emph{remembering}. This is just a vague description, since a lot of these time scales overlap. However, this idea of time hierarchy can easily be reflected in the way we keep time in the synth, using the concepts of \emph{sample}, \emph{control} and \emph{note} frequency. This is fairly standard practice. 
We're going a step further in that each of these levels has an associated language, which metaprograms the previous one. This is possible since efficiency can be traded for expressive power as we move up the time scale ladder.
\begin{itemize}
\item[N)] Note state updates switch control engines.
\item[C)] Control state updates switch synth engines, synth parameters and signal routing.
\item[S)] Synth updates produce waveform samples.
\end{itemize}
It's probably easiest to have the low levels pre--empt the higher ones, so computations can be spread out without explicit cooperative multitasking. There's no reason to not have cooperative multitasking on the high abstraction layer though.
%% The first layer's language is plain data: all synth engines are fixed,
%% only the current routine and engine, and its parameters, can be
%% changed. The second layer's language is threaded forth. Current
%% control word is any forth word, but actual code is still in flash
%% memory. The third layer's language is a music language.
\subsection{Synth Architectures}
\begin{itemize}
\item An asynchronous design using the $3$ high resolution timers to generate events, and the low resolution timer to drive the noise generator and the other time bases.
\item A synchronous design with a software synth engine running at a fixed sample rate, from which all other time scales can be derived.
\item A similar synchronous design, but using $10$ bit hardware PWM as output.
\end{itemize}
\subsection{Synth Patch}
The idea is to make the synth as configurable as possible, while keeping the configurability contained in a small number of bits so the transient stack can be implemented efficiently. The way to do this is to eliminate all redundancy (symmetry) in the configuration data. Also the decoding of this information should be very straightforward, since it has to be computed at each synchronous or asynchronous engine event. I'll call the total configuration state, not including the engine state, the \emph{synth patch}.

The synth engine consists of 4 \emph{generators}: one noise generator \verb+[noise]+ and 3 square wave oscillators \verb+[osc0]+, \verb+[osc1]+ and \verb+[osc2]+. The oscillators and the noise generator each have an associated update period, which is part of the patch data. Their state is not. These oscillators can be synced to each other as single master sync $0 \to 1,2$ or cascaded sync $0 \to 1 \to 2$. This is encoded as one bit for \verb+[osc1]+ and two for \verb+[osc2]+, leaving room for one external sync event.

I'd like to parametrize the output combination of the 3 square wave oscillators and the noise generator using a limited set of boolean functions. For $4$ inputs, there are $2^{2^4}=2^{16}$ such functions. To see this, draw the Karnaugh map of a $4$ input function, and observe it has $16$ squares in total. Some things that could be divided out are: the polarity of the output, the polarity of the noise signal and the polarity of each of the input square waves. Working from the other side, the necessary combinations to get the absolute minimum functionality are
\begin{itemize}
\item \verb+[xmod]+ Add (XOR) the outputs of the 3 square wave oscillators. This can be used to generate PWM.
\item \verb+[reso]+ Multiply (AND) \verb+[osc1]+ and \verb+[osc2]+, and add (XOR) with \verb+[osc0]+.
\item \verb+[gate]+ Multiply (AND) the outputs of the 3 square wave oscillators.
\end{itemize}
% is that actually true?
\subsection{Synth Algorithms}
Currently there are only $3$:
Square waves, pseudorandom noise and sample playback. Together with intermodulation schemes described above, this gives already quite a rich palette. \subsection{Misc Hacks} So evaluate $AND$ of a number of bits in a byte, mask out all the bits that are not necessary, and use the $ADD$ operation. This can also be used to evaluate expressions of the form $f = a + bc$, which occur in the \verb+[reso]+ mixer. Here I combine 3 waveforms as $f = a + bc$. Using ordinary adder logic this gives $(a,b,c) + (0,0,1) = (f,x,x)$. Adding more bits allows to propagate the result into the carry flag. Rotating oscillator period's bit rep. \section{Electronics} I think a lot can be done by combining analog and digital electronics. One of the prototypical examples is of course the analog time varying reso filter. Some other interesting things could be done by using comparator based chaotic oscillators (switched unstable systems) and time constant capturing. Entry: LFSRs Date: Thu Jun 18 11:27:53 CEST 2009 Type: tex Looking at how linear feedback shift registers (LFSRs) are implemented, one can notice that the tap coefficients feeding into the XOR have to be from an irreducible polynomial in the polynomial ring $\GF(2)[x]$. Why is this? Suppose we have polynomials modulo $x^2+x+1$. These polynomials form the Galois field $\GF(2^2)$. An LFSR shifts, which means it takes a polynomial $p(x)=p_0+p_1x$ representing the current $2$--bit state vector, and multiplies it by $x$. After shifting, the highest order term $p_1x^2$ needs to be folded back into the representation by eliminating $x^2$ using the equation $x^2 + x + 1= 0$, which follows from modulo arithmetic. Hence, LFSR is shift followed by conditional XOR, which means addition of $x+1$ if $p_1 = 1$. So, the shift represents multiplication by the generator polynomial $x$ which produces a sequence of all elements of the field's cyclic multiplicative group. The XOR computes the modulo operation after multiplication, and is there to keep the representation contained in the minimal amount of bits. %[1] http://en.wikipedia.org/wiki/LFSR Entry: Topics in Algebra Date: Thu Jun 18 15:57:54 CEST 2009 Type: tex This is a cheat sheet summary of ``Topics in Algebra'', by I. N. Herstein. \section{Groups} \begin{itemize} \item {[Order]} The order $o(G)$ of a (finite) group $G$ is the number of elements it contains. The order $o(a)$ of an element $a$ is the least positive integer $m$ such that $a^m = 1$. \item{[Right Coset]} If $H$ is a subgroup of $G$, we call the distinct sets $Hg$ with $g \in G$ the right cosets of $H$ in $G$. The subgroup $H$ together with its right cosets $Hg$ is a partition of $G$. \item{[Lagrange's Theorem]} For $H$ a subgroup of $G$, $i = o(G)/o(H) \in \mathbb{N}$. This is called the \emph{index}, or the number of right cosets. \item{[Normal Subgroup]} $gNg^{-1} = N$, $g \in G$. The right cosets of $N$ are also left cosets, since $gN = Ng$. \item{[Quotient group]} is the group formed by all the right cosets of a normal subgroup $N$ in $G$, denoted by $G/N$. \item{[Homomorphism]} The kernel of a homomorphism is a normal subgroup. Likewize, a normal subgroup defines a homomorphism. For a homomorphism of $G$ onto $\bar{G}$ with kernel $K$ we have $G/K \approx \bar{G}$. \item{[Cayley's Theorem]} Every group is isomorphic to a subgroup of the automorphism group $A(S)$, for some set $S$. \end{itemize} The key feature of normal subgroups, $Ng=gN$, enables the introduction of a group operation for the partition of cosets, because $NaNb = NNab = Nab$. 
% commutative groups: cycles
\section{Rings}
\begin{itemize}
\item {[Ideal]} An additive subgroup invariant under multiplication.
\item {[Quotient ring]} The partition of additive cosets $R/U$ of an ideal $U$ has the structure of a ring, and is a homomorphic image of $R$.
\item{[Maximal ideal]} is an ideal $M$ yielding a field as quotient ring $R/M$ for a commutative ring $R$.
\end{itemize}
The additive subgroup is always commutative, so the additive subgroup is trivially normal. The quotient for groups involves two groups, but for rings it involves a ring and an ideal. If the only ideals are $R$ and $\{0\}$, then the ring $R$ can no longer be simplified by applying homomorphisms and has the structure of a field.
\section{Fields}
\begin{itemize}
\item{[Irreducible polynomial]} Adjoining a root of an irreducible polynomial $p(x) \in F[x]$ of degree $n$ gives an extension $E$ of degree $n$, or $[E:F] = n$. As a corollary, the maximum degree of a splitting field of any polynomial of degree $n$ is $n!$, where the worst case is reached if after each extension step the polynomial gains only one root, so the remaining factor is again irreducible.
\item{[Unique splitting field]} The basis of Galois theory lies in the theorem that relates the splitting fields $E$ and $E'$ of the isomorphic polynomials $f(x)$ and $f'(t)$, from two isomorphic polynomial rings $F[x]$ and $F[t]$, where the isomorphism leaves the field $F$ invariant. The splitting fields $E$ and $E'$ are isomorphic, with an isomorphism leaving $F$ invariant. Applying this to different splitting fields of the same polynomial, we find the splitting field's structure is unique.
\end{itemize}

Entry: Bandlimited Discontinuities with Infinite Response
Date: Thu Jun 18 16:01:18 CEST 2009
Type: tex

This post contains a paper about bandlimited synthesis, coming from an idea that sprouted somewhere in the summer of 2002, when I was re-iterating over basic DSP math involving IIR and FIR filters and the Z-transform. The remaining problems are the evaluation of exponential functions (complexity and precision), and the error analysis, which is important in this case as the algorithm relies on cancelling, a technique which is generally avoided as it is noise-sensitive. However, since the framework is quite controlled, there might be some shortcuts to derive bounds.

\begin{abstract}
In this paper I discuss the use of a bank of damped sinusoidal oscillators to synthesize the audible band of wideband signals obtained as higher order integrals of impulse functions, i.e. step and ramp functions. The usual approach is to use FIR polyphase filtering. I show in this paper that using the corresponding IIR filtering scheme is possible due to the high amount of redundancy of the input data, which allows the use of analytical methods to construct a specialized algorithm.
\end{abstract}

\section{Introduction}
Why are Infinite Impulse Response (IIR) filters never used in sample rate converters? The answer is fairly simple. Finite Impulse Response (FIR) filters depend only on the input, while IIR filters depend on input and previous output. Using an FIR filter to smooth out a signal before downsampling is an efficient operation because the samples that are dropped by the downsampling operation simply never need to be computed. The approach is called \emph{polyphase filtering}.
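To ground the cost argument, here is a minimal C sketch of an $N$--fold decimating FIR filter (a generic textbook formulation, not code belonging to this paper): the loop runs once per \emph{retained} output sample, so the dropped samples are indeed never computed.
\begin{verbatim}
#include <stddef.h>

/* Decimation by N with an FIR filter h[0..nh-1]:
   y[m] = sum_k h[k] * x[m*N - k], computed only for the kept samples.
   Input samples before x[0] are treated as zero.  Generic sketch. */
static void fir_decimate(const float *x, size_t nx,
                         const float *h, size_t nh,
                         size_t N, float *y, size_t ny)
{
    for (size_t m = 0; m < ny; m++) {       /* one iteration per kept output sample */
        float acc = 0.0f;
        size_t n = m * N;                   /* position in the high rate input */
        for (size_t k = 0; k < nh; k++)
            if (k <= n && n - k < nx)
                acc += h[k] * x[n - k];
        y[m] = acc;
    }
}
\end{verbatim}
A full polyphase implementation would additionally split $h$ into $N$ subfilters to regularize the memory access pattern, but the essential saving is already visible: the work per \emph{input} sample is roughly $n_h/N$ multiplies instead of $n_h$.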
This does not work for IIR filters because their output values are computed \emph{recursively}, so discarding intermediate output values does not enable one to discard the computation, because the next computation depends on its result. Conclusion: while IIR filters are more efficient than FIR filters if no linear phase relation needs to be preserved, for sample rate conversion they are more expensive than their FIR counterparts because the polyphase simplification can not be made. End of story. Or not? \section{Analog Waveforms} In the early days of electronic sound synthesis, analog relaxation oscillator circuits were used to produce interesting broad--spectrum sounds like square waves, sawtooth waves, etc\ldots. These could then be used as inputs to time--varying filters to produce interesting sounds. Performing synthesis of these waveforms in the digital domain however is much less straightforward than one might assume at first. Naively approximating a waveform by shifting its transition points to the nearest sample points introduces jitter in the signal. This is quite audible at higher frequencies due to it's stable modulation pattern. I.e. for square waves the result is simplest to describe as a pulse width modulated square wave. Closer inspection of the analog circuit quickly indicates that the step of analog to digital conversion complicates the matter: the linear distortion caused by the Anti--Aliasing (AA) filter present in a physical system is an effect that cannot be ignored, and thus has to be synthesized. Traditionally, the effect of the AA filter is implemented as a polyphase FIR filter bank, which could be called the \emph{oversampled naive} approach. In this paper I will describe a solution for this problem using an IIR filter implemented as a parrallel bank of complex one--pole filters. The only assumption made is that the simulated analog waveforms are (integrals of) time shifted Dirac impulse signals. This will lead to an analytically tractable effect on the state of an IIR filter. This paper shows how to compute the state updates necessary to synthesize the impulse response at each sample point. \section{State Updates} Define the Dirac impulse at time $T$ as the generalized function $\delta_T(t)$ which exhibits the properties $\delta_T(t) = 0$ for $t \neq T$ and $$\int_{-\infty}^{+\infty} \delta_T(x) f(t-x) dx = f(T).$$ Define a 1--pole continuous time low--pass filter as the differential equation defining the function $y(t)$ in terms of the function $x(t)$ and the pole $s$ as $${d \over {d t}} y = s (x - y).$$ To see this is a low--pass filter, assume is the constant function $x(t) = x_0$. This gives the solution $y(t) = x_0$, meaning DC is passed. Assume we are given a model for the anti--aliasing lowpass filter mentioned in the introduction (obtained from your filter design method of choice). Using the partial fraction expansion of the transfer function of this filter, it can always be written as the sum of complex first order linear sections. This means the entire problem can be understood in terms of the simple first order case. This is what will lead to a tractable formula for the state update. \subsection{Impulse Input} What we are interested in is the function $y(t)$ evaluated at integer points, assuming for simplicity the sample rate is $1$. Denote the resulting sequence as $y[n]$. Compute the sequence by integrating the differential equation from $t$ to $t + 1$. $$y[n+1] = y[n] + s\int_n^{n+1} (x(t) - y(t)) dt. 
$$ What happens when we apply the filter to $x_0\delta_T(t)$ with $0 < T < 1$? There are 3 parts in the input: the impulse at time $T$, and the free--running before and after. The solution of the differential equation of the free system from $t_0$, obtained by setting the input $x(t)=0$, is $$ y(t) = y(t_0) e^{-s(t - t_0)}. $$ To simplify the following, denote $z = e^{-s}$. The impulse will just add the value $x_0$ to the state, so we have the complete state update $$y[1] = x_0 z^{1-T} + y[0] z,$$ which is a first order discrete IIR filter with a correction term depending on the location of the impulse. \subsection{Step Input} Signals composed of $\delta_T(t)$ have a very rich spectrum and thus need a strong anti--aliasing filter. Therefore it is interesting to look at piecewize constant inputs. These are closer to the waveforms that can be constructed using analog relaxation oscillator circuits. Working with these directly helps to relax requirements for the anti--aliasing filter, since these signals have less power in the higher frequencies. Here a similar approach is possible. A system with constant input $x(t) \equiv x_0$ can be obtained from the previous zero--input system by a change of variables. It has the solution $$ y(t) = x_0 + z^{t-t_0}(-x_0 + y(t_0)).$$ The state now no longer jumps instantaneously at transition points. Instead, the $x$ in the update equation changes when the input jumps to a new value. I.e. a transition from $x_0$ to $x_1$ at $0 < T < 1$ gives the state update $$y[1] = x_1 + (-x_1 + x_0)z^{1-T} + (- x_0 + y[0])z,$$ which is again a first order discrete IIR filter with a correction term. It's probably enough to stop here, and approximate the integrals of steps in the discrete domain. It won't make much difference for the specification of the anti--alias filter. \section{Exponentials} The core routine of the algorithm is the computation of a complex exponential $e^{tc}$ where $c$ is a filter pole and $0 < T <1$. The total cost is directly proportional to the frequency of the discontinuities, since there is one update per discontinuity. Note that it's probably best to separate the computation of the real and imaginary parts, so they can both have their own approximations. % At first glance this seems to be problematic, because the algorithm % resembles some random number generator. Really? The AA filter has a cutoff around the sampling frequency, so the poles $s$ are all over the unit disk. This means there is no general simplification possible for $Ts \approx 1$. This makes sense: if there would be such an approximation, the correction terms in the algorithm would be practically insignificant. Since the sample period and the filter poles are fixed, it's probably easiest to use the $N$ bit binary approximation $$e^{Ts} = z^T = \prod_{n=1}^N z^{T_n 2^{-n}},$$ where the $T_n$ are the binary digits of $T$. An alternative representation of this is $$\prod_{n=1}^N z^{(2T_n - 1) 2^{-n-1}} \prod_{n=1}^N z^{2^{-n-1}},$$ or using the constant $a = \prod_{n=1}^N z^{2^{-n-1}}$ and the relation $(2T_n-1) = (-1)^{T_n}$ this can be written more clearly as $$a\prod_{n=1}^N z^{(-1)^{T_n}2^{-n-1}}.$$ Using $N$ bits from $T$ reduces the perfect analytical solution to the oversampled naive approach mentioned before, with oversampling factor $2^N$. This seems to be the main precision trade--off point. Is it possible to update the computation? Suppose a constant input period has discontinuities spaced period $P'$ apart. The only thing we need is the fractional part $0 \leq P < 1$. 
Updating is indeed possible by using the constants $z^{P}$ and $z^{P-1}$, which are the \emph{forward} and \emph{backward} phase updates. The latter one is used when the phasor would wrap around. Note that these updates are not numerically stable, so they have to be reset from time to time using direct computations. \section{Resampling} This looks like a general purpose resampler for arbitrary time--varying input frequencies, which can be derived directly from it by replacing the analog input signal by a PCM waveform. My main question is, if it can be used as such, why didn't I hear about it before? Suppose the deal is to build an $N$ times downconverter. I do this again by stimulating a bank of parallel complex one--pole filters, computing the state update for each one--pole. In the case where the input signal is an oversampled pulse amplitude modulated signal $$x(t) = \sum_n a_n \delta(t - n/N),$$ the state update during one sample period is $$y(t+1) = a_{N-1} + p(a_{N-2} + \ldots p(a_0 + y(t)) \ldots ),$$ with $p = z^{-N} = \exp(-cN)$. This is a polynomial in $p$. How is this different from just running a discrete filter at a the $N$ times higher sample rate, and discarding the intermediate samples? It is in fact exactly the same, since each of the parenthized expressions of the form $p(a_i + y)$ is exactly the update equation of a discrete one--pole IIR filter. This shows that in fact, this \emph{is} indeed a general purpose sample rate converter, but it doesn't really buy anything new when the $a_i$ are not structured in a way that enables the fast evaluation of the state update. \section{Implementation Hacks} The algorithm described above is only a small part of the entire problem. This section collects some remarks about how one might go about implementing a virtual analog waveform synthesizer. \subsection{Filter design} The way we presented the problem requires us to design a continuous time (analog) filter with desired anti--aliasing characteristics. The simplest approach is to use a maximally flat filter. Rumor has it that this is best for musical applications due to the way it leaves low--frequency signals relatively untouched. If you don't like rumors, anything goes: filter design is a black art, and for the application at hand, it is a relatively independent part, and can be done off--line, with virtually unlimited computing resources. An additional constraint that might be interesting to incorporate in the filter requirements are the characteristics of the class of input signals, which generally have a $1/f$ spectrum (i.e. square and triangle waves). It should even be possible to add additional constraints to the values (quantization) of the filter poles so the complex exponentials are easier to evaluate. \subsection{Dithering} The problem with aliasing is not the noise power per se, but the (periodic) patterns present in the aliasing noise, and the difference in noise level depending on the frequency. For the naive approach the aliasing can go from nothing for subharmonics of the sampling frequency, to quite noticable ``beating'' for frequencies close to such subharmonics. In addition, the noise power increases with frequency. A classic dithering trick might be applied to lessen the requirements for the anti--aliasing filter, by randomizing the quantification noise. In this scheme it would amount to pulse position modulation: shifting the time location of discontinuities by a random value take from a rectangular or triangular PDF. 
The point where dithering could be introduced is in the evaluation of the complex exponentials: if the algorithm used is an approximation, we could try to ensure its error is random. This only needs to be done when the input is periodic, in which case the algorithm is an updating one. \subsection{Filter Error Analysis} Parallel IIR realization is not so good in the stopband due to round--off errors: the algorithm relies heavily on cancellation. The irony is that this is exactly what we are forced to use! So high precision calculations are required. This makes one wonder if it is possible to transform the problem to something embedded in exact integer arithmetic. \subsection{High Frequency Approximation} So where lie the real trade--off? Are higher frequencies just more expensive? First, it is not the case that correct treatment of low frequencies is not necessary. Indeed it is the case that for low frequency square and sawtooth waves the noise caused by naive synthesis can be inaudible, but often these waveforms are used in a subtractive synthesis algorithm, where the lower harmonics are filtered out, rendering the error clearly audible: it is important to synthesize a correct harmonic spectrum for all frequencies. Second, the cost of updates is directly proportional to the frequency of discontinuities. At some point it becomes cheaper to use a discrete frequency domain approach: i.e. to synthesize the harmonics directly. What's needed here is to determine the point where the proposed algorithm becomes more expensive than an oscillator approach, and a way to smoothly transition between the algorithm and a bank of oscillators. %% \subsection{Polar coordinates} %% Something i've been wondering about for so many years is why it's not %% possible to represent filters in polar coordinates. The obvious answer %% is of course the computation of the logs and exponents, and the ubiquity %% of additions that make this impractical. But is it always like that? \section{Conclusions} \begin{itemize} \item Analytically compute state updates for simple discontinuities. \item Simplify the numerical computation of such updates in the case of constant frequency inputs. \item Error analysis of parallel filter bank. \item Quantization error diffusion for pulse position. \item Use as all--purpose resampler. \end{itemize} Entry: Dynamic Wavetable Synthesis Date: Thu Jun 18 16:54:50 CEST 2009 Type: tex \def\diff{ d } \def\difft{ \diff \over \diff t } \def\pdiff{ \partial } \def\pdifft{ \pdiff \over \pdiff t } This DRAFT paper is an attempt to pick up some research which shifted to the background in 2005 because of problems integrating it into a PhD program. I am currently at the point of isolating the problems and formulating them more explicitly. A first version of model has been implemented in Pure Data around 2000-2001 as the \verb+dynwav~+ object in the Creb extension library. \subsection{Time--Phase Model} Dynamic Wavetable Synthesis (DWS) approaches the sound synthesis problem by separating it in two parts. \begin{itemize} \item Synthesis of sequences of instantaneous wave shapes in a discrete $2$--dimensional representation. \item Interpolation of this representation to produce $1$--dimensional sound signals. \end{itemize} The novelty of the technique lies in the way \emph{pitch} and \emph{timbre} data are represented more abstractly, which simplifies the expression of algorithms that relate these two elements. 
This abstract representation is decoupled from output signal sampling details through \emph{interpolation}. Separating these two concerns makes them easier to manage. The DWS representation greatly simplifies the use of the Discrete Fourier Transform (DFT) as a link between the time--domain waveform and its harmonic spectrum, because the model domain consists of truly periodic functions. More generally this property makes it easier to link the representation with other linear or non--linear transforms. A unified $2$--dimensional interpolation approach replaces the need for an ad--hoc short--time windowing mechanism as is used in many contemporary techniques. Such an approach is used to bridge the world of stationary models and non--stationary sounds. The use of explicit interpolation in the DWS model also makes it easier to keep frequency aliasing and phase cancelling under control, more specifically because the interpolation step can tie into the frequency domain representation of the waveform directly.
% In addition to describing synthesis, this paper also hints at possible
% analysis strategies. Analysis however is not trivial in the presence
% of multiple dominant pitches. In general it seems best to stick to
% well-proven techniques and use the DWS model only as a sound signal
% synthesis engine.
% To keep the exposition short I will assume familiarity with the matrix
% computation and interpolation techniques used.
% This paper needs to be reworked to restate the 2 problems clearly, and
% relate them to the underlying model properly. It also should be stated
% clearly that the subject is basically sound synthesis, and analysis is
% added as an afterthought. Some other things that are missing: I
% currently see no way for a clean solution of the two main problems,
% being source/filter separation and pitch detection, other than trial
% and error heuristics.
\subsection{Continuous Model}
The basis of the DWS representation is a function $p(\phi, t)$ defined on an infinitely long torus, with $\phi$ the periodic dimension and $t$ the linear dimension. The period is most conveniently taken to be $2\pi$. This model is then related to a signal $s(t)$ through a phase trajectory $\phi(t)$ as
$$ s(t) = p(\phi(t), t). $$
The instantaneous frequency or \emph{pitch}\footnote{Note that a more accurate term for pitch would be `inverse of period', since the perceptual quality pitch is not purely related to periodicity. In this document we treat both terms to mean the same thing.} is defined as $\omega(t) = {\difft} \phi(t)$. This model is an alternative to time--frequency modeling, and allows us to represent pitch and timbre independently. One could call it ``time--phase modeling''. We define \emph{timbre} as the information associated to a single time instance $t_0$. This is associated to a function on the circle as $P_{t_0}(\phi) = p(\phi, t_0)$.
\subsection{Discrete Model}
A discrete version of the model can be constructed by choosing $N$ to indicate the number of points to sample on the circle, and performing the sampling along the $t$ axis at multiples of $H$ for the timbre and multiples of $T$ for the phase trajectory and associated signal. This gives the sequences
$$ \begin{array}{rcl}
p_{kl} &=& p(2\pi l / N, kH) \\
\phi_n &=& \phi(nT) \\
s_n &=& s(nT). \\
\end{array} $$
We define $P_{k}$ as the vector with $N$ components obtained from $p_{kl}$ with $l$ ranging from $0$ to $N-1$.
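As a rough illustration of the reconstruction side (an assumption--laden sketch, not the \verb+dynwav~+ implementation), the following C fragment produces one output sample by bilinear interpolation of $p_{kl}$: linear interpolation along the circle at phase $\phi$, and linear interpolation along $t$ between two successive timbre vectors.
\begin{verbatim}
#include <math.h>

/* One output sample s = p(phi, t) from two successive timbre vectors
   P0 = P_k and P1 = P_{k+1} (N points each), with 0 <= frac_t <= 1 the
   position of t between the two haptic update instants.  Bilinear
   interpolation only; aliasing is not addressed here. */
static double dws_sample(const double *P0, const double *P1, int N,
                         double frac_t, double phi)
{
    double u = fmod(phi, 2.0 * M_PI);
    if (u < 0.0) u += 2.0 * M_PI;
    u = u / (2.0 * M_PI) * N;                      /* phase mapped to a table index */
    int l0 = (int)u, l1 = (l0 + 1) % N;            /* neighbouring points on the circle */
    double a = u - l0;
    double v0 = (1.0 - a) * P0[l0] + a * P0[l1];   /* interpolate along phi ... */
    double v1 = (1.0 - a) * P1[l0] + a * P1[l1];
    return (1.0 - frac_t) * v0 + frac_t * v1;      /* ... then along t */
}
\end{verbatim}
Both the aliasing and the phase cancellation issues treated in the interpolation section below are about replacing this naive scheme with something better behaved.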
Note that $H$, the ``haptic'' period that fixes the update step of the timbre information, can be much larger than $T$, the signal and phase update step. An important note is that there is no \emph{direct} relationship between the timbre sequence $p_{kl}$ and pitch trajectory $\phi_n$ on the one hand, and the signal $s_n$ on the other hand. They are \emph{only} related through the continuous functions $p(\phi, t)$ and $\phi(t)$. This means that \emph{interpolation} is an essential part in reconstructing $s_n$ from $p_{kl}$ and $\phi_n$. We will come back to the technical details and inaccuracies of interpolation later, but will assume for now that it is possible to switch between the discrete and continuous model without loss of information. This will allow a slight abuse of notation in that $P_k$ (the concrete representation) implicitly refers to a region in the function $p(\phi, t)$.

A modeling technique based on interpolation seems to be a drawback. However, it is the main goal of this paper to convince you that including this indirection, and as a result being able to keep the sampling of $p(\phi, t)$ fixed, is a workable trade--off, since it greatly simplifies manipulation of $P_k$, which we are most interested in.
\subsection{Frequency Domain}
The frequency domain representation of $p$, defined by computing the Fourier Transform (FT) of $p(\phi,t)$ over the phase dimension $\phi$, allows the DWS model to be related to a sum--of--sines model. For each $l \in \ZZ$ this gives a function which represents the phase and amplitude of harmonic $l$ at time $t$:
$$c_l(t) = {1 \over {2\pi}}\int_0^{2\pi} \exp(- j l \phi) p(\phi, t) \diff \phi.$$
The inverse is given by
$$p(\phi, t) = \sum_l c_l(t) \exp(j l \phi).$$
Interpolating the latter using the phase trajectory $\phi(t)$ gives a representation for $s(t)$ decomposed as a sum of harmonically spaced, phase and amplitude modulated complex exponentials
$$s(t) = \sum_l c_l(t) \exp(j l \phi(t)).$$
Writing $c_l(t) = a_l(t) \exp(j\phi_l'(t))$ with $a_l(t) \in \RR$ and setting $\phi_l(t) = \phi(t) + \phi_l'(t)/l$ gives the alternative formulation
$$s(t) = \sum_l a_l(t) \exp(j l \phi_l(t)),$$
which models $s(t)$ as a sum of amplitude and phase modulated complex exponentials without the harmonicity relationship. The term ${\difft} \phi_l'(t)$, which is the rotation speed of the complex spectral components over time, can then be viewed as the frequency offset of each exponential relative to the ideal harmonic frequency $l {\difft} \phi(t)$. This representation (harmonic + per component frequency offset) is useful later to introduce slight inharmonicities in mostly harmonic models, related to frequencies of beating patterns.
\section{Relation to Existing Techniques}
The DWS model is related to several other sound synthesis techniques.
\begin{itemize}
\item By eliminating time variation of the timbre, ${\pdifft} p = 0$, we get ordinary wavetable playback.
\item The DWS model is a variant of the Scanned Synthesis model, where $P_k$ is produced as the output of a discrete dynamical system.
\item The DWS model can be related to the harmonic and inharmonic sum--of--sines model, by moving inharmonicities to the timbre sequence $P_k$.
\end{itemize}
The relation to ordinary wavetable synthesis isn't so interesting as it is fully contained within the DWS model. The relation to sum--of--sines was already explained, and is an essential element of DWS.
It has to be stressed though that DWS does not work well for sounds which are not largely harmonic (harmonic + small phase differences) due to phase cancellation effects. This will be explained further in the section about interpolation. DWS is most related to Scanned Synthesis, and it could be argued that both techniques essentially talk about the same subject, albeit from a different perspective.
\subsection{Scanned Synthesis}
If we take the vector $P_k$ to be the output of a dynamical system sampled at the haptic rate (period $H$), we get the method of scanned synthesis. The most general form for such a dynamical system is
$$\matrix{X_{k+1} \\ P_{k}} = \psi(\matrix{X_k \\ U_k}),$$
where $X_k$ is the state vector, $U_k$ is the system input and $P_k$ is the system output. When the state update function $\psi$ is linear this is usually represented by
$$\matrix{X_{k+1} \\ P_{k}} = \matrix{A & B \\ C & D} \matrix{X_k \\ U_k},$$
where the poles of the system are determined by the eigenvalues of matrix $A$. This representation is not unique: we can transform $X_k$ to another basis $X_k' = Q^{-1}X_k$ using an invertible matrix $Q$ with columns representing the new basis for the state vector. The new system matrices then become $A' = Q^{-1}AQ$, $B' = Q^{-1}B$, $C'=CQ$ and $D' = D$. If the eigenvalue decomposition of $A$ exists, the equations can be transformed using $Q=S$, with $S$ the matrix containing the eigenvectors of $A$. This makes $A'$ diagonal, containing the eigenvalues of $A$.

Let's look at the diagonal form for a system without inputs, which means originally $B=0$ and $D=0$. With $S$ the matrix composed of eigenvectors as columns and $E = S^{-1} A S$ a diagonal matrix containing the corresponding eigenvalues, the system
$$\matrix{X_{k+1} \\ P_{k}} = \matrix{E \\ S} \matrix{X_k}$$
can then be interpreted as a bank of oscillators whose output is a mix of the ``eigen waveforms'' contained in $S$. The mixing amplitudes and phases are directed by independently decaying envelope functions, one per eigenvector.

A special case arises when $A$ is circulant, meaning $a_{ij} = a_{i'j'}$, with $i' = (i-1) \mod N$ and $j' = (j-1) \mod N$. It ensures that each element of $X_k$ has the same relationship to its neighbouring elements, with all elements arranged on a circle. If $A$ is circulant it can be diagonalized by the discrete Fourier transform, with $Q = F$, a matrix with columns made up of DFT basis functions. This means the elements of the state vector $X_k$ represent the instantaneous amplitudes and phases of the DFT basis functions. When all eigenvalues of $A$ have a magnitude strictly smaller than $1$, the signal associated to this model after interpolation is a sum of harmonically spaced damped exponentials.
\section{Interpolation}
Reconstructing the function $p(\phi,t)$ from a sequence of $P_k$ and obtaining a discrete signal $s_n$ from the phase trajectory $\phi_n$ is hindered by two phenomena.
\begin{itemize}
\item Frequency aliasing of $s(t)$. Suppose it is possible to perfectly reconstruct $p(\phi,t)$ from $P_k$ and $\phi(t)$ from $\phi_n$. The resulting signal $s(t) = p(\phi(t),t)$ might contain frequency components that cannot be represented in the discrete domain and will create aliasing distortion.
\item Reconstruction of $p(\phi, t)$ from $P_k$, in the case where $P_k \in \CC^N$, might be hindered by phase cancellation phenomena when naive linear interpolation is used. This happens when sounds are inharmonic. This is related to undesirable modulation problems in windowed sinusoidal modeling.
\end{itemize}
The challenge is to devise an interpolation strategy that minimises the error due to these two effects. In general the solution is to filter the $P_k$ before interpolation, and to minimise phase cancellation by interpolating the phase trajectories instead of the amplitudes. The former is relatively straightforward. The latter is quite difficult, as it is expensive to perform. Both problems are straightforward to solve in the frequency domain (DFT of $P_k$). The challenge lies in determining a proper sampling point \emph{in between} successive $P_k$. This largely depends on the contents of the $P_k$ and the sampling period determined by $\phi_n$.

Another question is: for inharmonic sounds, does it \emph{really} matter so much to get phases wrong here and there? There is already a lot of cancelling and amplitude modulation going on as a consequence of the inharmonicities. Not getting this exactly right might not be so problematic after all.
\section{Formant Structure}
An important drawback of wavetable based synthesis is the coupling between pitch and \emph{formant} structure. The formant structure refers to the spectral envelope of a signal, which can be properly defined as the result of low order \emph{linear prediction}. In a pure synthesis framework this relation can be made explicit. Simply make sure that the $P_k$ are postprocessed with a spectral envelope correction which cancels the effect of the phase trajectory frequency, keeping the resulting spectral envelope of $s_n$ under control. Since this is rather trivial it will no longer be discussed.

However, can this process be automated by performing operations on $P_k$ in a formant--neutral way? It is straightforward to \emph{stretch} a formant spectrum by moving the poles around. This raises the question of placing standard linear prediction modeling techniques in a \emph{periodic} framework. There are probably a lot of shortcuts that can be exploited by involving the DFT.
\subsection{Periodic Linear Prediction}

Entry: Implementing Fermat Prime order Galois Fields
Date: Thu Jun 18 22:48:20 CEST 2009
Type: tex

\def\F{\mathcal{F}}
\def\R{\mathcal{R}}
\section{Introduction}
Some words on the ring of integers modulo $2^{2^{k+1}}-1$, which is easily implemented using ordinary hardware operations. Such a ring can be used to implement the finite fields with $2^{2^{k}}+1$ elements, for $0 \leq k \leq 4$.
\section{Embedding $\F_a$}
Let $\F_a$ denote the field of integers modulo the prime $a$, and $\R_a$ the unital commutative ring of integers modulo $a$, not necessarily a prime. We want to find out how to embed a field $\F_a$ in a ring $\R_{ab}$, where $b$ is not necessarily prime. Consider the map
$$f : \R_{ab} \to b\R_{ab} : x \mapsto x b e = x 1_a,$$
which is a map from $\R_{ab}$ to its ideal $b\R_{ab} = \{ x b : x \in \R_{ab} \}$. If we can find an element $1_a = be$ in $b\R_{ab}$ such that $1_a ^ 2 = 1_a$, this map is a ring homomorphism. If $1_a \neq 0$, this map is nontrivial, and it gives a field structure to the elements of the ideal $b\R_{ab}$. The fact that $f$ is a ring homomorphism follows from $f(x)+f(y) = x 1_a + y 1_a = (x+y) 1_a = f(x+y)$ and $f(x)f(y) = x 1_a y 1_a = x y 1_a^2 = x y 1_a = f(xy)$. If we require $0 < e < a$ we have $1_a \neq 0$, and all the $a$ elements in $b\R_{ab}$ will be reached by $f$. The kernel $K_f$ of $f$, determined by $\{x : x1_a = 0\}$, has $b$ elements, and is identified as $a\R_{ab}$, the ideal generated by $a$.
This ideal is maximal since $a$ is prime, so the quotient $\R_{ab} / a\R_{ab}$ is a field containing $a$ elements, which is essentially the same as $\F_a$. The canonical decomposition of $f$, seen as an endomorphism of $\R_{ab}$, is
$$f = e \circ i \circ h,$$
where
$$h : \R_{ab} \to \R_{ab} / a\R_{ab} : x \mapsto x + a\R_{ab}$$
is the homomorphism onto the quotient field which introduces the field structure,
$$i : \R_{ab} / a\R_{ab} \to b\R_{ab} : x + a\R_{ab} \mapsto x1_a$$
is the isomorphism from this quotient field to the ideal generated by $b$, and
$$e : b\R_{ab} \to \R_{ab} : x1_a \mapsto x1_a $$
is the embedding of this ideal in the ring. This is the embedding we're after: it carries the field structure introduced by $h$ and preserved by $i$ over to a subset of $\R_{ab}$.
\section{Mersenne factors}
We call the number $M_{k} = 2^{2^k}-1$ a Mersenne number and $F_{k} = 2^{2^k}+1$ a Fermat number. For $k \geq 0$ we have
$$M_{k+1} = F_{k} M_{k}.$$
For $0 \leq k \leq 4$ the $F_{k}$ are prime; this makes $M_5 = 2^{32}-1$ rather special, as the product of the first 5 Fermat primes.
\begin{align*}
2^{32}-1 &= (2^{16}+1)(2^{16}-1) \\
&= (2^{16}+1)(2^{8}+1)(2^{8}-1) \\
&= (2^{16}+1)(2^{8}+1)(2^{4}+1)(2^{4}-1) \\
&= (2^{16}+1)(2^{8}+1)(2^{4}+1)(2^{2}+1)(2^{2}-1) \\
&= (2^{16}+1)(2^{8}+1)(2^{4}+1)(2^{2}+1)(2^{1}+1) \\
\end{align*}
Note that this is the end of the known chain of prime Fermat numbers. As of this time, for $n\geq5$ all \emph{tested} Fermat numbers $2^{2^n}+1$ are composite. Whether there are more Fermat primes is not known, and not really of interest to us now.
\section{Circular Addition}
Why are these Mersenne numbers interesting? Addition modulo $M_{k}$ can be seen as \emph{circular binary addition}. Looking at the structure of a binary adder, each bit adder is connected to the previous one using a \emph{carry bit}. By linking the carry output of the most significant bit's adder to the carry input of the least significant bit, one obtains addition modulo $M_k$, with $2^k$ the number of linked adders. In case $N=2^k$ is the wordlength of a machine, or a multiple thereof, the operation of reducing a number of $2N=2^{k+1}$ bits to one modulo $M_{k}$ is simply
\begin{align*}
c 2^N + x &= c(2^N - 1) + c + x\\
&= c + x
\end{align*}
which is very cheap to do if $N$ is a multiple of the machine wordlength. For addition of two numbers modulo $M_k$, the result has $N+1$ bits, with $c$ the carry flag. For multiplication, the result has $2N$ bits. This results in the instruction sequences ``ADD,ADDC'' and ``UMUL,ADD,ADDC''. So, implementing the ring $\R_{M_k}$ is efficient. This is not so interesting in itself, since the integers modulo $2^{2^k}$ form a ring we get for free. However, this does enable us to implement the field $\F_{F_{k-1}}$ in an efficient manner. [Note] This does not reduce $111\ldots1$ to $000\ldots0$, so we have two elements representing $0$.
\section{Algebraic structure of $\F_{F_k}$}
\def\R{\mathcal{R}}
The multiplicative group of the field $\F_{F_k}$ is the cyclic group of order $N=2^{2^k}=F_k-1$ and is thus highly composite. This has the advantage that a fast radix $2$ FFT-like algorithm exists to compute the number theoretic transform
$$X_n = \sum_{m=0}^{N-1} x_m \omega^{-nm},$$
with $\omega$ a primitive root of unity. Here $\omega$ is a generator of the multiplicative group, meaning the smallest positive integer $j$ satisfying $\omega^j = 1$ is $N=2^{2^k}$.
The inverse transform is given by
$$x_n = N^{-1}\sum_{m=0}^{N-1} X_m \omega^{nm}.$$
\section{Finding $1_{F_k}$}
Following our general notes, it is possible to represent $\F_{F_k}$ using the ideal $M_k \R_{M_{k+1}}$ of the ring $\R_{M_{k+1}}$ if we can find the unit element
$$1_{F_k} = M_k e_k.$$
In order for this unit to yield the homomorphism $f_k$ which will lead us to a representation of the field $\F_{F_k}$, we need
$$1_{F_k}^2 = 1_{F_k} \mod M_{k+1}.$$
All terms, including the modulo divisor, are multiples of $M_{k}$, so we can safely divide out this number, yielding the equation
$$M_k e_k^2 = e_k \mod F_k.$$
Since $F_k$ is prime, we can take inverses, so we can multiply both sides by $(M_ke_k)^{-1}$, which gives the solution
$$e_k = (M_k)^{-1} \mod F_k.$$
This gives us $1_{F_k}$. We can do better and find an explicit expression by observing that
\begin{align*}
e_{k} &= (F_k - 2)^{-1} \\
&= (-1) 2^{-1} \\
&= 2^{2^k - 1}
\end{align*}
using $(-1)^{-1} = -1 = 2^{2^k}$, all modulo $F_k$. This gives a nice expression for the unit
$$1_{F_k} = 2^{2^k - 1} (2^{2^k} - 1).$$
In binary this gives
\begin{align*}
1_0 &= 01 \\
1_1 &= 0110 \\
1_2 &= 0111.1000 \\
1_3 &= 0111.1111.1000.0000 \\
1_4 &= 0111.1111.1111.1111.1000.0000.0000.0000
\end{align*}
% is this actually unique? seems so, but I'm not entirely sure

Entry: Matrix Decompositions
Date: Fri Jun 19 10:57:12 CEST 2009
Type: tex

\subsection{Comparing Matrices}
There are different ways in which matrices can be \emph{alike}. The following talks about 3 relations called \emph{equivalent}, \emph{congruent} and \emph{similar}. Two matrices $A$ and $C$ are equivalent if there exist non--singular matrices $R$ and $Q$ such that
$$C = RAQ.$$
Two matrices are equivalent iff they have the same rank. This is related to Gaussian elimination for solving linear systems and computing a matrix inverse.

If $R=Q^{-1}$ the matrices are \emph{similar}. They represent the same linear transform, but in another basis. All square matrices are similar to their Jordan form. Normal matrices are similar to a diagonal matrix. A consequence is that similar matrices have the same eigenvalues.

If $A$ and $C$ are Hermitian, and $R=Q^T$, the matrices are \emph{congruent}. They are both congruent to the same signature matrix, so they have the same signature, and represent essentially the same quadratic form, but in another basis.

If two matrices are \emph{both} similar and congruent, they are called orthogonally similar and $R=Q^T=Q^{-1}$. Hermitian matrices are orthogonally similar to a real diagonal matrix.
\subsection{Matrix Semantics}
When studying matrix algorithms, it is important to keep in mind what they represent.
\begin{itemize}
\item Both the normal form for the eigenvalue decomposition and the more general Jordan form study the \emph{linear transform} related to a matrix, by performing a change of basis. Here the matrix represents a linear function $f(x) = Ax$.
\item The normal form for the congruence transform studies the \emph{quadratic form} associated to a symmetric or Hermitian matrix, by performing a change of basis. More generally, a symmetric matrix $A$ can represent a \emph{bilinear form} $f(x,y) = x^TAy$. Likewise a \emph{symplectic form} is represented by an antisymmetric matrix.
\end{itemize}
%[1] http://mathworld.wolfram.com/EquivalentMatrix.html

Entry: Matrix Multiplication
Date: Fri Jun 19 10:58:12 CEST 2009
Type: tex

% GOLUB & VAN LOAN:
% 3 ways to look at matrix multiplication
% * inner product A^T B
% * outer product A B^T
% * linear combination of columns of A / rows of B
% so: Ax = b
% MATRIX = operator, acting on (row) vector x on its left
% -> rows are linear functionals, mapping vector x to a coordinate of b (number)
% -> columns are base vectors, coordinates of x map these columns to vector b
% exchange all these left--right row--column: x^T A^T = b^T
% MATRIX = dual operator, acting on (column) vector = linear functional x^T on its right
% -> columns are linear vectors, to which the linear functional x is applied to give a coordinate of b^T
% -> rows are linear functionals, coordinates of x combine these to functional b^T
% in plain language
% vector = arrow
% linear functional = coordinate extractor
% what causes all the problems:
% -> a linear functional is also a vector
% -> a matrix, interpretable as a set of functionals, or as a set of vectors, is also a vector!
% matrices (linear transforms) are vectors that you can not only add, but also multiply!
% aint this fun :)
% strange though, how such an absurd thing as matrix multiplication can
% become so interwoven with habit that you no longer really know what
% you are doing. and when you do go look at what you are doing, it turns
% out to have several interpretations. or how mathematics can be a kind
% of middle road, summarizing things that look clearly different into
% one other abstract thing, with the different things as properties..
% so this doesn't really belong here, but it is a nice exercise to try
% to capture in words the intuition built up over the years. quite hard.

The algebra of finite dimensional linear operators is associative. This means that the objects that \emph{operate} and the objects that are being \emph{operated on} are somehow interchangeable, because of the law of associativity, as shown in
$$ABx = A(Bx) = (AB)x.$$
Semantically, this has quite some implications. If $A$ and $B$ denote linear transforms operating on the object on their right, and $x$ denotes a vector, the above means the transform of a transformed vector is equal to a transformed vector, with the latter transform the combination of two other transforms.
% Maybe it's just because I've been out of the loop for a while, and I
% am re--discovering the beauty of math in general and the beauty of
% linear algebra in particular, but in this particular case, the
% frivolous interplay of the \emph{concrete} and the \emph{abstract}
% is pretty amazing. Reality really changes by finding a good way to
% describe it\ldots
Instead of talking about vectors and transforms in an abstract way, it is possible to make matters more \emph{concrete} by fixing a basis and working only with coordinates taken from the field from which the vector space is constructed. For linear transforms these coordinates are expressed as matrices. Probably the reason why engineers like matrices is that in some sense it is simpler to talk about structured collections of numbers and rules for addition and multiplication, than about abstract entities.
However, we should never forget that it sometimes makes sense to talk about the objects they represent more abstractly\footnote{ To me, and to most people I know, \emph{understanding} basic linear algebra, next to the opaque rule of matrix multiplaction has always been about putting it in terms of simple geometry. To see what they \emph{do}. Points, line segments, cubes and spheres transformed under rotations, translations, reflections, scalings and projections, and combinations thereof. A common pitfall in thinking about rotations is to see (elementary) rotations happen around an axis. Rotations really happen in a plane. This is clearest to see in $2$ dimensions, since there is not much left to rotate around, except for a single point. Only in $3$ dimensions there is always a $1$--dimensional subspace invariant under the rotation, also called the axis. When we go to $4$ dimensions, there is a $2$--dimensional subspace invariant under (elementary) rotations. It is even possible to combine two independent elementary rotations. Independent in the sense that they commute, which is not true for rotations in general. This is equivalent to saying a $4$--th order linear system can have $2$ distinct eigenfrequencies. An interesting thing to visualize is the nilpotent transform, which can be seen as a rotation/reflection followed by a projection, where the rotation is such that the space onto which is projected does not contain a subspace which is invariant under the rotation/reflection. In other words, the projection and rotation work as ``squeeze'' and ``turn'' such that when both operations are iterated a couple of times, the result eventually shrinks to a point. In two dimensions this can be visualized as collecting bread crums spread out over the surface of a table into a pile in the following way. Rotate the table $90$ degrees, wipe all the crums to the left side using the ruler. Then do this again. Of course, nilpotent transforms are much easier visualized as shifts of coordinates of vectors, where on each shift, one or more things disappear into the void.} because the definition of matrix multiplication $$c_{ij} = \sum_k a_{ik} b_{kj}$$ is not very intuitive\footnote{There is unfortunately just one thing to do with this: figure out what it means when you see two of these grids next to each other, and train yourself to do the manipulation without thinking about how to do it. For me it is a very visual thing: I move the columns of $B$, rotate them $90$ degrees counterclockwize so they align with the rows of $A$. Then, in close proximity, they \emph{zip up} into a single number.}. Here $k$ ranges from $1$ to the \emph{width} of $A$, with elements $a_{ij}$ on row $i$ and column $j$. This range is equal to the \emph{hight} of $B$. In the following I will use the ``engineering shortcut'' of talking about vectors, linear forms and linear transforms as if they were the same as matrices of dimension $N \times 1$, $1 \times M$ or $N \times M$ respectively. It is important to see that mathematically, they are \emph{distinct} objects. The algebra of matrices \emph{behaves as} the algebra of abstract linear operators because both are related by an isomorphism tied to a particular basis in the abstract vector space. In matrix algebra the \emph{transpose} operation represents the endomorphism of a (finite) abstract vector space which relates vectors and linear forms. The ease at which the matrix multiplication formula can be re-interpreted is exactly because of the existence of these morphisms. 
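For reference, the defining formula $c_{ij} = \sum_k a_{ik} b_{kj}$ written out as a plain C routine (a generic textbook triple loop, nothing specific to these notes):
\begin{verbatim}
/* Textbook matrix product C = A B with A (m x k), B (k x n), C (m x n),
   all stored row-major as flat arrays: C[i][j] = sum_p A[i][p] * B[p][j]. */
static void matmul(const double *a, const double *b, double *c,
                   int m, int k, int n)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int p = 0; p < k; p++)
                s += a[i * k + p] * b[p * n + j];
            c[i * n + j] = s;
        }
}
\end{verbatim}
The index ranges make the dimension constraint explicit: the width of $A$ and the height of $B$ are the same $k$.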
% careful with generalities: it's not necessarily all
% orthogonal! but I'm not saying that either.. similar to Jordan form
% is ok though
An interesting thing to observe about the matrix multiplication rule is that it is recursive. This means you can chop 2 matrices you want to multiply into rectangular pieces, give the pieces a name, and apply the multiplication rules to the pieces. The result is the same, only more or less \emph{unrolled}. There are only 2 things that need to be taken into account. One is that multiplication is not commutative, and two is that the column partitioning of the left factor has to match the row partitioning of the right factor. So, we can interpret the $a$'s, $b$'s and $c$'s in the matrix multiplication formula not only as scalars, but as matrices or vectors, taking into account that we lose commutativity doing so.
% This is an illustration of the fact that matrices can be defined
% over rings.
This observation allows us to cast some light on the different operations a matrix multiplication can represent. Let's start our journey with
$$Ax = b.$$
This is usually associated with a set of $M$ linear equations in $N$ unknowns, but only if we really don't know $x$. The matrix $A$ has dimensions $M \times N$, and both $x$ and $b$ are column vectors (matrices of width $1$) of dimension $N$ and $M$ respectively. There are several ways to look at this. The most obvious one is the more abstract one: the matrix $A$ is the representation of an operator, operating on the vector $x$ on its right. The result of this operation is another vector $b$.

Now, taking a closer peek at the internals of $A$, we recognize it is built out of $N$ columns of numbers. Each of the columns of $A$ can be seen as a vector, so we give column $i$ the name $a_i$. This enables us to write the previous formula using the notation of submatrices as
$$\matrix{ a_1 & a_2 & \ldots & a_N } \matrix{ x_1 \\ x_2 \\ \vdots \\ x_N } = b.$$
Here the $a_i$ and $b$ are $M$--dimensional vectors, and the $x_i$ are scalars. Working out the matrix product using the usual rules gives us $b = \sum_i a_i x_i$. Moreover, because the $x_i$ are scalars we also have $b = \sum_i x_i a_i$. So $b$, a column vector, is a linear combination of the vectors $a_i$, which are in our case the columns of $A$. This expresses $b$ (vector) in terms of coordinates $x_i$ (numbers) with respect to the basis $a_i$ (vectors). The space spanned by all possible combinations is called the column space.
% \footnote{ Note
% that this looks like an interior product. When $M = 1$ the $a_i$ are
% scalars, and we have an interior product between two
% $N$--dimensional vectors. Can we interpret the case $M > 1$ as an
% interior product? Depends on what we mean by interior product. If
% we interpret interior product as a sort of \emph{average} combined
% with a \emph{scaling}, then indeed, the analogy holds.}.
% Note, this gives a nice interpretation of what orthogonality
% means. If you take two strings of numbers, $a$ and $b$, and you take
% the average of the first string according to a weighting defined by
% the second string, both are said to be orthogonal if the weighting
% yields an average of zero: $a^T b = 0$.
% This has the nice interpretation of distance and mass. Suppose $a$
% represents the masses of a collection of fruits, $b$ represents a
% possible arrangement of the fruits on a scale, indicating distance
% (positive or negative) to the suspension point. The scale is
% balanced if and only if the vectors $a$ and $b$ are orthogonal.
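The column reading of $Ax = b$ translates directly into code; a throwaway C sketch (same storage convention as the triple loop above) that accumulates $b = \sum_i x_i a_i$ column by column:
\begin{verbatim}
/* Ax = b computed as a linear combination of the columns of A
   (A is m x n, row-major).  This is the column space reading described above. */
static void matvec_by_columns(const double *A, const double *x, double *b,
                              int m, int n)
{
    for (int r = 0; r < m; r++) b[r] = 0.0;
    for (int i = 0; i < n; i++)        /* for each column a_i ... */
        for (int r = 0; r < m; r++)    /* ... add x_i times that column to b */
            b[r] += x[i] * A[r * n + i];
}
\end{verbatim}
Swapping the two loops turns this into the row reading discussed next: each row of $A$ then acts as a functional producing one element of $b$ at a time.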
Another way to chop up $A$ is like $$\matrix{ a_1^T \\ a_2^T \\ \vdots \\ a_M^T } x = \matrix{ b_1 \\ b_2 \\ \vdots \\ b_M }.$$ Here $a_i^T$ are the rows of $A$, written in terms of the column vectors $a_i$, the columns\footnote{I've found it benificial in most cases to represent row vectors as the transpose of column vectors, not to confuse my visual system too much when recognizing formula patterns.} of $A^T$. This arrangement expresses the scalar elements of $b$ as inner products between $x$ and rows of $A$, or $b_i = a_i^T x$. Written as this, $x$ takes the average of the elements on each row of $A$ independently, yielding $M$ scalars. Another way to put this is that the rows of $A$ represents linear functions (functionals) mapping vectors to numbers, and each row applied to $x$ gives one element of $b$. Once you can interpret the formula of matrix-vector multiplication, it easily generalizes to matrix-matrix multiplication $AX = B$, with $X$ and $B$ matrices containing columns $x_i$ and $b_i$ respectively. Which gives $$\begin{array}{lcl} AX & = & A \matrix{x_1 & \ldots & x_K}\\ & = & \matrix{Ax_1 & \ldots & Ax_K}\\ & = & \matrix{b_1 & \ldots & b_K}\\ & = & B. \end{array} $$ Matrix multiplication on the left can be seen as an independent transform of all the \emph{columns} of the matrix it operates on. This can be seen as a function which maps a matrix $X$ to a matrix $B$. Note that the only thing we are doing is still to chop matrices up into parts, and express the relationships of the parts by looking at the matrix multiplication formula from different angles. Taking the transpose of the expression above gives the dual interpretation: a matrix operating on the right independently transforms the \emph{rows} of a matrix\footnote{For me it makes sense to only think in terms of ``operator is on the left, all columns of the operand are operated on independently'', instead of mixing the left--right and row--column stuff. Because of non--commutativity, left and right \emph{does} matter a lot, but luckily, they are related by the transpose (or Hermitian conjugate).}. % \footnote{ Note, that the dual of a vector refers to a vector in the % space of linear functionals, the space of all linear transforms from % the vector space to the base field. Since we work in the less % abstract settings of $\RR^N$, vectors are column vectors, while their % duals, the linear functionals, are row vectors, transposes of column % vectors. Application of a functional is nothing but the % multiplication of $a^T$ with a vector $x$, or $a^Tx$, which is the % inner product of vectors $a$ and $x$.}. % Strange how the brain does not automaticly incorporate this kind of % symmetry. Well, you can learn it though. % Note that operator on the left/right relates to computer languages % like LISP or FORTH. Let's interpret a multiplication differently, as inner and outer products. Take vectors $a$ and $b$, both unit norm, so we can concentrate on the effect of the angle between these vectors. Their length has no real effect on products, other than a change of scale, so we can safely leave them out of the picture. We form the inner product as $b^Ta$. This is a scalar, equal to the cosine of the angle between both vectors. If this product is zero, $a$ and $b$ are said to be orthogonal. The inner product of a vector and itself is its norm squared. In our case, norm is $1$, meaning angle is $0$, meaning both vectors are the same. The outer product $b a^T$ is quite different. 
It is a linear operator (a matrix) which represents a projection on the space spanned by $a$, followed by a rotation in the direction of $b$. We can also call this a reflection, since there's no handedness to preserve in a 1--dimensional space. Rotations (even number of reflections) and reflections (odd number of reflections) have essentially the same effect. The outer product of a vector and itself, $a a^T$ is a projection operator onto the space spanned by $a$. % Now, we generalize this to multiple dimensions. % So, things to remember. A matrix multiplication is: a collective % transform of a (bunch of) vector(s), or a recombination of a bunch % of vectors or scalars. % A matrix multiplying from the left operates on columns % independently. A matrix multiplying from the right operates on rows % independently. Let's have a look at inverses. With $B^T = A^{-1}$ we have $B^T A = I = AB^T$. So the columns of $B$ and $A$ form mutually orthogonal basis sets. For one column $b_j$ of $B$ this means it is orthogonal to all the columns $a_i$ of $A$ with $j \neq i$. And we have $b_i^T a_i = 1$. What does this mean? Suppose we write $A A^{-1} = A B^T$ as $$ \matrix{a_1 & \ldots & a_N} \matrix{b_1 & \ldots & b_N}^T $$ This gives us $I = \sum_k a_k b_k^T$, a complicated expression for doing nothing! Digging into this we can, given any independent set of vectors $a_i$, write any vector $x$ in terms of it, by projecting it on a dual basis $b_i$. In other words, associated with any basis $a_i$, there is a set of linear functionals $b_i$, which when applied to a vector as $b_i^Tx$, give the cordinates of $x$ in the new basis $a_i$, yielding the expression $x = \sum_k a_k (b_k^T x)$. What can we learn about this? What is this duality all about? In fact, it gives us always two ways of looking at the same thing. For example, we saw above that $A^{-1}$ is just a disguised form of some other matrix $B^T$. An inverse can always be interpreted as the dual of some other matrix, where this other matrix contains the dual basis for the columns of the original matrix. Stretching this a bit, a matrix can either be interpreted colum-wize, as a collection of base vectors, where the vector it operates on is a set of coordinates Or it can be interpreted row--wize, as a collection of linear functionals producing a set of coordinates of some basis. Operating \emph{on} a matrix from the left changes rows independently, from the right changes columns independently. Conjugate (transpose, Hermitian conjugate) exchanges the dual views. % Next to that, we can chop it up into blocks to aid our understanding % of certain properties and operations. % Of course, in the abstract setting, coordinates are always relative to % some basis. Matrices are defined exactly in this matter, as grids of % coordinates, and \emph{true} vectors are coordinate free % entities. This doesn't remove the fact that a matrix, a block of % numbers, \emph{behaves} as a vector, so we can freely interpret it as % we like, either as a vector being acted on, or as a set of coordinates % explaining one vector in terms of other vectors. It's this footloose % property that makes linear algebra so powerful for formulating % real-world phenomena, while remaining very earth--bound, because it's % just a pack of numbers in the end. So, how to read matrix multiplications? When you see the same matrix on both sides of another one, there is both a row \emph{and} a column transform going on. In other words, there is a transform to another basis and back again. 
There are basicly two types: a congruence transform when the expression is about the signature of a quadratic form, and and a similarity transform, when the relative scaling (eigenvalue decomposition) is studied. For example the similarity transform $X' = Q X Q^{-1}$, with $Q$ regular. The columns of $Q$ represent a basis. So, building upon our investigation of what an inverse actually is, the first effect of $X'$ is to yield the coordinates of the vector it operates on in terms of the columns of $Q$. Then, in this new coordinate system, the operation $X$ is performed on the coordinates, after which, the coordinates are used to build a vector using as basis the columns of $Q$. When the eigenvalue decomposition exists, $X$ can be made diagonal, so the effect of $X'$ can be decoupled into a set of independent actions on the coordinates in the basis represented by the columns of $Q$, which in this case are the eigenvectors. If you see $Q X Q^T$ then you can safely bet that $X$ is symmetric, and we are mostly interested in its signature, i.e. the kind of quadratic form the matrix $X$ produces. Entry: Math and Music: Harmony vs. Melody Date: Fri Jun 19 16:04:53 CEST 2009 In a talk with Axel yesterday the harmony / melody topic came up. Axel's point was that it might be better to study harmony or a harmonious instrument (one that can play chords like guitar, piano) before studing melody. My remark was that it's strange to see melody from a mathematical point, because if you look at frequency ratios, what you see first is harmony. Then in the background of this one could develop tone scales using harmonic intervals, but it's never possible to really decouple melodic structurs from the harmonic background. Axel remarked that in the history of (western?) music there was this move from harmony -> melody -> re-interpreting harmony in terms of melody. That in classical music, harmony is somewhat implicitly understood with melody brought to the front. Entry: Complex Number Wave Shaping Date: Sat Jun 20 15:25:27 CEST 2009 Type: tex The idea present in this paper is related to the operations of \emph{modulation}, \emph{wave shaping} and \emph{sample playback}. These can be represented respectively by an injection (or a curve) $\RR \to \MM$ where $\MM$ is a manifold, a transformation $\MM \to \MM$ and a projection $\MM \to \RR$. Combining these operations allows the characterization of sound signals as $\RR \to \RR$ maps. \section{Complex Numbers} Before going to exotic structures, let's have a look at $\MM=\CC$. A sinusoidal oscillator can be modeled using the curve $f(t) = e^{i\omega t}.$ The trivial projection onto $\RR$ is to take the real or imaginary part. Using these this we can concentrate on transformations of $\CC$. Sticking to transformations that can be implemented directly using arithmetic functions we get addition, multiplication and function composition which lead us to polynomials, and division which leads us to infinite power series. % This paper is mainly intended as an illustration of the elegance of % working with complex signals, as opposed to the more cumbersome real % signal approach. \section{Polynomials} Polynomials in $\CC$ are closed under addition, multiplication and function composition, so it is interesting to look at what we can do with them. Performing $f(x) = x^2$ on $e^{j\omega t}$ yields $e^{2j\omega t}$. Squaring doubles\footnote{In $\RR$ on the interval $[-1,1]$ the same behaviour can be produced by the Chebyshev polynomials $T_n$. 
They satisfy the relation $T_n(\cos\theta) = \cos n \theta$.} the frequency. In general an order $N$ polynomial $$p(x) = \sum_{n=0}^N p_n x^n$$ produces a mixed spectrum according to $$p(e^{j\omega t})=\sum_{n=0}p_ne^{nj\omega t}.$$ This means we can talk in terms of the polynomials instead of the terms $e^{j\omega t}$, with $p_n$ corresponding to the amplitudes of the harmonics. The following talks about the polynomials $p(x) = \sum^N p_n x^n$ and $q(x) = \sum^M q_mx^m$. To simplify notation we take $p_n$ and $q_n$ to take the value $0$ when they fall outside of the range $0,\ldots,N$ and $0,\ldots,M$ respectively. Adding two polynomials produces another polynomial, with the coefficients added per term $$p(x) + q(x) = \sum^{\max(N,M)} (p_n + q_n)x^n.$$ Multiplying polynomials \emph{convolves} the spectra, which means that one spectrum is spread out with the pattern of the other spectrum. $$p(x) q(x) = \sum_{n=0}^{N+M} (\sum_{k+l=n}p_k q_l)x^n.$$ Composing two polynomials yields $$p(q(x)) = \sum_{n=0}p_n (\sum_{m=0}^M q_n x^M)^n$$ which has a less trivial closed form. Apart from special cases, like the Chebyshev polynomials which obey the nesting property $T_n(T_m(x)) = T_{n m}(x)$, composition and more specifically iterated composition is the realm of \emph{dynamical systems}. % \section{Special Polynomials} \section{Analytic Functions} Next to polynomials, \emph{power series} $$f(z) = \sum_{k=0}^{\infty} a_k z^n$$ are an interesting tool. They are a natural generalization as polynomials of infinite order. Power series represent analytic functions. Just as with polynomails, the coefficients $a_k$ represent the harmonic content of the resulting signal. % OPGELET: filters zijn ''fourier multipliers'', en geen waveshapers!!! % This is the ``mathematical'' Z--transform\footnote{ Note the duality % between time and frequency analysis, looking at the correspondence % between filtering and waveshaping. In discrete filter design and % analysis, the spectrum is continuous and periodic and lives on the % complex unit circle. Filters are causal, infinite length, discrete % sequences (discrete impulse responses). Here we turn the picture % around: signals live on the complex unit circle and are periodic % and continuous. Spectra are discrete (harmonic) infinite % series. I.e. rational waveshaping functions are the equivalent of % finite order discrete IIR linear filters, which are ``spectrum % waveshapers''. Finite waveshaping polynomials are in the same % sense equivalent to discrete FIR filters.}. % The function $\frac{z}{|z|}$, which maps the complex plane onto the % unit circle, can be used as a general purpose (nonlinear) normalizig % function. Since it is not differentiable, it is not an analytic % function and does not have a power series expansion, so it is not % very useful for analysis and should be used with care. \subsection{Fractional Linear Transforms} The function $$f_{a,\theta}(z) = e^{i\theta}\frac{z - a}{1 - \overline{a} z}$$ with $z,a \in \mathbb{C}$, $\theta \in \mathbb{R}$ and $|a| \neq 1$ is an invertible map from the unit circle onto itself. It is called a Fractional Linear Transform (FLT). If $|z| = 1$ then $|f_{a,\theta}(z)| = 1$. If $|a| < 1$ this function is analytic on the open unit disc. The first thing to remark is that the $f_{a,\theta}$ form a group with composition the group operation. The inverse element of $f_{a,\theta}$ is $f_{-a,-\theta}$. 
What this means is that performing this operation multiple times can always be related to a single such operation, so it isn't terribly interesting. % In the dual representation, this is of course the z transform of a % discrete all-pass filter. The more general fractional linear % transform can be used as well. The dual being a complex one % pole/zero filter. However, combining $f$ using \emph{multiplication} is meaningful. It gives rise to Schur functions. More specificly Blaschke products $f(z) = e^{i\theta} \prod_k \frac{z-a_k}{1-\bar{a_k}z}$. The Schur algorithm gives an alternative representation for this $$f_k(z) = \frac{1}{z}\frac{f_{k-1}(z) - \rho_k}{1 - \bar{\rho}_k f_{k-1}(z)},$$ with $\rho_k = f_{k-1}(0)$. This means that when we \emph{multiply} the outputs of several first order waveshapers we get a higher order waveshaper with a richer behaviour. \subsection{The Discrete Summation Formula} What happens when we use the $a$ parameter as an input? We set $\theta=0$ in the following because it will only amount to a constant phase shift. Let $a = r z'$ with $|z'| = 1$. Suppose $z$ and $z'$ are two unit norm complex oscillators defined by $z(t)=e^{i\alpha t}$ and $z'(t) = e^{i\beta t}$. Our modulator driven by the two oscillators has the form$$ \frac{z-rz'}{1-r \overline{z'}z}.$$ The signal generated is $$s(t) = \frac{e^{i\alpha t} - r e^{i\beta t}} { 1 - r e^{i(\alpha-\beta)t}}$$ and is periodic when $\frac{\alpha}{\beta}$ is rational. The constant $r$ can be seen as a modulation index. This can be written in a more useful way as $$s(t) = e^{i\alpha t}\frac{1 - re^{-i(\alpha-\beta) t}} { 1 - r e^{i(\alpha-\beta)t}}.$$ The factor $e^{i\alpha t}$ is just a frequency shift and the second factor only depends on $\delta = \alpha-\beta$, so we have $$s(t) = e^{i\alpha t} s'(t) (1 - re^{-i\delta t}),$$ with $$ s'(t) = \frac{1}{ 1 - r e^{i \delta t}} = \sum_{k=0}^\infty r^k e^{i k \delta t}.$$ This expansion is the limit case of the discrete summation formula (DSF) $$ \frac{1 - r^N e^{i N \delta t}}{ 1 - r e^{i \delta t}} = \sum_{k=0}^{N-1} r^k e^{i k \delta t}.$$ This establishes the link between what we can call FLT modulation and harmonic spectra with an $1/f$ envelope. The spectrum of $s(t)$ is thus harmonic with spacing $\delta$, except for the frequency shift $\alpha$. $$ \begin{align*} s(t) & = e^{i\alpha t} s_0(t) \\ s_0(t) & = re^{-i\delta t} + (1 - r^2) \sum_{k=0}^\infty r^k e^{ik\delta t}. \end{align*} $$ % .. which is the dual of truncated IIR filters. % The more general case of fractional linear transforms still has a % discrete spectrum. \subsection{Formants} Whenever we get a power series in $m(t) = e^{i\alpha t}$ we can shift it up and down the frequency spectrum by multiplying the signal with a polynomial in $m(t)$. The result will still be a power series in $m(t)$. Alternatively, the series can be combined with a polynomial through function composition, again yielding a harmonic power series. Such modulation cannot shift the \emph{spectra envelope} in steps smaller than the spacing of the harmonics. I.e. we would want to do this to simulate independent control of pitch and tone colour. However, it is possible to employ an interpolation trick by cross--fading\footnote{Combined with LFS this is essentially the Phase Aligned Formant (PAF) method.} between successive harmonics. 
The function that creates such a modulator from a base modulator $m$ and a fractional shift $x$ is given by $$c(m, x) = (x-[x]) m^{[x]} + (1-(x-[x])) m^{[x]+1}.$$ Here $[x]$ is the integral part of $x$, which makes $x_f = x-[x]$ the fractional part. Multiplying this with a power series in $m$ gives $$c(m, x) \sum a_im^{i} = \sum \big[ x_f a_i + (1-x_f) a_{i+1} \big] m^{i + [x]}$$ which gives another power series. When the two factors are synchronous, i.e. when when $m(t)=e^{i\alpha t}$, this indeed produces a harmonic spectrum with the $a_i$ fractionally shifted\footnote{This effect is analogous to fractional delay using linear interpolation. A straighforward observation is to extend the linear interpolation to a higher order polynomial interpolator. A polynomial fractional delay filter that preserves DC needs to have a zero at Nyquist to introduce the $180$ degree phase discontinuity when $a = 0.5$. Translating this to our problem, higher order polynomial interpolation of nearby harmonics is the same as amplitude modulation of an oscillator with frequency $x$ which has a phase discontinuity that is disguised by a vanishing waveshaper. The flatness of this waveshaper can be increased by increasing the order of the interpolation, which widens the bandwidth, and decreases amplitude modulation of the carrier wave.}. The function $c(e^{i\alpha t}, x)$ is periodic with a very narrow bandwidth, and so seems to behave like a single, localized oscillation in between harmonic $n$ and $n+1$. The method poses a problem when a time varying oscillator is implemented, i.e. if $m(t) = e^{i\phi(t)}$ where $\phi(t)$ is continuous but has a variable slope. Due to the locking, a change in frequency will cause a phase jump in at least one of the locked oscillators. This can be solved by smoothing out the discontinous frequency change, but not without additional frequency modulation. With locked oscillators, the frequency can only be changed without a phase discontinuity if it is done when the phase of the basic harmonic is $0$. \subsection{Power Series} From the previous it can be concluded that it is the division operation that introduces interesting harmonics. Let's continue with $(1 - re^{i\alpha t})^{-1}$ some more. An odd harmonic series can be synthesized using $$\frac{e^{i \alpha t}}{ 1 - r e^{i 2 \alpha t}} = \sum_{k=0}^\infty r^k e^{i (2 k + 1) \alpha t}.$$ Squaring the powerseries is equivalent to convolving the spectral envelope with itself. This gives a more localized formant structure. Taking the real part of $f$ turns the harmonic spectrum into a symmetric one that can be shifted up (or down) in frequency to build a formant like symmetric wave packet. Squaring the real signal before multiplication gives a smoother formant packet. If $r$ is complex, we can introduce a phase shift in the harmonics. Combining both $r$ and $\overline{r}$ we can construct damped sinusoidal spectral envelopes. This enables us to use standard sinusoidal methods to model a spectral envelope. If $f_{r,\alpha}(t)$ represents the power series or the DSF we see that $f_{\overline{r},\alpha}(t) = \overline{f_{r,-\alpha}(t)}$. [ TODO: Cleanup from here. ] Normalization can sometimes be desirable to get a constant power output over the modulation range\footnote{When using a ``squashed'' normalized DSF and allowing the modulation index to move beyond unit magnitude, we get a very nice sounding broad spectrum transient when $r$ passes through $|r|=1$. Especially so when the change in $r$ is not made instantanious, but i.e. 
processed by a lowpass filter (dezipped). During a fast change in $r$, the DSF normalization (which limits the number of harmonics to $N$) is not valid. This gives a large signal which then will be squashed by the normalizing function.}. If we define the inner product between two periodic signals $s(t)$ and $s'(t)$ with common period $T$ to be $\leftP_p=\frac{1}{T}\int_0^T \overline{s'(t)}s(t)dt$ and the power to be $P = \left$ and note that the components of the DSF are orthogonal, we can see the power is equal to \begin{equation} \sum_{k=0}^{N-1} |r|^{2k} = \frac{1-|r|^{2N}}{1-|r|^2} \end{equation} % Ik moet heel goed opletten dat ik de operaties niet door elkaar % haal! Namelijk: optelling, vermenigvuldiging en compositie. B.V. de % fractional linear transforms zijn een groep voor de compositie, maar % niet voor de vermenigvuldiging. Waveshaping is compositie van % waveshaper na oscillator. Filtering is een Fourier multiplier, geen % compositie. Spectral waveshaping is compressor! % De algemene rationale is dit: sferische ruimtes geven oscillatie, % terwijl hyperbolische ruimtes events genereren (fly-by). Nu die vage % onzin vertalen naar duidelijke termen... % Dit is te zien door de matrixvoorstellingen. % Unit norm complex numbers geven oscillatie op de eenheidscircel, % terwijl unit norm hypercomplex numbers een pad op de % eenheidshyperbool afleggen. % Dit is oscillatie vis a vis transienten. Als ik juist zit kan je met % clifford algebras directe sommen maken tussen zulkse ruimtes. Het % resultaat is, dat na normalisatie, de hyperbolische explosie er voor % zorgt dat de oscillaties een transient verloop hebben. \subsection{Normalization} The normalization function $$\sigma(z) = z/|z|$$ which maps $\CC$ to the unit circle, or its frequency--doubling square $$\sigma(z)^2 = z/\bar{z}$$ map $\CC \backslash \{0\}$ to the unit circle. \subsection{Normalized Addition} The operation $s(a,b) = \sigma(a+b)$ is interesting as it is closed on the unit circle and defined for all $a$, $b$ except $a = -b$. Is it possible to relate this somehow to the spectrum it produces? From ad--hoc experiments I can testify this yields very interesting sounds. \subsection{Conclusion} The basic idea is the same as in $\RR$. Polynomials of $e^{i\alpha t}$ produce harmonic spectra in a predictable way, just like polynomials of $\sin(\alpha t)$ and $\cos(\alpha t)$ do in $\RR$. However in $\CC$ polynomials and power series are a bit easier to handle. As long as one starts from coherent modulation, the story is quite straightforward. Polynomials allow the introduction of integer multiples of the base frequency, while power series (obtained by division or other non-polynomial analytical functions) represent broad spectra. This suggests an alternative organization: 1. investigate addition, multiplication, function composition driven by simple modulators, and 2. look at power series of some easy to compute functions. Basicly: what can polynomials do for you + what if you add division? \section{Other Manifolds} Investigate Clifford Algebras and certain matrix algebras. (i.e. quaternions, biquaternions, \ldots). Figure out how torsion and other non--geodesic leads to interesting paths through compact and connected manifolds. A lot can be done with division to generate interesting power series. Is it possible to change this to normalization (which has division), which is a more intuitive operation? 
Also, it seems that a lot of neat tricks are possible by using matrix embeddings of (compact) manifolds with a group action (Lie groups). Defining an additional operation as addition in the embedding space followed by projection onto the manifold subset in the embedding space seems to yield interesting ways of combining two elements in addition to the group action. Entry: Video Codecs Date: Tue Jul 7 09:23:02 CEST 2009 Time to get an overview of different techniques used in video codecs, to see how they relate to resource use. In general, MPEG-1 is mostly for video on CD, MPEG-2 is directed at TV (broadcast + DVD) with more emphasis on robustness, delay, different modes, ... and has an improved audio codec. MPEG-4 adds improved video coding and a more general-purpose decoder, but is a hodge-podge of optionally implemented features. Part 2 (divx) and 10 (avc) are important. MPEG-1 video [1]: - Group-Of-Pictures with Ineter-frame (keyframe) encoded as +- JPEG, Predicted-frame difference to previous frame incorporating motion vectors on macroblocks (16x16 = 4 luma 8x8 + 1 chroma 8x8). Bidirectional frame using forward and backward frame as reference. DC frames serve as "thumbnails" for fast-forward. - Motion estimation works on a fixed diamond region using quarter pixels. MVs are differentially encoded from neighbouring macroblocks (16x16 = 4 luma 8x8 + 1 chroma 8x8). - The DC part of the DCT coefficients is encoded differentially. The AC is coded in a zig-zag pattern (most energy is in the upper left corner around DC) which is then RLE encoded. Quantization uses 5 bits (0-31). Thresholding is adaptive (or user selectable). - The whole bitstream is Huffman encoded. MPEG-2 - Systems section: Transport Stream (lossy media like broadcast) and Program Stream (reliable media like DVD). - Video similar to MPEG-1, optimized for higher bitrates and more different formats (i.e. interlaced). Audio part contains AAC, which is more efficient, flexible and robust. (Part 2 = H.262) MPEG-4 - More advanced video coding + object oriented design. Decoder behaves more like a rendering engine. Variable block size motion compensation. - Part 2: DIVX - Part 10: H.264 / AVC (HD-DVD, Blue-ray) Many additions[9]. Highly nontrivial decoding/encoding. Other formats: - Theora (in Ogg container). Open but less well performing codec. - H.261 Low bit rate (ISDN) video conferencing. - H.263 Low bit rate video converencing. A variant (Sorenson H.263) is used in Apple Quicktime and Adobe Flash Video. Original base for Real Video. Part of 3GPP (MMS). [1] http://en.wikipedia.org/wiki/MPEG-1#Part_2:_Video [2] http://en.wikipedia.org/wiki/Advanced_Audio_Coding [3] http://en.wikipedia.org/wiki/MPEG [4] http://en.wikipedia.org/wiki/Flv [5] http://en.wikipedia.org/wiki/Theora [6] http://en.wikipedia.org/wiki/H.261 [7] http://en.wikipedia.org/wiki/H.263 [8] http://en.wikipedia.org/wiki/Video_codec [9] http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC Entry: Binary Coding Date: Tue Jul 7 10:02:48 CEST 2009 [1] http://en.wikipedia.org/wiki/Huffman_coding [2] http://en.wikipedia.org/wiki/Entropy_coding [3] http://en.wikipedia.org/wiki/Arithmetic_coding [4] http://en.wikipedia.org/wiki/Context-adaptive_binary_arithmetic_coding Entry: MDCT Date: Tue Jul 7 10:54:32 CEST 2009 "Lapped" transform. Used in coding applications where overlap can be used to reduce artifacts. Note that H.264 uses several "exact-match" DCT variants. 
[1] http://en.wikipedia.org/wiki/Modified_discrete_cosine_transform Entry: Navier-Stokes Date: Mon Sep 28 14:47:26 CEST 2009 Time to look at the Navier-Stokes equation[1] and fluid dynamics. There is a description in the Princeton Companion to Mathematics[2] section III.23 p196. Another starting point is Feynman's Lectures on Physics, part II chapters 40-41. The Navier-Stokes (and Euler) equations are non-linear partial differential equations in terms of a velocity vector field u(x) and a pressure distribution p(x). The Euler equation is N-S with viscosity v=0. Apparently, even for small v these behave quite different. Additionally one can pose a constraint that expresses the fluid is not compressable. E and N-S are Newton's law applied to an infinitesimal portion of the fluid. The nonlinearity (which causes turbulence) is due to convective acceleration, which is an acceleration associated with the change in velocity over position. In practice, the interaction between large and fine scale behaviour requires such a fine mesh that the problem is often approximated, i.e. using the Reynolds-averaged Navier-Stokes equiation[3]. Following Feynman in chapter II.40 (The Flow of Dry Water). A liquid moves under shear stress. Hydrostatics, fluid in rest, amounts to the absence of shear stress. The important result here is that the force per unit volume is -grad(p). The important conclusion is that for a force defined by a potential energy, there is no general equilibrium solution if the density is not constant. (An exception arises when density depends only on pressure). For dynamic fluids, the clue seems to be to translate a velocity vector field description to a particle path description, and formulate Newton's law of conservation of momentum for that path. I.e. with dx denoting infinitesimals and a 2D equation. This yields an equation nonlinear in the velocity components. The force component of the N-S equation is where most of the variation lies. [1] http://en.wikipedia.org/wiki/Navier-Stokes_equations [2] isbn://0691118809 [3] http://en.wikipedia.org/wiki/Reynolds-averaged_Navier-Stokes_equations Entry: Material Derivative Date: Mon Sep 28 18:36:22 CEST 2009 Type: tex \def\duxx{\frac{\delta u_x}{\delta x}} \def\duyy{\frac{\delta u_y}{\delta y}} \def\duxy{\frac{\delta u_x}{\delta y}} \def\duyx{\frac{\delta u_y}{\delta x}} The nonlinearity of the Navier-Stokes equation comes from the expression of the derivative of a velocity vector along a particle path, starting from a velocity field. Let $n$ be the spatial dimension. Given a velocity field $u(x,t) = (u_1, \ldots, u_n)$ expressed in terms of $n$ spatial coordinates $x = (x_1,\ldots,x_n)$ and a time coordinate $t$, we can use the spatial and temporal partial derivatives $\frac{\delta u_i}{\delta x_j}$ and $\frac{\delta u_i}{\delta t}$ to make the following linear approximations in terms of time and space offsets. 
(1) The velocity vector at a space offset $\Delta x = (\Delta x_1, \ldots, \Delta x_n)$ and a time offset $\Delta t = t^{+} - t$ from $(x,t)$ is $(u_1^{+},\ldots,u_n^{+})$ with $$u_i^{+} = u_i + \frac{\delta u_i}{\delta t} \Delta t + \sum_{j=1}^n \frac{\delta u_i}{\delta x_j} \Delta x_j.$$ (2) A particle starting at a spatial point $x$ and time $t$, flowing according to the velocity vector field $u$ will after a time $\Delta t$ end up at the point $x^{+} = (x_i^{+},\ldots,x_n^{+})$ with $$x_i^{+} = x_i + u_i \Delta t.$$ Substituting $\Delta x_j = x_j^{+} - x_j = u_j \Delta t$ in (1) and dividing by $\Delta t$ then gives an approximation for the change of the velocity vector $\Delta u_i = u_i^{+} - u_i$ over a time interval $\Delta t$, along a flow path of the vector field $u$ through the point $(x,t)$. $$\frac{\Delta u_i}{\Delta t} = \frac{\delta u_i}{\delta t} + \sum_{j=1}^n \frac{\delta u_i}{\delta x_j} u_j.$$ The expression on the right hand side remains the same for the limit $\Delta t \to 0$ when the linear approximations are replaced by exact expressions, and is called the material derivative of $u$. For a generic (scalar-valued) multivariate function $f(x,t)$ and a velocity field $u(x,t)$ the fact that this is a simple application of the chain rule becomes more clear when we write the total derivative of $F(t) = f(x(t),t)$ as $$\frac{dF}{dt} = \frac{\delta f}{\delta t} + \sum_{j=1}^n \frac{\delta f}{\delta x_j} \frac{d x_i}{dt}.$$ Applying this for $f$ ranging over each component $u_i$ of $u$ then gives the previous expression. %[1] http://en.wikipedia.org/wiki/Substantive_derivative Entry: Informal proofs are difficult Date: Sat Oct 17 16:40:31 CEST 2009 This has bothered me for quite a while. Formal proofs can be quite tedious to read, but at least they will not put you in a position where you're reading a proof and you just can't see why a certain step is made, because the author assumed it was obvious. Granted, the ``obvious'' steps usually are obvious, that is from the perspective of somebody who understands the proof, or has a background in the matter (i.e. the justification that's left implicit is used a lot in other proofs). Entry: HSVD Date: Mon Nov 16 14:34:17 CET 2009 A summary of [1]. CHAPTER 2 Explains basic principles of Linear Prediction (LP) and State Space (SS) based methods for parameter estimation. The flow of HSVD is like this: 1. start with a signal of length N 2. create a hankel matrix (N-q x q) -> choose q 3. compute SVD 4. truncate: take first n singular vectors -> choose n 5. LS solve shift equation 6. compute eigenvectors The two values to be chosen are n (signal space dimension) and q (signal + noise space dimension). The n parameter follows from what one is looking for, but how to determine q? Translating LP to an AR problem shifts the statistics but makes it easier to estimate using LS. The essence seems to be this: trunctation by SVD is essentially noise reduction. This then makes the LS or TLS-based estimation for LP and SS approaches more precise. However, the statistically optimal (ML) estimates require nonlinear optimization. In general SS methods are better for larger number of poles since they obtain poles from an eigenvalue decomposition instead of polynomial rooting, which is ill-conditioned for large order. [1] ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/vanhamme/phd.ps Entry: WSOLA - Waveform Similarity OverLap Add. 
Date: Tue Jan 26 18:15:18 CET 2010 Gist: An extension of the OLA time stretching algorithm where source frames are taken from a neighbourhood around the point a normal OLA would pick the frame. The offset is taken such that the overlapped part has maximal correlation. [1] http://etros1.vub.ac.be/Research/DSSP/Publications/int_conf/ICASSP-1993.pdf [2] http://music.columbia.edu/pipermail/music-dsp/2001-December/046763.html [3] http://mffmtimescale.sourceforge.net/ [4] http://diuf.unifr.ch/pai/people/juillera/PitchTech/Preview.html Entry: Never invert a matrix! Date: Wed Feb 3 17:22:40 CET 2010 Actually, there is a reason why one would want to invert it anyway: if the same matrix inverse is applied to a large number of vectors inversion might pay off because it can be parallellized. Solving a set of equations still has a longer data dependency chain. [1] http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/ Entry: Binary Streams as Stochastic Variables (PDFs) Date: Thu Feb 18 10:44:38 CET 2010 (EDIT: Mon Dec 27 11:20:35 EST 2010: check [2]: Need to make sure that multiplication doesn't cause any undesired spectral content to modulate into the signal band.) Goal: find a way to look at boolean functions on binary bitstreams coming from sigma-delta modulators. I.e. where the bitstream represents an analog signal that is the result of applying a reconstruction filter R after mapping the binary signal into the analog domain. 1. Assumption: we're only interested in the long-term average, so in first approximation we deal with stochastic variables, not stochastic processes. 2. The ``physical'' variable we're interested in is modeled as the expected value of the stochastic variable, which approximates the real implementation which is a running average / low-pass filter. 3. As another simplification we assume variables are not correlated. Using these simplifications, we can work with Probability Density Functions; a stochastic variable is completely specified by its PDF. Let's denote PDFs as this: a(v), b(v), c(v) PDF of single variable ab(v1,v2), ac(v1,v2) joint PDF of 2 varaibles abc(v1,v2,v3) joint PDF of 3 ... Since we use independent variables, joint PDFs can be written as products: ab() = a()b(). Additionally PDFs of binary stochastic variables can be represented by a single number \in [0,1] : the probability the variable is "1" (or "0"). We will abuse this by taking a to mean a(1), i.e. the number a \in [0,1] is the probability that variable a == 1. Using this notation, computing expected values of binary boolean functions is straightforward. I.e. for the XOR function: <+> | 0 1 ----+---- 0 | 0 1 1 | 1 0 E (a <+> b) = 1 ab(0,1) + 1 ab(1,0) + 0 ab(1,1) + 0 ab(0,0) = ab(0,1) + ab(1,0) = a(0)b(1) + a(1)b(0) [independent] = (1 - a) b + a (1 - b) = a + b - 2ab To make sense of this result it's simpler to have the binary values represent {-1,1}. The transformation F that maps [-1,1] into [0,1] is: F x = (1 + x) / 2 This then yields: a <+> b = a + b - 2 a b implies (F a) <+> (F b) = F (- (a b)) In other words: the XOR operation is _inverted multiplication_ of expected values of variables on {-1,1}. The function <=>, the negation of <+> gives a similar result. <=> | 0 1 ----+---- 0 | 1 0 1 | 0 1 E (a <=> b) = ab(0,0) + ab(1,1) = a(0)b(0) + a(1)b(1) [ independent ] = (1 - a) (1 - b) + ab = 1 - a - b + 2ab a <=> b = 1 - a - b + 2ab implies (F a) <=> (F b) = F (a b) The EQV operation is multiplication of EVs of vars on {-1,1}. 
It seems it's only interesting to look at equivalence classes under inversion of inputs and outputs. See also [1]. I.e. the only other interesting operation to look at is AND. Similar functions (OR,=>,NAND,...) will yield the same result qualitatively by inverting inputs or outputs. . | 0 1 --+---- 0 | 0 0 0 | 0 1 E (a . b) = ab(1,1) = a(1)b(1) = a b So AND,OR,=>,... are multiplications; just in a different domain [0,1]. Conclusion: these binary logic operations give bilinear functions. On the [0,1] domain they are binary linear combinations of the 4 terms: ab, a(1-b), (1-a)b, (1-a)(1-b), where a,b are the probabilities/expected values of the random variables a and b. -- Further work: * What happens when variables are not independent? Can the correlation itself do anything useful? For signals close to 1/2 of the range, i.e. represented by an alternating 01 sequence, it is difficult to be practically independent. However, two coin toss processes are independent, so this should be built-in at the core somehow. The question then becomes: how to implement a modulator? I.e. how to introduce bias from a uniform random source, given the desired expected value. I.e. is it possible to take a structured signal and decorrelate it with a noise process? Doesn't seem to be straightforward, i.e. XOR or EQV with a uniform process are equivalent to multiplication with 0. It seems that one of the a-symmetric operators is needed to create non-uniform distributions from uniform ones. I.e. the AND of two 1/2 noise streams gives a 1/4 noise stream. * Can we use the apparent dependency on randomness (i.e. from noise injection) to transmit other information? * What about looking at these functions as ``modulation maps'' instead of binary endomaps of [0,1] or [-1,1], i.e. as: [0,1] x [-1,1] -> [-1,1] I.e. one of the inputs represents an asymmetric signal, while the other one is symmetric. Some literal Haskell code: > a `xor` b = a + b - 2 * a * b > a `eqv` b = (1 - a) * (1 - b) + a * b > > f x = (1 + x)/2 > > a === b = (abs (a - b)) < 1e-14 > > e_xor a b = ((f a) `xor` (f b)) === (f (- (a * b))) > e_eqv a b = ((f a) `eqv` (f b)) === (f (a * b)) [1] entry://20090609-145519 [2] entry://20101227-110515 Entry: Rhythm & Dance Date: Thu Apr 8 13:19:24 EDT 2010 Instead of looking at rythm as a series of events to be interpreted by a listener, it might be simpler to look at it as a series of motions, i.e. each up needs a down. The point being: "performability" might be a key element in rhythm. Performability meaning: what you hear is not a sequence of sounds, but a representation of the drummer's hand movements. I.e. rhythm _needs_ to be dancable. From this point it might make sense to approach rhythm from the phisical system way (choreography) as opposed to the auditory perception way. I.e. biological motion. A good base migh be to start from acceleration (muscle tension) and connected body dynamics (joints produce circular behaviour). Entry: Euler-Lagrange equations Date: Fri Apr 9 13:53:14 EDT 2010 Type: tex From Arnold[1] chapter 3, with slightly different Haskell-inspired notation. Suppose a functional $\Phi :: (\R \to \R) \to \R$ can be written as $$\Phi(x) = \int_{t_0}^{t_1} \! L(x,x',t) \, dt$$ where $L :: (\R,\R,\R) \to \R$ is a function of 3 variables and $x' = {dx \over dt}$ denotes the derivative of the curve $x :: \R \to \R$. 
The function $x :: \R \to \R$ is an \emph{extremal} of the functional $\Phi$ if and only if $$\frac{d}{dt}\left\frac{\partial L}{\partial x'}\right - \frac{\partial L}{\partial x} = 0.$$ The term $\frac{\partial L}{\partial x'}$ means the partial derivative of function $L(x,x',t)$ of 3 variables to its second variable. Substituting variables, this denotes the same function as $\frac{\partial}{\partial v_2}L(v_1,v_2,v_3)}.$ The result of the partial derivation operation is a new expression in terms of $(x,x',t)$. In this expression, one subsitutes the unary function $x(t)$ and its derivative $x'(t)$, yielding a unary function of $t$, of which the derivative to $t$ can be computed. Note that this is 300 year old sloppy notation. See next post and SICM[2], where Spivak's unambiguous notation[3] is used. This theorem is readily generalized to a multi--variate functional $\Phi :: (\R \to \R^n) \to \R$ operating on $n$--dimensional curves $:: \R \to \R^n$ and a function $L :: (\R^n, \R^n, \R) \to \R$ of $2n+1$ variables. Summarized, if a functional only depends on \emph{positions}, \emph{velocities} and \emph{time}, the Euler--Lagrange equation can be used to find concrete equations for an \emph{extremal} curve of the functional. %[1] isbn://0387968903 %[2] http://mitpress.mit.edu/SICM/ %[3] isbn://0805390219 Entry: Sussman about crappy notation Date: Fri Apr 9 20:03:34 EDT 2010 In the video[1] Arnold's book[3] is mentioned two times (and that he's a SOB). Sussman doesn't like the ambiguous notation. Arnold[3] p246: (not breaking with tradition and explaining what the Langrange equations mean) and p258: ``It is necessary to use the apparatus of partial derivatives, in which even the notation is ambiguous.'' Sussman uses the notation based on Spivak's Calculus on Manifolds[5] which handles it properly: partial differentials expressed in terms of position of parameter instead of names that can have different meanings (parameters or functions). See p.44 at the end of the chapter about differentiation[6]. His point: programming forces one to be precise and formal without being excessively rigorous. [1] http://video.google.com/videoplay?docid=-2726904509434151616&hl=en# [2] http://mitpress.mit.edu/sicm/ [3] isbn://0387968903 [4] http://www.docin.com/p-708956.html [5] http://en.wikipedia.org/wiki/Calculus_on_Manifolds_(book) [6] http://books.google.com/books?id=POIJJJcCyUkC&pg=PA44#v=onepage&q&f=false Entry: Legendre Transform Date: Sat Apr 10 09:47:32 EDT 2010 Type: tex \newcommand{\ip}[2]{\langle {#1} , {#2} \rangle} The Legendre Transform (LT) shows up in classical mechanics as the link between the Legendre equations and Hamilton equations. Some notes about what this transform actually does. First, how to compute? The LT of $f(x)$ is defined as the function $g(p) = \max_x F(x,p)$ where $F(x,p) = px - f(x)$. For a convex function the maximum occurs where $\nabla_x F(x,p) = 0$ or $p = \nabla_x f(x)$. Solving the latter equation yields an expression for $x(p)$ which can be substituted in $F$ to yield $g(p) = F(x(p),p)$. \begin{itemize} \item The LT transforms a convex function on a vector space to a convex function on the dual vector space. The coordinates $p = \nabla x$ can be seen as a representation of this dual space. (TODO: less handwaving please). \item Two functions make up an LT transform pair if their first derivatives are inverses of each other. Note that the derivative of a convex function is a monotone function and therefore invertible. 
\item For a quadratic form $f(x) = x^T F x$, we have $p(x) = 2 F x$. After subsituting $x(p) = \frac{1}{2} F^{-1} p$ in $g(p) = \ip{p}{x(p)} - f(x(p))$ we get $g(p) = \frac{1}{4}p^TF^{-1}p$. Note that the respective derivatives $x \to 2Fx$ and $p \to \frac{1}{2}F^{-1}p$ are inverses. \item For a QF we have additionally that $g(p) = f(x)$ or that $f(x) = \ip{x}{p(x)} - f(x)$ or $f(x) = \frac{1}{2}\ip{x}{\nabla f}$. This is not true in general. \item The Langrangian of a physical system $L = T - U$ is (assumed to be) convex w.r.t. the (generalized) velocities. This seems to make general sense: the kinetic energy grows for velocities that go to infinity, \emph{independent of direction}. In addition, $T$ will likely be a quadratic form. Note that even if the generalized coordinates live in a non-flat manifold, the velocities will still be vectors in the tangent space, and kinetic energy is expressed in terms of the weighted magnitudes of these vectors (a positive definite QF). \end{itemize} %[1] http://arxiv.org/pdf/0806.1147v2 %[2] isbn://0387968903 %[3] http://en.wikipedia.org/wiki/Legendre_transformation Entry: Computer Algebra System (CAS) Date: Sat Apr 10 16:21:09 EDT 2010 From what I gather, a generic "simplify" method is a collection of hacks, because there is no generic notion of what "simple" actually means. Side-stepping that issue, particular transformations are usually well-defined. From my own limited experience with static analysis & abstract interpretation, it seems best to map expressions between different domains. From an engineering point of view it is almost always simpler to split any transformation stap in at least two steps, and have formal intermediate languages. For any transformation step: * Convert equivalent expressions (L1) to a unique normal form (L2). * Express complicated transformations as directed equalities (pattern matching functions) on normal forms (L2) to some target domain (L3). * Optionally, re-embed the target domain (L3) into the original (L1) or some other domain. Entry: What is a differential? Date: Mon Apr 12 12:52:09 EDT 2010 Type: tex A differential of a function on a manifold $f :: M \to \R$ is a linear functional $df_p(t) :: T_p \to \R$ defined on the tangent space $T_p$ at $p \in M$. Entry: Conservative Systems and Chaos Date: Mon Apr 12 21:08:26 EDT 2010 Type: tex Apparently it is possible to have chaotic behaviour in a conservative (hamiltonian) system. A HS conserves phase-space volume, but not necessarily shape. Think of kneading dough: volume is conserved, but an initial ball of raisins spreads out. How to make an example? It needs to be a-symmetric to avoid integrability. Take a potential like $$U(x,y) = x^2 + y^2 + \frac{a}{1 + (x-b)^2 + y^2}$$ which looks like a harmonic oscillator potential well $x^2 + y^2$ at large scale, but has a bump at $(b,0)$. Can it produce chaotic behaviour? For simple systems, can the Lyapunov coefficient (LC) be computed locally? Sure it can, but what matters more is the \emph{average} LC, no? Need more study here\ldots Entry: d'Alembert Principle Date: Sun Apr 18 13:24:28 EDT 2010 Basic idea: the ``constraint force'' that keeps trajectories on a configuration manifold embedded in Euclidian space can never perform any work, i.e. change the energy of the system. This is the same as saying that the constraint force is always orthogonal to the tangent space of the configuration manifold. 
Entry: Spinors and Simplectic Geometry (Hamiltonian Mechanics) Date: Wed Apr 28 22:41:41 EDT 2010 Ever since an ex-collegue of mine made the remark that he heard some physicist friend of his say that ``I wonder why spinors aren't used more in signal processing'', I'be been intrigued by the idea but didn't really see the connection. Lately I've been looking into classical mechanics a bit more, and I'm about to enter the Hamiltonian Mechanics part. I take an interest in this as I'd like to see the links with sound synthesis. [1] http://en.wikipedia.org/wiki/Spinor [2] http://en.wikipedia.org/wiki/Symplectic_geometry [3] http://en.wikipedia.org/wiki/Linear_symplectic_space Entry: Differential Music Date: Fri Apr 30 16:17:58 EDT 2010 Type: tex In an attempt to create music in the form of differentiable functions on tori, I ran into the problem of simulating events anyway. The apporach that seems to work well is to apply a function that would return a smooth pulse wave when applied to $x(\theta) = cos \theta$. This lead to $$x \to \frac{1}{1 + k^2(1 + x)}.$$ Another one that might work is $$x \to \frac{1}{1 + (kx)^2}.$$ The reason for using differentiable functions is that they open the door for local analysis of parameter spaces of sound generating networks, and other analytic (computer algebra) techniques. The next question is: how to bring rhythm into the game? Entry: Cleaning up PhD research papers Date: Sun May 30 13:33:15 CEST 2010 I'm keeping general introductory papers, and audo synth + FX papers. For sinusoidal modeling, Petre Stoica seems to be a good starting point for generic approach. The other direction was Sabine VanHuffel for the fast algorithms. For wavelets it's Daubecies and Sweldens, I'm keeping some introductory papers. I'm throwing away the paper forms of specific ad-hoc papers about: - Blind source separation (statistics based) - CAS (computational source separation: perception model based) - Computational Sciene Analysis - Matching Pursuit (iterative filtering) - Audio Coding - Sinusoidal + complex exponential modeling - Wavelets + Applications to approximate LU I was not able to integrate most of that knowledge during my PhD years because of the many ways to characterize errors (what mathematical framework to use to express the modeling problem) and the non-linearity of the resulting optimization problems, which makes practical comparison quite difficult. Possible variations: amplitude or amp+freq estimation, polynomial phase, optimality (matching vs. linear models + error), noise coloring, etc. I'm not keeping Matching Pursuit papers: this method is too ad-hoc, and forgive my arrogance but I think I can re-invent most of this technology if I'm ever in need of it. I'm not keeping the sinusoidal modeling papers (peak picking, etc..). Same story as with MP: too ad-hoc and re-inventable. I'm limiting myself to more mathematically meaningful structures. I've vowed to not set a foot in the theatre of perception ever again! (I.e. speech recognition: most of this technology needs extra information about how the brain works; as it is that what we want to mimick in the first place.) Entry: Wavelets : local trigonometric basis Date: Sun May 30 14:09:19 CEST 2010 I'm keeping some wavelet and transform papers. Might be interesting to look into. Entry: Exponential sinusoidal modeling Date: Wed Jun 2 15:05:45 CEST 2010 I'm keeping the exponential modeling papers. 
Most sinusoidal modeling problems also suffer from ambiguities (optimization problems with many similar local minima), which require disambiguating constraints and assumptions to become useful. However, these can usually stay a bit analytically tractable. I'd like to eventually work out some kind of survey/indexing into this kind of research. Entry: Exponential modeling Date: Sat Jun 19 13:38:34 CEST 2010 I worked on sinusoidal modeling for a while, and if there is one thing that I can remember, it is that there is a lot of wiggle room in how to tackle the problem on the technical side, and that what we hear isn't so clearly related to what we can measure: our perception fills in a lot of gaps. Sinusoidal modeling highlights: * Fourier transform and DFT (FFT). The FT is interesting for reducing the complexity of convolutions, performing a whitening transform for adaptive filters (as a filterbank), and as an ad-hoc method for analysing harmonic sounds. * Autoregressive models + ladder filters (orthogonal polynomials and Shur's algorithm). The theory behind this is quite pretty. For structured matrices, modified Shur and rank-revealing decompositions can be used to yield fast algorithms. * Non-linear phase signals. Moving from linear phase (exponential sinusoidal) to chirp and higher order polynomial phase adds more complexity. There doesn't seem to be much structure to explore here. * Nonlinear optimization. Writing sinusoidal modeling as a generic optimization problem can use the fact that derivatives of exponentials in general do not look horrible. However, these functions are usually multi-modal, and give rise to ambiguities, suggesting that there are "many ways to see a signal" as a sum of sine waves. * Subspace based techniques. I see two distinct classes: one that operates directly on the signal/noise subspaces obtained through the SVD decomposition of signal marices, and one that uses signal spaces to further derive approximate linear system parameters (sinusoidal parameters). Entry: Subspace Identification Date: Sat Jun 19 15:02:00 CEST 2010 Subspace Identification[1] refers to system identification of linear MIMO dynamical models using linear algebra techniques (QR + SVD). The basic idea revolves around a two-step procedure: 1. Given input/output data, construct a _Kalman state sequence_ by projection onto a limited subspace. 2. Obtain state matrices from this using linear least squares. The fact that 1. is at all possible provides the main leverage. ( The following is generalized from the HSVD method. I'm not sure if it completely applies to the stochastic version in [1], but I have the impression it does. I need a more intuitive grasp of the Kalman filter first. ) Such SVD-based methods are good in _practice_ but the characterization of the approximation is theoretically very far removed from what can be obtained by maximum likelyhood methods using a more direct approach. Subspace methods have been used to categorize the "unreasonable effectiveness" of mathematics. I'm definitely no expert, but it seems to me that they fall more into the class of convenient, accedental hacks. (The link between the estimation error and an ML approach is not clear : probably there is none due to radical re-interpretation of the problem?) [1] ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/nackaerts/other/alln.ps.gz Entry: Maximum Likelihood estimation and the Kalman Filter Date: Sat Jun 19 19:25:07 CEST 2010 Type: tex ( See [6] for a good introduction to ML and Bayesian estimation. 
) Maximum likelihood estimation (MLE) is a way to estimate model parameters based on parameterized probability density functions. \begin{enumerate} \item Construct a conditional model $P(x|\theta)$ which gives the probability density function (PDF) of observables $x$ in terms of model parameters $\theta$. \item Interpret the PDF as a function $L(\theta) = P(x_0 | \theta)$, setting the observables to a particular observed outcome $x_0$, and find the $\theta_0$ that maximizes this function. \end{enumerate} This gives a $\theta_0$ that \emph{best explains} the data, as the probability of observing $x_0$ is highest for the parameter vector $\theta_0$. A simple example of MLE is linear least squares estimation with uniform noise assumption. From [6] p. 91, the Kalman filter arrises in a Bayesian framework from calculating updates of conditional probabilities, taking into account the next observation step. In the case where PDFs of parameter priors and noise sources are gaussian, there is a efficient update mechanism to compute parameter representation of these PDFs (mean and covariance matrix). In general the KF gives the best possible linear estimator in a MSE sense. In [6], p. 71-77 the differences between ML, MMSE and MAP estimators is explained using the concept of risk[7] which combines cost with probability. An estimator minimises risk. Different cost functions give rise to different estimators. In [8] a brief explanation is given about how one arrives at the Kalman filter relations from a recursive Bayesian setting. % [1] http://www.tina-vision.net/docs/memos/1996-002.pdf % [2] ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/nackaerts/other/alln.ps.gz % [3] http://en.wikipedia.org/wiki/Maximum_likelihood % [4] http://en.wikipedia.org/wiki/Likelihood % [5] http://en.wikipedia.org/wiki/Ordinary_least_squares % [6] http://www-sigproc.eng.cam.ac.uk/~sjg/book/digital_audio_restoration.zip % [7] http://en.wikipedia.org/wiki/Risk_%28statistics%29 % [8] http://en.wikipedia.org/wiki/Kalman_filter#Relationship_to_recursive_Bayesian_estimation Entry: Carette's Implicit Model Specialization Date: Tue Jun 22 12:22:06 CEST 2010 Remarks from reading [1]. * Model transformation is straightforward, but the selection of the techniques still requires human insight. (I.e. too many degrees of freedom to automate). * Not only computation, also deduction is necessary (i.e. to prove that a particular computation can be eliminated). * Look into active libraries[2] and telescoping[4]. [1] http://www.cas.mcmaster.ca/~carette/newtongen/ [2] http://mozart-dev.sourceforge.net/activelib.html [3] http://awurl.com/SvpJqXPwZ [4] http://telescoping.rice.edu/ Entry: Digital Audio Restauration Date: Tue Jun 22 15:14:45 CEST 2010 Digital Audio Restoration - A Statistical Model-Based Approach by Simon J. Godsill and Peter J. W. Rayner[1]. It contains interesting introduction on Bayesian and ML estimation. [1] http://www-sigproc.eng.cam.ac.uk/~sjg/book/digital_audio_restoration.zip Entry: Annoyed by Computer Modern Date: Tue Jun 22 15:18:31 CEST 2010 I have this book [1] in print. The printed version is non-glossy paper, and it has a good looking "spill". Computer modern is too thin on the screen and on a laser printer. Was it designed with this spill in mind? TAOCP has less spill, non glossy but finer paper, but looks better than laser print. 
[1] http://www-sigproc.eng.cam.ac.uk/~sjg/book/digital_audio_restoration.zip Entry: Exponential sinusoidal modeling Date: Thu Jun 24 17:00:53 CEST 2010 Type: tex There are many ways to formulate the exponential sinusoidal modeling problem. It is important to distinguish between the \emph{stochastic} model, and the \emph{estimator}. A basic element of all approaches is the linear prediction (LP) or autoregressive (AR) relation $$y_n = \sum_{i=1}^N a_i y_{n-i}.$$ Placing this in a statistical framework can yield many variants. I.e. the stochastic AR model $$y_n = \sum_{i=1}^N a_i y_{n-i} + e_n,$$ and a noisy sum-of-sinusoids model $$x_n = \sum_{i=1}^N a_i x_{n-i} \text{ and } y_n = x_n + e_n.$$ The difference between these is that in the former, the noise signal can be seen as driving the input of a dynamical system, while in the latter there is only measurement noise. For each stochastic model, several estimators can be created. An estimator transforms an observation into a guess for the corresponding parameters, depending on known or estimated statistics. In the case of the AR model, when the signal covariance matrix is known, the AR model can be solved exactly. It is possible to solve the model in the LS sense, which is equivalent to building an estimate for the signal covariance. Entry: Superfast Toeplitz algorithms Date: Fri Jun 25 10:28:42 CEST 2010 The structure in linear algebra problems involving Toeplitz matrices can be exploited using generalizations of the Levinson[2] and Schur algorithms[4]. Further exploitation is possible using superfast algorithms based on the FFT or DCT. [1] http://people.cs.kuleuven.be/~marc.vanbarel/software/ [2] http://en.wikipedia.org/wiki/Levinson_algorithm [3] http://www.math.niu.edu/~ammar/cortona/cortona.html [4] http://nalag.cs.kuleuven.be/papers/ade/BWG93/ Entry: Displacement Rank & Generalized Schur Date: Fri Jun 25 14:58:42 CEST 2010 Type: tex Reading Mastronardi's PhD diss[1]. One of the things that has put me off before is the horrible notation. Many of the formulas talk about symmetric matrices, which yields a lot of $AJA^T$-style terms. Isn't there a more readable way to write this down? Anyway, the algorithms are quite elegant: instead of operating on a matrix $R$, one performs elimination operations on generators only, reducing the complexity order by one. What is interesting though is that one leaves the domain of signal algorithms to be able to use matrix algorithms, but one re-enters the domain of signal algorithms in the implementation, recovering some of the matrix structure. % [1] http://users.ba.cnr.it/~irmanm21/Welcome.html Entry: Predict and Update Date: Thu Jul 8 10:11:26 CEST 2010 If you look at the equations for the Kalman filter[1], it is remarkable that they are factored in two steps: 1. predict output using current input, previous state estimate and system equations. 2. update state estimate and noise statistics by observing the real (noisy) system output. [1] http://en.wikipedia.org/wiki/Kalman_filter Entry: Binary clocked modulation & averages Date: Thu Jul 22 07:26:44 CEST 2010 About this[1].. The problem seems to be that there isn't really much "space" when signals are binary and clocked; i.e. there is a finite amount. Maybe instead of trying to use infinite techniques (statistics: limits) it's better to look at properties of discrete structures. [1] entry://20100218-104438 Entry: CRC algo Date: Fri Jul 30 12:03:25 CEST 2010 What's with the table of coefficients in this CRC32 implementation? 
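My guess, without having traced that exact file: it's the usual byte-at-a-time lookup table for the reflected CRC-32 polynomial 0xEDB88320, i.e. each entry is the CRC of one input byte, precomputed so the inner loop can fold in 8 bits per step. A sketch in Python (the assumption being that cyg_crc32.c uses this standard table together with the standard initial value and final inversion):

  POLY = 0xEDB88320   # bit-reversed form of the CRC-32 polynomial 0x04C11DB7

  def make_table():
      # table[n] = CRC of the single byte n, processed LSB-first
      table = []
      for n in range(256):
          c = n
          for _ in range(8):
              c = (c >> 1) ^ POLY if c & 1 else c >> 1
          table.append(c)
      return table

  TABLE = make_table()

  def crc32(data, crc=0xFFFFFFFF):
      # byte-at-a-time update: fold in one table entry per input byte
      for b in data:
          crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
      return crc ^ 0xFFFFFFFF

  assert crc32(b"123456789") == 0xCBF43926   # standard CRC-32 check value

If the table in the file does not match the output of make_table(), then it's some other CRC variant and this sketch doesn't apply.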
( code: COPYRIGHT (C) 1986 Gary S. Brown. ) [1] https://aachen.uni-dsl.de/svn/unidsl_firmware/backfire/trunk/backfire/tools/firmware-utils/src/cyg_crc32.c Entry: Minimal erase counter Date: Tue Aug 3 09:06:50 CEST 2010 How to make a counter that has a minimal number of erases (0->1 transitions). i.e. 1111 erase 1110 1100 1000 1101 erase 1001 0001 ... It boils down to finding a sequence of paths through the bit-flip hypercube from 111... to 000... This is probably not unique, but how to find a representative with a simple structure? I.e. how to turn the directed (1->0) hypercube graph into a tree? (And step 2, how to order the branches of that tree?) The idea is to give each node only a single parent. I.e. given a bit vector, there is an algorithm to compute its parent. 10010011 -> 11010011 101001 -> 111001 Entry: Quadratic Residues Date: Sat Aug 7 20:25:32 CEST 2010 ( From conversation with Antti; looking up some background information.. ) Exactly half of the integers in range [1..(p-1)] are quadratic residues (QR), and the other half are non-residues of prime p. q is QR if there exists a x : x^2 = q (mod p) In other words, x is the square root of q in the field Z/pZ. I.e. for p = 5. 1^2 = 1 2^2 = 4 3^2 = 4 (9) 4^2 = 1 (16) Meaning 1 and 4 are QRs and 2 and 3 are not. Why is this? Each number can be written as g^n, where g is a generator of the cyclic multiplicative group of the field Z/pZ. If n is even, g^n has a square root as g^(n/2). With the exception of p-2, exactly half of the elements of the field Z/pZ has a square root, as there are always an even number of elements in the multiplicative group, which has order p-1. [1] http://en.wikipedia.org/wiki/Quadratic_residue Entry: Peak compression Date: Sun Aug 8 06:57:58 CEST 2010 For many applications that have constraints on bandwidth and dynamic range at the same time, it is useful to be able to control the maximal signal amplitude. The simplest way to produce wide-band constant-amplitude pulses is to use chirps. Another intriguing way, when the spectral content is known, is to use Schroeder phases[4][5]. ( Using n(n-1) phase offsets -- compare to n^2 for Newman phases. ) Now the interesting part is that there is a _digital_ variant of this. Here one starts from a signal with minimal dynamic range: a (PSK modulated) binary signal. Binary can only represent 1 and -1. When modulated as a PSK signal it is an analog signal with a small dynamic range. (Smallest known? See BER[3]). A Barker code[2] then is a sequence that is near-orthogonal to its own shifts, meaning that its autocorrelation has a distinct peak at shift=0, but has maginitude <= 1 for other shifts. (Apparently, orthogonality is too strict a constraint). Local refs: Optimal Binary Sequences for Spread Spectrum Multiplexing[6]. Synthesis of Low-Peak-Factor Signals and Binary Sequences With Low Autocorrelation[7]. [1] http://en.wikipedia.org/wiki/Pulse_compression [2] http://en.wikipedia.org/wiki/Barker_code [3] http://en.wikipedia.org/wiki/Bit_error_rate [4] http://books.google.be/books?id=hQ6bl3RG04sC&pg=PA290&lpg=PA290&dq=schroeder+phases&source=bl&ots=7IT7kYGcyu&sig=45_66hBv0Xxo1yw9YBlyXVU0usY&hl=nl&ei=oT5eTJ-IJIX80wSc-8XHBw&sa=X&oi=book_result&ct=result&resnum=6&ved=0CEIQ6AEwBQ#v=onepage&q=schroeder%20phases&f=false [5] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1054411 [6] md5://c35e337e61dfbcd980e3728768f71af4 [7] md5://9b2913693c005de2515e3efeb8548e47 Entry: Schroeder phases and quadratic residue diffusors. 
Date: Sun Aug 8 07:30:28 CEST 2010 What is the idea behind a quadratic residue diffusor (QRD)? What is the link with Schroeder and Newman phases? ( Is there any link with the "maximal irrationality" of the golden ratio. Max. irr. here means partial fraction expansion is all ones which means that rational approximation converges as slow as possible. ) In [2] it is mentioned that surprisingly, the DFT of the exponentiated sequence (amplitude 1 and n^2 mod p used as the _phase_) has a constant magnitude. Why is this? [1] http://www.dpi.physik.uni-goettingen.de/~mrs/Vortraege/Remembering-the-Good-Days-at-Bell-Laboratories-2004/index.html [2] https://ccrma.stanford.edu/~kglee/pubs/2dmesh_QRD_rev1/node2.html Entry: Finite Field Automorphisms Date: Sun Aug 8 08:44:55 CEST 2010 A binary LFSR sequence ultimately comes from a cycle in the multiplicative group of a finite field GF(2^n). This is a unique mathematical structure with a bunch of representatives that are all isomorphic. So, what does the automorphism group of a finite field look like? One place to look for some intuition is in Gold Codes[1]. This is the only practical application I know where two sequences with different generator polynomials are combined to form a new sequence with useful properties. Interesting properties of Gold codes: * Low autocorrelation (this is a "meaningless" property, as they are selected * The XOR (+) operation is closed PROOF: Say s1(n) and s2(n) are LFSR sequences of the same lenght. A gold code is constructed as g(n) = s1(0) + s2(n). We then have g(a) + g(b) = s1(0) + s2(a) + s1(0) + s2(b) = s2(a) + s2(b) = ... ??? (hmm... it seemed obvious - not awake yet) A link I found about finite field automorphisms[3]. Time to finally get into Galois Theory[4]. But what do I know? It has to do with the structure of the multiplicative group. I would think that the symmetry group gets larger if the multiplicative group is very composite. So, which GF(2^n) have prime order multiplicative groups? Are there primes that look like 2^n-1? Mersenne primes[7]. But is it really primes we're looking for, not maximally composite numbers? [1] http://en.wikipedia.org/wiki/Gold_code [2] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1054048 [3] http://everything2.com/title/automorphisms+of+finite+fields [4] http://en.wikipedia.org/wiki/Galois_theory [5] http://en.wikipedia.org/wiki/Safe_prime [6] http://en.wikipedia.org/wiki/Strong_prime [7] http://en.wikipedia.org/wiki/Mersenne_prime Entry: More PRN sequences: 1/x as fractional or p-adic number. Date: Sun Aug 8 10:13:06 CEST 2010 What I'm looking for are binary sequences that are naturally periodic and somehow "obvious". An interesting place to look is the fractional digits of 1/x, where x is some special number. At some point the digits start repeating. Apparently pseudo random noise generation is an application of Sophie Germain primes[2]. A prime number p is a Sophie Germain prime if 2p + 1 is also prime. ( Instead of the fractional expansion, can we also use the p-adic expansion? ) Constructing the bit sequence is done using long division. The period can be obtained from observing that in base b: 1 / (b^n - 1) = 0.0...10...10... <---> n digits For a number 1/q we find the smallest n such that q * x = b^n - 1 Here x is the repeated pattern in the fractional expansion of 1/q, one period of the pseudo random number sequence we are interested in. How can we make sure that n is as large as possible? 
In other words, find a q that maximises n, where n is the smallest number such that q divides b^n - 1. Rewriting it makes it a bit more clear. b^n = 1 + q * x -> b^n = 1 (mod q) How to make n as large as possible? Let's start with q a prime number. Then this becomes a statement about the multiplicative group of the field Z/qZ, which has order q-1. The multiplicative group of a finite field is cyclic and commutative, so completely determined by its order. Now, when q is a Sophie Germain prime of the form 2p + 1 the multiplicative group order is 2p. There are 3 possible cycles in this group, with periods 2, p, and 2p. What needs to be verified to make sure we have the maximal cycle length 2p is that b is a generator of this cycle (b is a primitive element), meaning b^2 != 1 mod q b^p != 1 mod q For b = 2 the first one is true when q > 3. The second one can be verified for given p and b, but probably also proven generally (??). [1] http://en.wikipedia.org/wiki/Sophie_Germain_prime [2] http://mathworld.wolfram.com/DecimalExpansion.html Entry: LFSR beacon detection Date: Sun Aug 8 10:35:36 CEST 2010 Actually, this might be interesting to use together with the krikit idea. Or how to make a variant of GPS using LFSR audio-beacons. Problem: we have plenty of bandwidth (say 3 - 10 kHz) but not dynamic range (environment noise, and "noisyness" of the transmitter). We want to transmit a low bitrate signal. Solution: using spread-spectrum techiques we can trade bandwidth for dynamic range. A PSK baseband signal after heterodyne filtering by the carrier wave is a complex phase signal. It might have some small residual modulation that is easy to track, so let's ignore it. The basic idea is to use the LFSR to perform a local prediction for the I and Q signals separately, keeping the operation itself linear. (I.e. if more than one bit of information is available, it can be used.) We integrate the output of this predictor to yield a slow-varying information carrier. A Schmitt trigger on the output of the integrator should be enough. Entry: Chirps from corrugated galvanised iron (golfplaat) Date: Sun Aug 8 11:58:43 CEST 2010 An open barn/shed with 3 walls made of corrugated galvanised iron[1], with lines arranged vertically, produces chirp impulse response at an angle of 45 degrees with the walls. Why? [1] http://en.wikipedia.org/wiki/Corrugated_galvanised_iron Entry: Size of GF(2^n) symmetry groups Date: Wed Aug 11 21:38:04 CEST 2010 I think I'm stuck with a false fact in my head.. Are the number of distinct irreducible polynomials and the size of the symmetry group of GF(2^n) actually the same? [1] http://oeis.org/wiki/A011260 [2] http://oeis.org/wiki/A000031 Entry: Binary Tree Permutations Date: Sat Aug 14 14:25:38 CEST 2010 ( Of relevance to understanding PF's memory model [1]. ) I'm trying to understand the group of binary tree permutations. Let's start with these two, intuitively basic operations: * swap: a b S = b a * rotate: a (b c) L = (a b) c (a b) c R = a (b c) Notation: Lower case letters or digits indicate trees, upper case letters are tree transformations. Group action of transformation T on tree t is denoted as tT. Binary trees have only one non-associative operator: (t1 t2) is the tree made from splicing together the subtrees t1 and t2. The relation t1 -T-> t2 is shorthand for t1 T = t2. Questions: 1) is this set minimal? and 2) does it generate all permutations of binary trees? For convenience, let's stick to permutations of infinite trees, but have them go only finitely deep. 
This means the group of transformations is infinite in size but more "regular" than a transformation group of finite-size trees as there are no special border conditions. The answer to 1) is no. It is possible to write L = SRSRS. Proof: a (b c) -S-> (b c) a -R-> b (c a) -S-> (c a) b -R-> c (a b) -S-> (a b) c We also can't write S in terms of L and R because both L and R leave invariant the ordering of the leaf nodes, while S does not. To understand more about 2) let's see what the relation is to lists. Suppose that a binary tree can be transformed to a (left or right leaning) list by a succession of L and R operations. The inverse is then also true: any list can be transformed back into an arbitrary binary tree with the same leaf-node order using a particular combination of L and R transformations. So is the premise true? Or do we need another operation? From inspection, it seems that rotating a tree to a (right leaning) list can be done by R operations on _subtrees_. The question is then: how to express R operations on subtrees? I.e. how to express: a ((b c) d) -> a (b (c d)) More generally, how to take a tree transformation T and re-root it at a different base node? Intuitively this seems to be a new operation as you "swipe something under the carpet". Let's try: if the basic operations of S,L,R can be constructed to operate only on the right (or left) node of the tree, expressed in terms of root-node S,L,R operations, then we can recursively transform any operation. r (a b) -S1-> r (b a) But I don't immediately see how to construct S1. My first attempt brought me to some other primitives: 3-element right-leaning rotations LS and SR, which are each other's inverse: 1 (2 3) -L-> (1 2) 3 -S-> 3 (1 2) 3 (1 2) -S-> (1 2) 3 -R-> 1 (2 3) The left-leaning variants are RS amd SL. (1 2) 3 -R-> 1 (2 3) -S-> (2 3) 1 (2 3) 1 -S-> 1 (2 3) -L-> (1 2) 3 Because they are embeddings of 3-element list rotations we have: (ST)^3 = (TS)^3 = I where T is L,R. From these basic operations I can't see how to implement the re-rooted swap operation: 1 (2 3) -> 1 (3 2), which seems to be a genuine primitive transformation. This reminds me of the hint at necessity of '>R' and 'R>' or 'dip' when writing Forth or Joy code: you need a second stack to hide intermediate results. (Additionally you need a "free" stack to implement creation and destruction of cells in a conservative way.) Then the question might be: if we replace S, R, L by S', R', L' all rooted at the right subtree, what can we generate? * swap': r (a b) S' = r (b a) * rotate': r (a (b c)) L' = r ((a b) c) r ((a b) c) R' = r (a (b c)) For sure, the group these generate is isomorphic to the other one, but it's not a complete set as it leaves the whole left subtree invariant. Here 'r' really resembles the Forth return stack. The '>R' and 'R>' operations could be r (a b) -L-> (r a) b (r a) b -R-> r (a b) ( However, in the s-expression encoding of Forth I would use the version which builds stacks as right-leaning trees. ) Again the questions: 1) minimal set? 2) all permutations? Approach: My hunch is that it's not necessary to provide an infinite set of primitives: I think you can build level 2 from the connection between 0 and 1. Then when that works, do we have all? ... Antti[4] points out I'm looking for Thompson's group[2]. Follow through on [3]. 
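A cheap way to check identities like the ones used above is to model finite trees as nested pairs and apply the transformations directly. A small Python sketch (postfix application order as in the tS notation, so [S, R, S, R, S] means apply S first; this only checks the algebra on finite examples and says nothing about the infinite-tree group):

  def S(t):                      # a b S = b a
      a, b = t
      return (b, a)

  def L(t):                      # a (b c) L = (a b) c
      a, (b, c) = t
      return ((a, b), c)

  def R(t):                      # (a b) c R = a (b c)
      (a, b), c = t
      return (a, (b, c))

  def apply_ops(t, ops):
      for op in ops:
          t = op(t)
      return t

  t = ('a', ('b', 'c'))
  assert apply_ops(t, [S, R, S, R, S]) == L(t)                          # L = SRSRS
  assert apply_ops(t, [S, R] * 3) == t                                  # (SR)^3 = I
  assert apply_ops((('a', 'b'), 'c'), [S, L] * 3) == (('a', 'b'), 'c')  # (SL)^3 = I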
[1] entry://../libprim/20100814-114746 [2] http://en.wikipedia.org/wiki/Thompson_groups [3] http://www.math.binghamton.edu/matt/thompson/cfp.pdf [4] http://ndirty.cute.fi/~karttu/matikka/stebrota.htm Entry: Compensating small room low-frequency acoustic modes Date: Mon Nov 22 11:34:52 EST 2010 Problem: I have a ringing room mode at 132Hz, right where I'm sitting. Ethan Winer[1] mentions on Gearslutz[2] that it doesn't make much sense to compensate for narrow (high-Q, long-ringing) room modes with EQ. First: trying to EQ out room modes works only for a small sweet spot. I can live with that. But as he mentions, it doesn't remove the ringing and it can't compensate for zeros. [1] http://www.ethanwiner.com/acoustics.html [2] http://www.gearslutz.com Entry: Pink noise Date: Tue Nov 23 18:04:42 EST 2010 Where does pink noise come from? The Wikipedia page[1] says ``There are no simple mathematical models to create pink noise.'' In general power laws[2] indicate scale invariance. However, 1/f noise is in between white noise and its integral (Brownian noise). [1] http://en.wikipedia.org/wiki/Pink_noise [2] http://en.wikipedia.org/wiki/Scaling_law Entry: Impedance and Composite Circuits (What is a parallel impedance geometrically?) Date: Tue Nov 23 19:28:39 EST 2010 Type: tex I'm trying to revive some intuition for reading analog circuits, and what strikes me now in a very clear way is that the eye to use for electronics is not the eye for dimensionless signals but the eye for impedances. Dimensionless signals are nice to work with mathematically, and in most DSP work these numbers are all you ever see. The basic building block is the unit delay, or the integrator in the continuous case. One input and one output, all very neat. However, the low-level basic building blocks of electronics are \emph{impedances}, relations between the voltage $V$ across and the current $I$ through an edge representing a primitive circuit element or a composite circuit. For any element, the impedance is defined as the ratio between voltage and current: $Z = V / I$. The conductance is the inverse $S = I / V$. \begin{itemize} \item There are only 3 linear circuit elements \begin{itemize} \item resistor $V = R I$ \item capacitor $C \, dV = I \, dt$ \item inductor $L \, dI = V \, dt$ \end{itemize} \item There are only 2 ways to compose them: \begin{itemize} \item parallel, summing $I$, $S$ \item series, summing $V$, $Z$ \end{itemize} \end{itemize} When it is understood that the currents and voltages in a circuit have a sinusoidal form, impedances can be represented by complex numbers, which are 2D vectors with added multiplication and division (inverse). This is because the primitive impedances behave as integrators or differentiators and thus leave the sinusoidal shape intact, changing only phase and amplitude. The interesting question is then: while investigating a circuit, is it simpler to switch between variables V and I when trying to understand circuits, or should one build an intuition for the transformed summing relation (invert--sum--invert) $$A \| B = (A^{-1} + B^{-1})^{-1}$$ One way to try to go the second route is to see how this works geometrically. It's simplest to take one of the two values as a reference point, working with $C = B/A$. $$1 \| C = (1 + C^{-1})^{-1}$$ The inverse of a complex number $C$ is expressed by $\frac{\bar{C}}{\|C\|^2}$. I've tried these on paper: conjugate (flip around 1), invert (magnitude), add, conjugate, invert.
This doesn't give me much understanding, as the addition is still done in the other domain, so I move from impedance to conductance or vice versa. However, the conjugates are not necessary as they cancel out, so simple magnitude inversion is enough. In a first approximation the resultant most resembles the smaller of the two, and in a second approximation the resultant is in between the two, and smaller than both. Entry: Quantization and number theory Date: Mon Dec 27 11:05:15 EST 2010 This subject keeps coming back really. Context: Digital control of analog synthesizers. This pops up in an electronics project I'm working on. The idea is to build a multi-channel DAC for control signals (say up to 200Hz), using a cheap microcontroller. The main design idea is that approximation error is OK, as long as it averages to 0 and it doesn't correlate with the signal or with itself. This means ordinary PWM doesn't really cut it as it has very strong periodic content. Dithered Sigma-Delta seems to be the proper tool. Next to digital control, I'd also like to explore pure digital combination of binary modulated signals using logic gates. See previous entries, i.e. [5]. The key idea here is to make sure there is no correlation. However, a remark in [4], section "Multiplying 2 1-bit signals and getting a noise-shaped result", makes me think I'm missing some key point when multiplying: high frequency modulation noise being modulated down. Recently I found a link to a seemingly interesting book [1] about Analytic Number Theory here [2], and a 2005 workshop[3][4][6] dealing with the subject and some more from Robert Adams[7]. The last one probably deserves a separate post[8]. [1] http://148.202.11.158/ebooks/mathbooks/Number%20theory/Analytic%20Number%20Theory%20-%20Newman%20D.J..pdf [2] http://rjlipton.wordpress.com/2010/12/26/unexpected-connections-in-mathematics/ [3] http://www.cscamm.umd.edu/programs/ocq05/ [4] http://www.cscamm.umd.edu/programs/ocq05/adams/adams_ocq05.htm [5] entry://20100218-104438 [6] http://www.cscamm.umd.edu/programs/ocq05/wolfe_ocq05.htm [7] http://www.netsoc.tcd.ie/~fastnet/cd_paper/ICASSP/ICASSP_2005/pdfs/0400077.pdf Entry: A Signal-processing interpretation of the Riemann Zeta Function Date: Mon Dec 27 11:33:06 EST 2010 Just a link[1]. Interesting stuff. Basic idea: a log delay network (LDN) is a system with delays at the logarithms of the natural numbers. These networks have closed composition (LDN . LDN \in LDN). This is then used to construct an infinite series network which has supposedly provable stability, from which we can conclude that the zeros of the eta/zeta function have to be on the critical line. So where is the flaw that prevents this from being used as a proof of the Riemann Hypothesis? [1] http://www.netsoc.tcd.ie/~fastnet/cd_paper/ICASSP/ICASSP_2005/pdfs/0400077.pdf [2] http://www.cscamm.umd.edu/programs/ocq05/adams/adams_ocq05.htm [3] md5://31aab89af3f503238b57126a5d268639 Entry: Music: scales and intervals: almost equal composite numbers Date: Sun Jan 2 13:05:26 EST 2011 The fact that there are different tuning systems for the chromatic scale indicates that there is something quite wrong here. The chromatic scale is a happy accident because our brain glosses over the differences between intervals that are close but quite different. There are many examples. Take the minor 7th[1] for example. It corresponds to either 16 / 9 = 1.777... or 9 / 5 = 1.800... The two harmonic intervals have nothing to do with each other, but melodically they are very close, about 1%. (A minor step is about 5%).
It seems that music relies on this kind of gloss-over substitution when tones take on different relations in chords and melodies. I've always found this to be one of the weirdest things about the theory behind music. I've never seen it being mentioned so explicitly either, only in the context of tuning systems where it is mentioned as a nuisance. I think Bach even called it a Devine Joke or something (TODO: find quote). In short, it seems as if we can "teleport" between ratios that are related by near-1 fractions. I.e. for the minor 7th above we have a difference of 16 * 5 / 9^2 = 80 / 81 In general this phenomenon is recognized as a comma[3]. The ratio above is called the syntonic comma[2]. So one, how to enumerate the commas? Since the intervals in music all use limited prime ratios (i.e. up to 7), and limited amount of octaves, there is a limit to the amount of mistune one can get. And second, I wonder if this can be turned around. Given a chord written in "human" chromatic notation glossing over commas, is it possible to "distill" its different harmonic meanings by finding intervals that match in certain ways? I.e.: 16 / 9 = (4 / 3)^2 : two perfect fourths 9 / 5 = ... [1] http://en.wikipedia.org/wiki/Minor_seventh [2] http://en.wikipedia.org/wiki/Syntonic_comma [3] http://en.wikipedia.org/wiki/Comma_%28music%29 Entry: Zeta function Date: Sun Jan 2 18:10:24 EST 2011 Type: tex I've always found the multiplicative form of the zeta function $\zeta(s)$ quite intriguing, even if the explanation is straightforward\footnote{For the real proof: I forgot what the conditions are for rearranging terms in infinite sums. I believe it's allowed for absolute convergent series, which is the case there since all terms are positive. Note that the region of convergence is $\text{Re}(s) > 1$, so any statements made about this expansion and its multiplicative form are only valid for that particular region!}. With $N$ the naturals and $P$ the primes we have $$\zeta(s) = 1 + 2^{-s} + 3^{-s} + \ldots = \sum_{n \in N} n^{-s} = \prod_{p \in P} \frac{1}{1 - p^{-s}}.$$ This is because $$\frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \dots$$ and if you multiply them out, each natural number will be due to exactly one combination of the terms of the different factors of the product. This is an explicit way of saying that each natural number can be written in exactly one way as a product of prime powers. The interesting trick to note here is that operations on infinite sums are used to represent \emph{quantification} over a domain, i.e. for all integers. %[1] http://en.wikipedia.org/wiki/Riemann_zeta_function Entry: Encoding pairs of numbers Date: Sun Jan 2 18:32:00 EST 2011 Type: tex To encode a pair of two integers, construct a discreate space filling curve. Let's take a look at the single quadrant triangle constructed from successive antidiagonals. $$ \begin{array}{cccc} 0 & 1 & 3 & 6 \\ 2 & 4 & 7 \\ 5 & 8 & \\ 9 & \\ \end{array} $$ The top row are the partial sums of the the natural numbers: $n(n+1)/2$. If $(x,y)$ is the number to be encoded, you first find the diagonal is associated to by picking its top row element $(x+y,0)$, and then move down to the desired spot, so the encoding is $$n = y + \frac{1}{2}(x+y)(x+y+1).$$ Given $n$, the inverse can be computed by solving a quadratic equation $$\frac{1}{2}s(s+1) = n$$ for $s$ and rounding down to the nearest integer $s_r$. This is the first coordinate of the top row element of the diagonal the element is on. 
Whe then just need to move down the diagonal by by $y = n - s_r$, and so $x = s_r - y$. Entry: Encoding finite sequences of numbers Date: Sun Jan 2 18:58:23 EST 2011 Type: tex Primes can be used to to represent a finite sequence of integers of arbitrary length as a single number. For a sequence $x_1,\ldots,x_n$, construct the number $\prod_n p_n^{x_n}$ where $p_n$ is the $n$th prime number. The operation can be reversed by factoring the number. Is there a simpler way? Entry: The 7 divisions of the string. Date: Sun Jan 2 20:48:09 EST 2011 From Antti[1]. This is a simple algorithm for constructing musical interval ratios, starting with small numbers. We stop when the denominator corresponds to a regular polygon not constructable by ruler and compass. Now that sounds quite magical, but what does it mean? I think the non-constructable polygons (by ruler and compass) are the same as the n-th order polynomials factorizable by radicals. This problem is really about symmetry / discrete groups. The question is then: is this a coincidence, i.e. simply a question of where the brain decides ``well, this is enough beating to be called stable sound'' or does it really have something to do with the symmetry behind the non-constructable polygons. [1] http://ndirty.cute.fi/~karttu/Kepler/a086592.htm [2] http://ndirty.cute.fi/~karttu/Kepler/HarMun.txt Entry: Tim Stilson PhD Introduction Date: Tue Jan 4 04:55:22 EST 2011 Looks like he finished his PhD in 2006[1]. His work is mostly not about oversampling and non-linearities, but about non-oversampled techniques. ( Though I doubt BLIT and BLEP can be considered non-oversampled, as they are essentially oversampled lowpass. ) Anyways, in the section about discretization of continuous filters he mentions the Delta Operator on page 33, which also appeared in the book "Finite Difference Equations"[2] I read the first part of recently. The idea is to use an integrator/differentiator analogy to note discrete systems instead of the unit delay as this is more elegant and more closely resembles the analog case. This then gives the difference calculus, a set of rules to manipulate formulas with deltas, as exposed in [2]. The advantage of this encoding is that it isn't plagued by numerical issues for highly oversampled systems, i.e. Sigma-Delta modulators. There should also be a link to automatic differentiation somewhere... I.e. using memoized / butterfly-style networks instead of multiplied-out direct formula resembling "convoluted" bell/binomial shaped coefficients as Tim mentions. I need to explore other references too.. Goldmine. Dana Massie: EQ, Harvey Thornburg: NL moog. Antti Huovilainen: moog circuit model. [1] https://ccrma.stanford.edu/~stilti/papers/Welcome.html [2] isbn://0486672603 Entry: Difference Calculus Date: Tue Jan 4 11:09:54 EST 2011 Following up on the references from Tim Stilson's PhD[2] thesis. Already have the book "Finite Difference Equations"[1] from 1961 by Levy and Lessman. ( I found the original in a used book store without ISBN, [1] is the Dover reissue. ) The references Tim mentions are [3][4][5][6][7][8] (books and paywalled papers). I found one accessible paper on the subject[9]. For x[k] a real valued sequence |N - > |R, we define the operator z: z x[k] = x[k+1] Then we can define d = z - 1 or d x[k] = x[k+1] - x[k] These are normalized equations with time step equal to 1. Naively relating d to the s-plane requires s/f_s normalization. 
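As a side note on why anyone would bother: the z form and the d form of the same one-pole are algebraically identical, but they store a different number. A throwaway sketch (Python/numpy; the pole position 1e-4 and the use of float16 are just made-up values to make the effect visible):

  import numpy as np

  eps = 1e-4                       # d-plane pole at -eps  <=>  z-plane pole at 1 - eps
  a_z   = np.float16(1.0 - eps)    # z form stores the coefficient 1 - eps
  eps_d = np.float16(eps)          # d form (leaky integrator) stores eps itself

  # z form:  y[k+1] = a_z * y[k] + (1 - a_z) * x[k]
  # d form:  y[k+1] = y[k] + eps_d * (x[k] - y[k])
  print(1.0 - float(a_z))          # 0.0: the pole has collapsed onto z = 1
  print(float(eps_d))              # ~1e-4: the pole position survives quantization

Same filter, but the low-precision z form has lost the pole entirely, which is exactly the subtraction-near-1 issue discussed below.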
Before getting into cargo cult DSP math, one needs to keep in mind that: * The d and z transforms of a discrete filter contain the same information: they are merely described in terms of different primitives: the unit integrator (accumulator) vs. the unit delay. * In the limit case of infinite sampling rate filter coefficients of d-plane and s-plane coincide. However, naively converting between the discrete and continuous case for finite sample rates is still wrong! In fact, naive conversion is the forward difference transform wich maps s -> z - 1, and which has known stability issues for coefficients away from z = 1. More general, all the caveats concerning traditional filter discretizations still apply. One can derive reformulated versions of the well known transforms: forward/backward difference, bilinear, impulse response invariance (pole mapping) and pole-zero mapping. These then look like naive (f_s scaled) direct s -> d mappings + some correction terms O(s / f_s). The main use seems to be one of implementation: coefficient sensitivity will be less in d-plane formulation for low-frequency poles and zeros. Of course one can start by simply substituting z = d + 1 in a fully expanded z-plane formulation, but that is not very insightful. Instead we do the following: * Factor the z-plane formulation first as a product of first order terms (z-z_k), including complementary complex ones. Performing the substitution in this form yields terms (d + 1 - z_k). We name d_k = 1 - z_k. For low-frequency phenomena it happens that the z_k are all close to one meaning the d_k are small in magnitude. So essentially, a formulation in d-plane form eliminates a subtraction that is the cause of loss of precision in the z-plane formulation. * 1st order IIR sections in the form of b/(d-a) can be implemented directly using leaky integrators[11]. A state variable 2nd order section can be used to implement the complex conjugate 1st order sections. For IIR filter design (when everything is analytic) there is little practical difference between z-plane and d-plane approach: simply transform (analytically) when you're done. For FIR filter design which is mostly numerical in nature, the d-plane approach seems also useful for avoiding numerical instabilities. I.e. to formulate the optimization problem in such a way that the FIR filter appears in factored form, instead of multiplied out. ( From my own experience I've found that FIR filter design problems involving low frequencies are seriously plagued by convergence issues due to insufficient numerical precision. I recall that using a spectral factorization method to design an FFT window filter I would obtain results that where clearly wrong after visual inspection, i.e. one coefficient value "sticking out" while there should be overall symmetry. It would surprise me if this hasn't been realized long before. Look into ladder FIR filters. 
) [1] isbn://0486672603 [2] https://ccrma.stanford.edu/~stilti/papers/Welcome.html [3] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1104162 [4] http://md1.csa.com/partners/viewrecord.php?requester=gs&collection=TRD&recid=2270973CI&q=&uid=790228084&setcookie=yes [5] isbn://9780817639341 [6] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=0123294 [7] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=73564 [8] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=219661 [9] http://sites.google.com/site/jamesgibsonhomepage/projects/DeltaOperatorCaseStudy.pdf [11] entry://20110104-124734 Entry: Practical d-plane formulations Date: Tue Jan 4 12:47:34 EST 2011 As a continuation of [1]. Given a d-plane formulation, how to implement it? It's simplest to start from the leaky discrete integrator and see what its d-plane formulation looks like. The time domain formulation y[k+1] - y[k] = a y[k] + b x[k], i.e. y[k+1] += ..., expresses the update of y in terms of the weighted magnitude of y and x. It corresponds to the frequency domain formulation d y = a y + b x, or (d - a) y = b x, giving y / x = b / (d - a). The non-leaky integrator is then b / d. Complex poles are handled in exactly the same way: we take the complex form of one of the complementary pairs, feed it a real input and take only the real part of the integrator as output. TODO: - Make the 2nd order implementation explicit. Handwaving might be wrong. - What about d-zeros? [1] entry://20110104-110954 Entry: Lattice and Ladder filters Date: Tue Jan 4 14:02:24 EST 2011 Let's get some terminology ironed out. Mostly in relation to [2][3]. Structurally, a d-plane formulation can be associated with a ladder/lattice topology, and their good numeric properties are well known. They also have interesting mathematical properties (see linear prediction and orthogonal functions on the unit circle [10], and further work in generalized Schur and displacement rank algorithms for Toeplitz-structured matrix problems). [1] http://en.wikipedia.org/wiki/Lattice_filter [2] entry://20110104-110954 [3] entry://20110104-124734 [10] http://www.emis.de/journals/BBMS/Bulletin/bul941/BULTHEEL.PDF Entry: Nonlinear digital Moog VCF + bandlimited sawtooth. Date: Tue Jan 4 15:31:08 EST 2011 Also mentioned in Stilson's PhD[2] is Antti Huovilainen: Nonlinear digital implementation of the Moog ladder filter[1][3] ([5][4]). Interesting derivation, but not such a surprising result. However, the piecewise parabola is a nice trick! So I wonder if this can be taken further one order by representing the remaining "bounce" discontinuity by a 2nd order "brake" discontinuity and differentiating twice, i.e. by using piecewise 3rd order polynomials: (x-1) x (x+1) Differentiating once gives the piecewise parabola, differentiating again gives the sawtooth. The 3rd order polynomial's discontinuity is smoother so should roll off faster (18dB/octave). It shouldn't be too hard to implement in Pd either.. Indeed: differentiated parabolic (DPW) [6] twice differentiated cubic (2DCW) [7] Looks like the obvious extensions are already published[8]. V. Välimäki, J. Nam, J. O. Smith, and J. S. Abel, "Alias-suppressed oscillators based on differentiated polynomial waveforms," IEEE Trans. Audio, Speech, Language Processing, vol. 18, no. 4, pp. 786–798, May 2010.
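For reference, the first-order DPW trick of [6] is only a few lines. A sketch (Python/numpy; f0, fs and the gain fs/(4*f0) are my choices, the gain following from the naive saw having slope 2*f0, so that d/dt(saw^2) = 4*f0*saw):

  import numpy as np

  fs = 48000.0                       # sample rate (arbitrary test value)
  f0 = 1000.0                        # oscillator frequency
  n  = np.arange(48000)

  phase = (f0 * n / fs) % 1.0        # naive phase in [0, 1)
  saw   = 2.0 * phase - 1.0          # trivial (aliased) sawtooth in [-1, 1]
  para  = saw * saw                  # piecewise parabolic waveform
  dpw   = np.diff(para) * fs / (4.0 * f0)   # differentiate and rescale back to ~[-1, 1]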
[1] http://dafx04.na.infn.it/WebProc/Proc/P_061.pdf [2] https://ccrma.stanford.edu/~stilti/papers/Welcome.html [3] http://www.mitpressjournals.org/doi/abs/10.1162/comj.2006.30.2.19 [4] md5://81ef26b98b858bfc7ea351850b7f8872 [5] md5://ec26bdee832237793c6de78875942a60 [6] http://zwizwa.be/darcs/pd/abstractions/saw2~.pd [7] http://zwizwa.be/darcs/pd/abstractions/saw3~.pd [8] http://ieeexplore.ieee.org/Xplore/login.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F10376%2F5446581%2F05153306.pdf%3Farnumber%3D5153306&authDecision=-203 Entry: 2nd order 4-multiply ladder EQ Date: Wed Jan 5 07:06:37 EST 2011 Also mentioned in Stilson's PhD[2] is Dana Massie's work on 2nd order ladder EQ with separate frequency and Q controls[1]. [1] http://www.aes.org/e-lib/browse.cfm?elib=6994 [2] https://ccrma.stanford.edu/~stilti/papers/Welcome.html Entry: Model based control Date: Thu Jan 6 13:06:47 EST 2011 I think I just re-invented MBC: - Given a (non-linear) fixed model of the control->actuation transfer, perform feedforward or feedback control updates at high frequency. - Update model parameters at lower frequencies. Entry: Delta Operator Date: Thu Jan 6 14:28:24 EST 2011 Called the divided difference, as it's defined as d x[k] = (x[k+1] - x[k]) / T I.e. the differences are taken relative to the sampling period T. In most cases where T is fixed this can be normalized to T=1 by moving the scaling to the coefficient end, i.e. coefficients will become small when T approaches 0. However, in the exposition of [1] explicit mention of T is desired, to be able to take the limiting case T->0 to make the bridge to the continuous case, and to handle cases with different T. See also fixed step integration[2], Time Scale Calculus[3]. [1] md5://3e41deebd80a5b988de09b645c036e57 [2] http://en.wikipedia.org/wiki/Eulers_method [3] http://en.wikipedia.org/wiki/Time_scale_calculus Entry: Matrix multiplication is quadratic? Date: Mon Jan 10 00:10:48 EST 2011 Basic idea: Matrix multiplication can be related to group multiplication, which can then be implemented by the Fourier transform. This is freaky stuff. I think this paper applies to finite field matrices though. Does this also work for the reals, or in approximation for floating point numbers? Does it mean that in general, _all_ matrix multiplication problems can be expressed as convolution problems? I.e. the generalized Levinson / Schur style algorithms already exploit this in some sense.. Are all matrices "structured"? [1] http://arxiv.org/abs/math.GR/0511460 Entry: Dithering and "FAT" saw stacks Date: Mon Jan 10 11:43:47 EST 2011 The idea: a stack of sawtooth oscillators sounds "FAT" when it's from an analog synth. I don't have one on hand here, but is this due to subtle frequency changes? Using the parabolic/cubic sawtooth generator in Pd I get a rich sound, but the very repetitive intermodulation is a bit disturbing. ( Problem however, I do not know if I'm just paying attention to something I wouldn't normally notice. Unfortunately I do not have an analog sawtooth oscillator handy. ) However, the idea in general is quite interesting. This is related to: - dithering in sigma-delta - peak limiting in spread-spectrum pulses and ADSL[1] - eliminating room resonance modes in acoustics (flattening) - same for artificial reverberators (i.e. FDN) Ultimately this is about integers, i.e. "almost equal" numbers. But maybe that's not what I'm looking for. Could this be an issue of coupling and chaos? I.e.
if multiple saws go thorugh a discontinuity close together, could there be some secondary cause for them to spread apart in frequency? I.e. modulations in a string do just that: peaks increase the tension and thus temporarily disturb the linear order. [1] md5://9b2913693c005de2515e3efeb8548e47 Entry: Z-transform and generating functions Date: Tue Jan 11 15:58:12 EST 2011 Who chose that damn negative exponent in the Z-transform[1]? For expressing difference and update equations in state space form it is simpler to have "z" mean _index increment_ or feedforward in time, and "d" defined as d = z - 1. This makes update equations and difference equations directly expressible: x_{k+1} = a x_k => z x = a x x_{k+1} - x_k = b x_k => d x = b x In state space form, the relation between the z and d formulation is also pretty clear. For systems where the dynamics is slow compared to the sampling rate, the state space formulation in difference form also makes more sense: it has better numerical stability and corresponds to fixed-step[3] integration of a continuous differential equation. Flipping the exponent around makes the Z-transform equal to the relation that expresses a generating function[2]. The inverse Z-transform then corresponds to the line integral expression for power series coefficients in the expansion of an analytical function around z=0. So, picking a non-(engineering-)standard notation means we need to take care when just copying expressions out of books and papers. However from context it is usually quite clear which is which. The most important differences for the positive-exponent Z-transform are: * Stability: convergence of a power series of an analytical function (representing a signal or transfer function) works for |z| < R where R is the location of the pole closest to 0. * Stable transfer functions need to have all poles _outside_ the unit circle. This is just the reverse of the flipped-sign Z-transform. Of course, a similar point could be made for the Laplace transform (LT). Why does it have a negative exponent? Same for the Fourier transform (FT). For the FT it seems to make sense to me, as the reconstruction is in terms of positive exponents, which somehow makes a more intuitive appeal. Anyways, it really doesn't matter that much -- just convention and something to nag about... [1] http://en.wikipedia.org/wiki/Z_transform [2] http://en.wikipedia.org/wiki/Generating_function [3] http://en.wikipedia.org/wiki/Euler_method Entry: Factored FIR Date: Fri Jan 14 15:44:14 EST 2011 Q: Are there any FIR filter design algorithms that explicitly rely on the equation to be in factored-out form, i.e. as a product of 2nd order sections? I suppose this introduces the problem of having to specify whether FIR zeros are complex or not. I seem to remember them mostly being complex, exept for odd orders where there would be one zero. Q: Is there a reason why FIR filters have "mostly complex" zeros? Is this equivalent to asking why their inverses the IIR all-pole filters have "mostly complex" poles? (Because that gives more locality.) Entry: Initial Algebra / Final Algebra Date: Fri Jan 21 12:01:08 EST 2011 This terminology is used in [1]. The concrete interpretation for me is that "final" means modeling in terms of functions while "initial" means modeling in terms of abstract data types. Other than that the terminology is obscure to me. Googling brings me to this[2]. So it's in the field of semantics. 
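The way I read it, in a toy setting (Python as a stand-in, so none of the typing that [1] is actually about): an initial encoding makes expressions data and interpreters functions over that data; a final encoding makes an expression a function that is handed whatever interpretation it runs under.

  # Initial encoding: abstract syntax as data, plus an interpreter over it.
  def eval_ast(e):
      tag = e[0]
      if tag == 'lit':
          return e[1]
      if tag == 'add':
          return eval_ast(e[1]) + eval_ast(e[2])

  ast = ('add', ('lit', 1), ('lit', 2))

  # Final encoding: an expression is a function of the semantics it runs in.
  def expr(sem):
      return sem['add'](sem['lit'](1), sem['lit'](2))

  eval_sem = {'lit': lambda n: n, 'add': lambda a, b: a + b}
  show_sem = {'lit': str,         'add': lambda a, b: '(%s + %s)' % (a, b)}

  assert eval_ast(ast) == 3
  assert expr(eval_sem) == 3
  assert expr(show_sem) == '(1 + 2)'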
[1] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf [2] http://homepages.feis.herts.ac.uk/~comqejb/algspec/node12.html Entry: Squirrels & robot control Date: Sun Feb 20 12:38:52 EST 2011 Squirrels use very fast balancing corrections. Why is this? Some possibly relevant issues: - Fast corrections (high acceleration) are relatively independent of gravity. I.e. this is a "pulse compensation" as compared to a continuous control process. - They have litte mass m, which means that they don't need much force F to get to high acceleration a = F/m. - Fast pulse compensation is also present in eye movements. Apparently for the eye, tracking is more difficult than focusing on the same point for a longer period of time. Entry: Fuzzy class D multi channel amp Date: Wed Mar 2 12:33:07 EST 2011 Problem 1: I want a multi-output class D amp for synth control signals. BW = 200Hz is a must, anything higher is welcome. Problem 2: I want to use a cheap, off-the-shelf part (a PIC uC). In general, you'd want to solve the problem in hardware (i.e. PLC or FPGA) for one channel independently, and then duplicate it n times. However, the rates involved in my problem are quite low, so it's probably possible to use a cheaper uC-based solution. How to go about making that optimal? To do it on a PIC requires some cleverness, as flipping multiple bits at the exact calculated time will not work well because the PIC needs to do the necessary computations serially. Taking a strict time multplexing approach is probably not optimal, as in most cases no action needs to be taken, i.e. no toggle performed, so we might as well not have done the computation in the first place. So I wonder how to go about trying not to perform unnecessary work and using a more optimal multiplexing scheme. I.e. something like: - find a fast way to guess the next output that needs a toggle - verify the guess by performing a computation - perform the toggle (or not!) - record the actual time of the toggle so it can be incorporated in the next control computation The hard problem seems to be the first one: how to build a system that knows which current error is the largest, so requires immediate toggle action? If the guesses are accurate, there won't be many computations that are for nothing, i.e. that decide to not toggle because it's too soon. I'd say this is a data structure problem. Find a way to keep a sorted list of current errors. Then simply pick the top one to update. Entry: Sound impedance Date: Tue Mar 15 23:49:18 EDT 2011 If I get this right: speakers are cone shaped. At the narrow end, velocity is high but pressure is low. At the wide end it's reversed. Widening enables the same amount of mass to move with less velocity. WRONG! According to [1] it's just the other way around! I think it's time for me to pick up Feynman's lectures again.. [1] http://en.wikipedia.org/wiki/Horn_loudspeaker Entry: I don't know sound: 1. Pressure Date: Wed Mar 16 00:04:01 EDT 2011 I think I don't understand pressure. It's a statistical measure that represents force per unit area in a collection of moving point-masses. Note that it is really the force that is averaged: Take a very small piece of area such that it gets hit infrequently. The force necessary to keep it in place has a pulse-like nature due to collisions: force is high when we're "pushing back" on an incoming molecule, and zero when there is no contact. 
Therefore it is simpler to go up and down the integral stairs once: don't look at force but at impulse transfer per collision (integrate) and then look at the number of collisions per time unit (differentiate). From this perspective, why is it that a speaker driver (i.e. a flat surface moving in a direction perpendicular to its surface plane) has high impedance, meaning it produces mostly pressure and no displacement? Entry: Power spectrum as power of derivative of image Date: Thu May 5 08:36:15 EDT 2011 See [1] -> Listening to your webcam. I was wondering, since there is such a "naturally pleasing" map between the derivative of an intensity map and an audio spectrum, are there more such parallels to make? [1] http://homepages.kcbbs.gen.nz/tonyg/projects/tangents.html Entry: Queued Class-D DAC Date: Sun May 8 08:57:30 EDT 2011 Type: tex \def\us{\mu\text{s}} The objective is to use a cheap microcontroller to implement an $N$-channel pulse--modulated DAC, as opposed to using parallel programmable hardware, duplicating a $1$-channel DAC $N$ times. The following is a list of remarks about how to implement this functionality on a cheap microcontroller. \begin{itemize} \item The software switching performance of a typical uC like a PIC18F is about $10$ instructions or $1\us$, where each time slot reads and updates a state register and an output pin. \item The DAC is going to be used to modulate amplitude and frequency of audio signals. This places an extra constraint on the conversion error: its decorrelation is favoured over its inaudibility. \item A straightforward approach is to multiplex such time slots on a rigid grid, obtaining a switching period of $NT$ where $N$ is the number of channels. \item Depending on the modulation algorithm used, it might be possible to optimize this scheme. Instead of computing each DAC's state update at each tick period, it might be possible to just compute the time to next switch, and then queue these operations. If the scheduling problem can be solved exactly, this can bring the effective resolution from $NT$ in the dumb multiplexed case to $T$ in the queued case. \item If the scheduling problem can not be solved exactly, it might be possible to measure the switching error and feed it back into the state update. Some global strategy is probably necessary to make sure these errors are distributed over several DACs, and sufficiencly decorrelated in the case of signals that modulate audio. \item It might be possible to favour the $50\%$ duty cycle, and only encode differences with respect to this default. I.e. the signal $101011010$ would be represented by $000001111$ after demodulation by the default $2T$ square wave $10101010$. It would be interesting to find out if such an approach is beneficial for eliminating correlated noise ripples, or if it just shifts the problem. \end{itemize} Entry: Code generation and quasi-particles Date: Sun Jun 12 00:04:03 CEST 2011 (mumbo-jumbo) I've had it with this borking real-world stuff. Time for some seriously wacky ideas. What I'm trying to do in libprim is to make a compiler and VM at the same time. In practice, glossing over this means to take the simplest representation of that structure (an interpreter), i.e. represented as the composition A B C, and inserting "virtual particles" (which are operations and their inverses) at the boundaries of the VM and compiler. I.e. A B x x' C, where A B x is the VM and x' C is the compiler. That's the high level mumbo jumbo. Now, can this be done in practice? 
How to express code specialization in an explicit algebraic way, i.e. as symmetry transforms? From afar the theory seems sound: you take an element of a group which represents a function, and you find a representation of that element such that it can be split in two: a compile time and a run time element. Then use your understanding of the symmetry of that group to make automatic search a little bit feasible.. Entry: Programs and specifications Date: Sun Jun 12 15:22:28 CEST 2011 (mumbo-jumbo) The trouble is that representation needs to be as abstract as possible. This seems to be the hard part as very often it remains implicit. I.e. programmer thinks "Ah, let's use this bit for that to quickly hack around doing these things separately". That's exactly the implementation information that one wants to capture, and has an incredibly large amount of detail if made explicit. This then also indicates why this process is not factored in the following particular way: specification S . specializer (S->I) = implementation (I) but by writing the implementation I separately, and providing a proof that I embeds S. Q: What is more difficult, to provide an indirect proof that S is isomorphic to a subpart of I, or to construct a direct (parameterizable?) embedding function S->I? Putting it like that it seems that the former is simpler, as it allows the use of non-constructive proof techniques. However, making the S->I embedder parameterizable might give a whole family of implementations that do not need separate proofs. Entry: A Primer of Infinitesimal Analysis Date: Thu Jul 7 23:28:09 CEST 2011 A nice one to stack next to dual numbers[3] and automatic differentiation. Link from [2], which is quite an interesting post in itself. ``Smooth space-time, say `R^4`, allows only smooth motion and smooth distribution of mass. If we place non-smooth mass in it, the space will change to a _subset_ of `R^4` which carries additional information about the anomalies contained in it.'' I started reading [1][5] and I must say it's quite interesting to reason with non-non-existing entities ;) In the book they are likened to virtual particles: something that's not really present in end results, but useful as a tool in calculations. [1] isbn://0521624010 [2] http://math.andrej.com/2008/08/13/intuitionistic-mathematics-for-physics/ [3] http://en.wikipedia.org/wiki/Dual_numbers [4] http://en.wikipedia.org/wiki/Smooth_infinitesimal_analysis [5] md5://3f3da3bbc8da4d430677acc78cb9c7c3 Entry: Getting that book feel on an e-reader Date: Tue Jul 12 22:00:30 CEST 2011 If the font is too thin, e-readers or screens don't give a very good feel. I.e. math books and papers in computer modern are really horrible. I'd like to emulate the feel of blurry ink on paper. This seems to work well with a 600dpi PDF with radius 6 gaussian blur (gimp's units) and a exponential looking J-shaped intensity curve. Before I tried this with a pixel-growin algorithm too which gives similar results[1]. [1] entry://20090511-181107 Entry: Human motion Date: Sun Oct 30 17:44:29 EDT 2011 Plot derivative of a moving laser dot to see the small "pulse corrections" and make a model for that. Entry: Time varying filters Date: Mon Oct 31 00:16:32 EDT 2011 What makes musical filters musical is their time-variant nature, their speech or phrasing. Filters by themselves are quite cheap, though controlling them, making them time variant to make an interesting texture is not at all a simple problem. 
It might be interesting to go into that problem a bit now that I (almost) have my code generation machinery running.. Entry: Logic Date: Sun Dec 18 16:42:08 EST 2011 For my EE-brain, "logic" is the logic of electronic circuits, which corresponds to propositional logic. The "logic" of type systems is at least first order logic. Quantification (things with holes) is necessary to express structure. Entry: Inference Date: Fri Dec 23 12:53:47 EST 2011 Is inference always generate-and-test? More specifically, I'm thinking of the task "design somthing that has these properties.". In general, there is no algorithm that goes from a description of the properties to the description of something that implements those properties (a logic and a model?). I'm inclined to think (in a vague an unrooted way) that any form of "algorithmic inference" is really just fixing parameters an already existing / well-defined family of models, not *ever* the construction of a model out of the blue. Why is this important? Compilation of specifications. Is it so that a model can be seen as a "constructive proof" of a set of properties? Is this kind of thinking formalized in some way? [1] http://en.wikipedia.org/wiki/Model_theory Entry: Distributivity vs. Commutation Date: Thu Dec 29 08:50:12 EST 2011 I'm still in the process of finding a good terminology for "type commutation" in Haskell, i.e. correspondences like this: a (b t) <-> b (a t) a (b t1) (b t2) <-> b (a t1 t2) and multi-arity variants of this, which seem to be more arbitrary due to the different ways the t_x can be permuted. Entry: Density of discrete entities Date: Sun Jan 1 17:04:12 EST 2012 There's something funny going on with the concept of density for a finite amount of particles. How is the "blurring" regularization for this to work usually handled? I guess that moving to a statistical interpretation solves this problem. The density is not a property of the particles, but one of the probability distrubution that is used as a model of the particles. I.e. model 10 particles by thinking of them as a (random) a sampling from a PDF, then talk about the PDF instead. Entry: Accounting Date: Mon Feb 13 13:27:37 EST 2012 How to use gnucash. Problem: I currently don't care about "proper" accounting (i.e. the accounting equation [1]). I just want a way to manage order reports (i.e Amazon, Ebay, ...) that can be obtained electronically, to categories of expenses (Office, Equipment, Transport, ...). I'm not sure if that makes any sense in a standard accounting approach. It's probably good to first learn how to do this properly. [1] http://en.wikipedia.org/wiki/Accounting_equation Entry: Mixers and integrators Date: Tue Jul 3 08:41:27 EDT 2012 Maybe it's time to try the Sigma/Delta synth again, using two building blocks: - the integrator = counter + S/D converter - the mixer = "random" bit stream multiplexer Where "random" means the switch waveform is not corrlated with the S/D "waveform". It would probably be enough to make it into a cheap LFSR. Before doing this on chip, it's probably a good idea to simulate it first in Haskell. After that it might be interesting to put it in an FPGA, or a low-frequency version in a PIC. Theoretically, it would be interesting to find a measure of how inefficient this actually is. The main tradeof is to use a use spectrum so it has an 1/f characteristic, and simplify the "mixer" to a multiplexer (assuming RNG is free). Why is an S/D dac/adc beneficial? It seems that it trades off speed vs. size, but not in a very efficient way. 
However, "speed is free". HW runs in MHz/GHz range, about 1000x more than ordinary audio sampling frequencies. Apart from time-multiplexing (i.e. sequential computer programs) there is no way to easily use these extra cycles. One thing I don't really understand is how to look at "information". It's said that the higher bands contain noise, but that's not really true.. it's because of the "noise" in the higher bands that the information in the lower bands can fit in the 1-bit dynamic range. It doesn't seem that these higher bands can actually contain information along side the low band signal. However, the encoding is very robust against random bit errors: they definitely introduce unwanted signals, but they do it in a gracefully degrading way. It would be interesting to compute the theoretical capacity[4] of a channel with the same noise envelope as a s/d and see how it relates to the digital bit rate. Not to forget, for signals close to the 0/1 voltage, a S/D acts as a voltage <-> frequency converter. Closer to 1/2 voltage this is the same, only modulated near to 1/2 the output frequency. In this mode is is clear that DC accuracy is actually infinite (infinite integration time) so to talk about "dynamic range" is always dependent on frequency. [1] entry://20090609-145519 [2] entry://20101227-110515 [3] entry://20100218-104438 [4] http://en.wikipedia.org/wiki/Channel_capacity Entry: Sigma/Delta retake Date: Wed Jul 4 13:44:39 EDT 2012 Let's start from scratch. Basically, there are two systems I'm interested in: the first order analog lowpass or integrator, and the digital integrator (i.e. the PIC carry flag hack). Some basic assumptions: - Reconstruction should be a simple (maximally flat) lowpass filter, meaning that the digital signal has a direct analog meaning. - Feedback and reconstruction filter are the same. I.e. the error signal is the effective error. - First iteration: work with 1st order LP to simplify math, this leaves one design parameter: filter cutoff. - Quantisation noise and signal are not correlated (which is wrong in practice!). - Negative feedback minimizes error with white spectral distribution (if signal and noise are not correlated). Let's get at the basic schematics. We're interested in the error of the reconstruction which is present after the negative feedback summation [i]. First iteration I was thinking to minimise the reconstruction error, which is o = (s - Fr). s-->--[-]-e-->-[Q]--->--r \--o--<-[F]-/ q | s-->--[-]-e-->-[+]--->--r \--o--<-[F]-/ where F is a lowpass filter, s is the input signal, r is the binary representation, e is the feedback error signal to be minimized, and o is the reconstructed output. The quantizer Q is modeled as a noise source q. However, this is not the same as the circuit I find here[1]. s-->--[-]-e-[F]->-[Q]--->--r \--------<----/ which is equivalent to s->-[F]--[-]-e->-[Q]--->--r \----<-[F]-/ This means that we're minimizing the error between the reconstruction (filtered representation) Fr and the filtered input Fs. This makes perfect sense: we can't reconstruct the high band of s, as that's where we're going to move the noise. In practice however, if F is already bandlimited such that s = Fs, the first schematic will also work. To see the shape of the noise component e in r, divide out F in the error equation: Fs - Fr = e s - r = e/F = e' since e is white, e' has the shape of 1/F. EDIT: The fist schematic is a delta modulator, and [1] uses integrators instead of lowpass filters. Explain that. 
[1] http://en.wikipedia.org/wiki/Delta-sigma_modulation Entry: Pulse counters as grey code? Date: Wed Jul 4 15:11:21 EDT 2012 Can pulse counters for implementing integrators be implemented as grey code counters? Does it even make sense at gate level? Entry: Usable spectrum in a Sigma/Delta signal Date: Wed Jul 4 15:42:53 EDT 2012 How does this[1] translate to the SNR of s? We have the shape of the noise, but not the maximal amplitude. I'm interested in SNR to compute the channel capacity, i.e. how much information can be encoded in the Fs signal as opposed to the raw bitstream r. Is there any capacity lost by this encoding? It seems that this is not an easy answer, as it involves assumptions to make the nonlinearity go away. What about this: - in r = s + e', the signal component s is negligible. - e' is highpass, which means it has no DC component, so the instantaneous energy is known, and constant: half of the bits are 1, half are 0. Setting 1/2 in the middle, the RMS is 0.5, power is 0.5^2. - given the shape of e' and the the total energy, the frequency-dependent energy can be computed. So it seems that as long as s doesn't have a DC component, it is straightforward to compute the absolute noise envelope, which is constant as long as s remains small wrt. e' If s does have a DC component, the linearization doesn't work. The more DC there is, the less "room" there is for the AC component. A better assumption would be to say that s is highpass (we still need that assumption to compute the total energy) but that it's cutoff is much lower than that of e'. Here the total energy (still 50% duty cycle) is distributed over s and e. As long as the cutoff of e' is much lower than nyquist, it seems that the first assumption (setting r ~ e') is sound. The 2nd assumption can still be used to make sure that s doesn't exceed the highest point of e'. What is surprising here is that the presence of a DC component shifts the dynamic range. Actually, the same happens in other amplitude- limited channels. So.. Given the assumptions above the instantaneous of the high pass signal (0.5^2) would approximate a white signal, meaning that the maximum of the PDF is 0.5^2/fs. This seems to be enough to approximate the signal capacity, using the part of e' that's below the maximum, keeping in mind that the DC part is not usable because the linearized channel properties depend on the DC component, but negligible for computing channel capacity. ( It would be interesting to express all those approximations exactly.. The exact formula is probably fairly complex. ) So, to top off all the approximations, let's say that the bandwidth is reduced by x, the oversampling factor, which leads to an increase of x in amplitude dynamic range. Plugging this into Shannon's formula[2] directly gives the asymptotic behaviour in terms of x C = B log (P / N) ~ B_0/x log ( P_0 * x^2 / N) which clearly shows that this is a fairly expensive technique when looking just at the information content of the channel. The reduction in capacity is log x ----- x (where we ignore the constant power x^2) which corresponds to the intuition that we're using bits to represent individual events E_i, and not "sums of events" which only needs bits in the order of log (sum E_i). This high redundancy makes it plausible to believe that the effect of bit errors is minor. It seems to indicate that the "shape" of the signal is largely irrelevant, so we can probably use that to our advantage (decorrelation to allow computation with such signals). So... 
say x = 100000 which is 5 decades, which is 100dB dynamic range. The information cost of this is about a factor 6000, meaning that only 1/6000 of the information is actually useful. That's a lot of room to put some extra stuff!. Think of this: to change a bit from 0<->1 adds/subtracts energy that can be seen in the base band (impulse response of the reconstruction filter) but to switch the position of 2 adjacent complementary bits has almost no effect in the base band as this blip has no DC component. As a result it is probably possible to add random permutations to the output bits without this being noticed. Anyways, this also makes it clear why it's probably best to use some steeper filters as they bring the noise floor down. With o the order of the filter this becomes: o log x ------- x NEXT: Revisit the logic operations on (non-correlated) S/D signals + check HF noise modulated into base-band. [1] entry://20120704-134439 [2] http://en.wikipedia.org/wiki/Channel_capacity Entry: Haskell & simulation Date: Wed Jul 4 20:28:38 EDT 2012 It would be a good opportunity to put the S/D modeling in Haskell first. The digital part is straightforward. The missing component are integrators. Differentials can be computed using autodiff. Once this works, deriving a generator for an implementation should be straightforward. Entry: Basic Hypothesis Testing Date: Sun Jul 8 15:39:34 EDT 2012 1. Formulate a hypothesis 2. Test if it agrees with the data Questions: What "form" does a hypothesis have? A function? A relation? How to test? What form does a test have? Probability? Entry: Understanding Density Date: Tue Jul 10 12:34:06 EDT 2012 Some things have popped up lately that are all related to the problem of density: a bunch of discrete things interact in an intractable way, but when this exact representation is integrated or averaged it can represent a very nice model. Usually when 1. the granularity is fine enough and 2. the entropy is high enough, cross terms drop out and there is a (information-reducing) "morphism" between operations in the discrete domain and operations in the density domain. I wonder, how to qualify "enough" for the properties of granularity and entropy. Some concrete examples: - The use of money in the form of discrete transactions in time. To map this to a density model (money stream) makes it easier to think about. - Sigma/Delta modulators: all operations are discrete both in time and space: fixed sampling frequency and 0/1 bit levels, but the part of the signal we're interested in is a forgetful time integral = low pass filter. - Sound is a phenomenon that can be modeled as fluctuations of densities of particle positions and velocities. All density representations seem to be based on some form of smoothing = convolution of a discrete distribution with a smooth, local kernel like, e.g. a bell curve. Entry: Density of monetary transactions Date: Tue Jul 10 12:46:01 EDT 2012 See last post. Compared to Sigma/Delta and sound waves, monetary transactions seem to be different. Why is this? The problem is that simple smoothing doesn't seem to do the trick. It appears that some transactions have an intrinsic "bulkiness". I.e. a $1000 transaction that happens once a month (e.g. rent payment) is different in kind than a $10 transaction that happens every day (e.g. a meal at a fast food restaurant). Modeling money streams using a density model seems to need to take this into account: how much time does a transaction cover? I.e. 
we want to recover the "processing" that is done to the transaction. Conclusion: monetary transactions are not "natural" in the sense that they miss information, i.e. the intrinsic periodicity of a transaction. Is there a good way to attribute a time scale to transactions that is meaningful? The goal for (home) finance analysis is to distinguish recurring costs (taxes, rent, food) from one-time costs.

Entry: Analog Synth using Sigma/Delta signals
Date: Tue Jul 10 15:40:09 EDT 2012

So the question is: do the calculations on expected values in [1] actually make sense? Due to modulation, part of the spectrum will probably end up in the base band, but maybe this doesn't show up in the expected value because the average is invisible? It's probably best to code up a couple of examples and compute spectra. However, it still seems that if the noise is not correlated, the result should be 0 at DC. With C = A x B, the spectra are

          _
  C(f) = _/ A(f-x) B(x) dx

It really seems that the trick is non-correlated noise. In the following

          _
  C(f) = _/ [ A(f-x) + A_n(f-x) ] * [ B(x) + B_n(x) ] dx

the convolution of the noise terms should be zero.

       _
  0 = _/ A_n(f-x) B_n(x) dx

If the bands of A(f) and B(f) are small, the noise band in C(f) will only grow a little, but it will grow: any non-DC signal A(x) | B(x) smears out the noise B_n(x) | A_n(x) a little. So it seems that [1] is sound, given that there is no correlation. The real question is then: how to decorrelate S/D streams?

[1] entry://20090609-111303

Entry: Bitswapper
Date: Tue Jul 10 16:26:28 EDT 2012

Bitstream i in, bitstream o out, such that i _|_ o but /i = /o. Something like this: circular shift register, read from a random position n, then send out that bit and read into position n. Repeat. This guarantees there is no average energy lost, but the result is probably not very correlated with the input. Implementation: barrel shifter with log_n random input bits, or an addressable memory. There has to be a relation between the size of the store and the quality of decorrelation. How does this work?

  1-bit: no decorrelation
  2-bit: either pass or bump

Note that while this might reduce correlation (sum of element-wise product) it does not mean streams are statistically independent, because they are related through the property of average. Even if this doesn't work for anything hi-fi, there must be some pretty interesting effects hidden here. Convert a PCM signal to S/D, then apply this bit algo. Such a thing is not so hard to build, so let's try a Pd external. What are the statistical properties of such a decorrelator? It's probably simplest to start with a couple of concrete examples and work from there.

  2 bit
  -----
  50% D=1 input   50% state
  50% D=2 input   50% state
  ...

The output bit is: 1/2 x[1], 1/4 x[2], 1/8 x[3], ... This looks suspiciously much like a lowpass filter. On average, the effect is a filtering with h[t] = 2^-t for t>0 and h[t] = 0 otherwise.

  3 bit
  -----
  1/3 D=1   2/3 state
  1/3 D=2   2/3 state
  1/3 D=3   2/3 state

The output bit is: 1/3 x[1], 2/9 x[2], 4/27 x[3], ... or weight 1/2 (2/3)^n for x[n]. Looks like this generalizes..

  4 bit
  -----
  1/4 D=1   3/4 state
  1/4 D=2   3/4 state
  1/4  3/4^2  3^2/4^3  ...

For p>1 bits, the shape is 1/(p-1) ((p-1)/p)^n, meaning that if p grows, the tail gets longer. It seems that the "interesting effects" is just one-pole low-pass filtering. But hey, that's a nice primitive! Still, it's quite an inefficient way of storing the data. It seems that this can be implemented in a black box way. Actually, it doesn't look like this is correct.
The state shrinks in the different cases: 1/4 D=1 3/4 state 1/3 D=2 2/3 state 1/2 D=3 1/2 state actually that can't be true. This is only for a non-recursive variant, where we shift out, then write in a random place. The first swapper algo is definitely infinite tail. The question is though: what kind of baseband noise does this introduce? Entry: Inverse of 7 mod 2^32 Date: Tue Jul 10 20:27:57 EDT 2012 7 * x = 1 mod 2^32 How to compute? Only exhaustive? Entry: Counter bitswapper Date: Wed Jul 11 13:33:25 EDT 2012 It seems that picking a random bit from a set of bits can be represented much more efficiently using a counter. Keep track only of the number of bits that are in the bucket, then randomly generate a 0 or 1 based on this level and update the counter. If this behaves the same, then the "impulse response" should be the same also. Though I don't see a way of how to compute it. Entry: FFT / DFT Date: Sat Jul 14 16:27:00 EDT 2012 Type: tex I'm trying some DSP stuff in Haskell and I need to find a nice way to compute some spectrum diagrams from what are essentailly infinite streams. Starting from the Cooley-Tuckey algorithm, how to best express it in terms of infinite streams such that precision can be on demand? The radix 2 FFT is most easily explained in terms of the DFT of $N = 2^n$ size signals, with $x_N[0 \ldots N-1]$ the signal, $X_N[0 \ldots N-1]$ the spectrum and $w_N = e^{- i 2 \pi / N}$ $$X_N[f] = \sum_{k=0}^{N-1} w_N ^ {kf} x_N[k].$$ The divide-and-conquer step is based on taking the even and odd components of the input stream $$X_N[f] = \sum_{k'=0}^{N/2-1} w_{N}^{2k'f} x_N[2k'] + \sum_{k'=0}^{N/2-1} w_{N}^{(2k'+1)f} x_N[2k'+1].$$ Using $w_{N}^2 = w_{N/2}$ this is $$X_N[f] = \sum_{k'=0}^{N/2-1} w_{N/2}^{k'f} x_N[2k'] + w_N^f \sum_{k'=0}^{N/2-1} w_{N/2}^{k'f} x_N[2k'+1].$$ Defining the decimated signals $x_{N/2}[k] = x_N[2k]$ and $x'_{N/2}[k] = x_N[2k+1]$ this gives $$X_N[f] = X_{N/2}[f] + w_N^f X'_{N/2}[f],$$ where we use the convention that capitalization relates signal and DFT. This recurrence relation produces a tree structure in the computational dependencies. A second important set of observations is that $X_{N/2}[f+N/2] = X_{N/2}[f]$ due to periodicity, and $w_N^{N/2} = -1$, which allows the reuse of intermediate results in the expression for $X_N[f]$ to also compute $$X_N[f+N/2] = X_{N/2}[f] - w_N^f X'_{N/2}[f],$$ This relation produces the butterfly structure. Is it possible compute the FFT of $x_4[0,1,2,3]$ by combining FFTs of $x_2[0,1]$ and $x'_2[2,3]$? This would give a simpler conversion between time and frequency domains, incrementally converting a hopped STFT to one with different decimation. Let's derive this special case to see how it would work. It's clear that $X_4[0] = X_2[0] + X'_2[0]$ and $X_4[2] = X_2[1] + X'_2[1]$ as they use the same weighting waveforms: all $(1,\ldots)$ and $(1,-1,\ldots)$. Can the other 2 components be computed? Doesn't seem so, as the waveforms are $(1,i,-1,-i)$ and $(1,-i,-1,i)$, and there's no way to compose that from $(1,1)$ and $(1,-1)$. So it seems it is important to note that the sub--FFTs are FFTs of \emph{decimated} signals. It's the decimation that allows composition like: $$(1,i,-1,-i) = (1,0,-1,0) + i(0,1,0,-1)$$ $$(1,-i,-1,i) = (1,0,-1,0) - i(0,1,0,-1)$$ Another way of explaining it is that the sub-FFTs need to contain low--frequency information. Entry: FFT: recursive construction of analysis functions Date: Sat Jul 14 22:31:22 EDT 2012 Some simplified visualization of composition of analysis functions in the FFT. 
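( Side note to the previous entry: the recurrence is small enough to state directly as a Haskell sketch -- list based, names made up, no attention paid to efficiency. )

import Data.Complex

-- Radix-2 decimation-in-time FFT, directly following
--   X_N[f]       = X_{N/2}[f] + w_N^f X'_{N/2}[f]
--   X_N[f + N/2] = X_{N/2}[f] - w_N^f X'_{N/2}[f]
-- Input length must be a power of two.
fft :: [Complex Double] -> [Complex Double]
fft []  = []
fft [x] = [x]
fft xs  = zipWith (+) evens twiddled ++ zipWith (-) evens twiddled where
  n        = length xs
  evens    = fft [x | (k, x) <- zip [0 :: Int ..] xs, even k]  -- X_{N/2}
  odds     = fft [x | (k, x) <- zip [0 :: Int ..] xs, odd  k]  -- X'_{N/2}
  twiddled = [ cis (-2 * pi * fromIntegral f / fromIntegral n) * o
             | (f, o) <- zip [0 :: Int ..] odds ]              -- w_N^f X'_{N/2}[f]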
A bin in an FFT/DFT is the input signal modulated by an oscillatory analysis function, then summed (averaged). The oscillatory functions have the property that time shift is the same as multiplication by a phase rotation factor. The idea is that for N, the basic oscillation has N phase states, here represented as numbers from 0 to N-1. N=2 0 0 0 1 N=4 0 0 0 0 0 1 2 3 0 2 0 2 0 3 2 1 N=8 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 0 2 4 6 0 2 4 6 0 3 6 1 4 7 2 5 0 4 0 4 0 4 0 4 0 5 2 7 4 1 6 3 0 6 4 2 0 6 4 2 0 7 6 5 4 3 2 1 Going from N=2 -> N=4 we have composition of 2 decimated signals with the following base functions. Here '.' means amplitude zero (we use 0 to indicate phase angle 0, amplitude 1). ( Expressed as angle = x \in [0,1] ) 0 . 0 . 0 . 1 . . 0 . 0 . 0 . 1 ( Expressed as angle = x \in [0,3] ) 0 . 0 . 0 . 2 . . 0 . 0 . 0 . 2 To create the 4 base functions for N=4 it is clear that the rotated odd sampled base functions need to be added to the even sampled ones to create all 4 base functions. Rotation of base functions commutes with weighted summation, hence in the FFT the rotation happens on the composition of FFT bins from the 2 shifted contributions, i.e. the combination of 2 weighted sums. even/odd contr rotation of odd base function ------------------------------------------------ N=4 0 . 0 . . 0 . 0 0 0 . 2 . . 1 . 3 1 0 . 0 . . 2 . 2 2 (= phase 0, ampl: -1) 0 . 2 . . 3 . 1 3 (= phase 1, ampl: -1) N=8 0 . 0 . 0 . 0 . . 0 . 0 . 0 . 0 0 0 . 2 . 4 . 6 . . 1 . 3 . 5 . 7 1 0 . 4 . 0 . 4 . . 2 . 6 . 2 . 6 2 0 . 6 . 4 . 2 . . 3 . 1 . 7 . 5 3 ... Note that a phase rotation by N/2 is a multiplication by -1, which allows for some optimizations. So what does an FFT do in plain language? 1) It uses 2 phase-shifted coarse analysis functions for frequency f (expressed wrt. decimation factor N) to construct a new, finer detailed analysis function for frequency f. This is done by applying a phase shift to the second analysis function such that the decimation gaps line up. 2) Using this finer analysis function f it constructs an analysis function f+N/2 by modulating the first one with 1,-1,1,... This doubles the frequency resolution on each step. Entry: Smoothing financial data Date: Sun Jul 22 08:52:46 EDT 2012 Using the heuristic that large amounts span larger time frames, it seems best to parameterize the data based on span ~ amount in such a way that the integral is equal to the size of the transaction. I.e. "shape-invariant" smoothing. How to implement? This needs an evaluation function that sums over all data points, evaluating the interpolating function, maybe a spline. To determine a proper scale it's probably best to sync to income, i.e. a quadratic spline spanning 2 months corresponds to the average monthly payment. Maybe negative and positive amounts should also be covered differently: income lags but expense trails. What do I need - CSV -> Haskell - Haskell -> Gnuplot Entry: Mean/variance estimator Date: Mon Jul 23 11:33:11 EDT 2012 For large number of samples this is just: m = \sum x_i v = \sum (x_i - m)^2 Samples are 10 bit, so there is room for 32 - 10+10 = 12 bits or 2^12 is 4096 samples, though this of course depends on the variance. What I want is some kind of sliding window. The simplest approach is to reset the accumulators. Doing it incrementally requires a delay line and memory to store the samples. So it seems the tradeoff here is bit depth: we get more effective bits if we represent the variance in terms of the current mean, and perform an adjust on every sample. 
This makes the updates more costly so probably not worth it. Entry: S/D notes Date: Sat Jul 28 20:38:19 EDT 2012 - From [1], not all possible code words are generated in the constant-input case. Can this be turned around and translated to an optimization problem where a signal waveform is generated such that noise is maximally decorrelated from signal, possibly storing these in tables? - Could subsampling/supersampling work, i.e. pitching? What's the effect on noise? If the samples are not short-time correlated, this could possibly work. ( It seems that a lot of applications arise when the correlations can be controlled.. ) - Can an incremental decorrelator work? I.e. starting from a linear approximation, is it possible to "push" parts of the noise to higher frequences? - Does it make sense to use 2-dimensional representations, i.e. complex signals? This gives more "room" to push the noise to? - Decorrelation: "more" decorrelation corresponds to low-pass filtering the signal, and probably also moving the noise into the lower bands. Investigate that latter part. - Can the decorrelator be used with fractional samples? Actually, it is a sigma-delta converter with non-deterministic outputs. Paper notes from trip: - Even when there is no correlation between 2 noise signals and their respective signals, XOR (multiplication) always introduces noise spectrum shifts based on the highest frequencies. However, in practice this might be controlled (i.e. reso lowpass), and not even a problem (human hearing drops to 0, doesn't roll of gently). - Sound synthesis, why the trouble? There should be no aliasing due to nonlinearities + AND gives a "free" nonlinearity. [1] isbn://0792393090 Entry: The non-deterministic S/D modulator Date: Sat Jul 28 21:53:32 EDT 2012 The decorrelator mentioned before can be generalized by taking a fractional bit input, and producing a random output bit based on the current accumulator. This is the basic building block for the whole S/D synth approach. How to make this well-defined? I.e. for a deterministic encoder a bit is generated if a threshold is reached. This needs to be reformulated. E.g. if state is 1.00 and input is 1.00, the probability of a 1 bit should be 100%, which then leaves the state at 50%. If state is 0 and input is 0, the probability of a 1 bit should be 0% If state is seeded by 0, and input is 0.50 on every step, by symmetry the probability of the output should always be 50%. What about this: - If state > 1 -> deterministic 1 - If state < 0 -> deterministic 0 - Else -> value = uniform probability for 1 This seems ad-hoc, however the hard limits are really caused by the hard limits of saturated 0/1 signals. It sees that in the non-deterministic case, the state needs to be able to "buffer" an "unlikely 1". Is there a way to make this buffer larger to spread it out more? I would guess there is going to be some tradeoff in the probability distribution. E.g. state is 0.01 and input is constant 0.01 which means there is a 1/50 chance the next pulse is a 1. If it is, then the next 99 pulses are going to be guaranteed 0. This seems too rigid. It is "correlation due to saturation". Is it possible to spread out pulses over a longer period of time? What is the difference with the decorrelator? It has an n-bit bucket (represented as a log(n) bit counter) and generates bits based on this counter. Big difference is that a separate decorrelator has the non-deterministic output outside of the S/D feedback loop... 
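( A starting point for such a simulation, assuming exactly the rules above: the accumulator state, clipped to [0,1], is used directly as the probability of emitting a 1. Names are placeholders. )

import System.Random (StdGen, mkStdGen, randomR)

-- Non-deterministic S/D modulator: s' = s + in, emit a 1 with probability
-- clip(s'), then s'' = s' - out.  Deterministic outside [0,1] as described.
sdRand :: StdGen -> Double -> [Double] -> [Double]
sdRand g st (x:xs) = b : sdRand g' (st' - b) xs where
  st'     = st + x
  (u, g') = randomR (0, 1) g
  b       = if u < max 0 (min 1 st') then 1 else 0
sdRand _ _ [] = []

-- e.g. take 20 $ sdRand (mkStdGen 0) 0 (repeat 0.3)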
It's probably time to let this sit a bit and/or do some simulations. Entry: Combine S/D mod + decorrelator Date: Sun Jul 29 17:28:29 EDT 2012 Issue: can SDM and decorrelator be combined in one or is it necessary to place them in series? What's the difference? Combined, there seems to be an arbitrary element about the "spread" of the energy mismatch, i.e. the tradeoff between decorrelation and exactness. Also, how does the mismatch effect noise shape? Integrating first-order S/D: s' = s + in s'' = s' - out out = f s' Here 'f' is a random generator using its stored energy as a probability to generate a "1" bit. Probability should approach 0 when bucket approaches 1 from above, and falls off based on the depth of the bucket. Main question seems to be: is the bucket bounded? I.e. does the chance for 0 or 1 approach 100% at some point. Seems best to make it symmetric: for generating a "1", s=0 is the p=50% point, s=-1 is p=0% and +1 is p=100%. Then spreading the bucket is to spread this out more. Entry: Switchcap Date: Mon Jul 30 19:53:05 EDT 2012 Paraphrasing Dan Piponi, sometimes you can learn more from trying to decipher a single sentence in an advanced paper that's way over your head, than from going through introductory or course material that is intended to teach you things gradually. I've seen this happen so many times: it's wrong and confusing until suddenly it's right and clear. Bing! For S/D I was reading this book[1] and it made me realize that a switchcap filter is discrete in time but continous in amplitude, while any other digital signal is both discrete in time and amplitude. However, the Z-transform used in digital signal processing is actually defined in terms of real number amplitudes which makes a switchcap filter a better representative of a discrete filter than an ordinary finite-precision digital filter! One of those things that's obvious and fairly basic in retrospect but hiding in a corner until it was made explicit.. [1] isbn://0792393090 [2] http://en.wikipedia.org/wiki/Switched_capacitor Entry: Random book? Date: Mon Jul 30 20:10:54 EDT 2012 Bookshelf diving. Right rack, starting at top from top to bottom. Book 6, page 76, Knuth TACP Vol1, Basic Concepts, Harmonic numbers, the Riemann zeta function. Book 13, page 105, Abelson&Sussman SICP, Example: Symbolic Differentiation, Book 18, page 95, Conway, A Course in Functional Analysis, The principle of uniform boundedness. Entry: Amplitude vs. frequency distribution Date: Tue Aug 7 19:26:11 EDT 2012 Probability distribution of amplitude has little to do with frequency distribution, which is related to conditional probabilities of different moments in time. ( Obvious, but apparently my intuition got off again.. ) Entry: Smoothing threshold detections Date: Wed Aug 8 10:48:22 EDT 2012 Input: threshold detections of maximum signal excursion, i.e. for a measurement sequence x[n], the input is the following reduction: in = max_{n=1:N} |x[0] - x[n]|} > thresh Entry: Expected value of maximum Date: Thu Aug 9 17:58:09 EDT 2012 How to compute the expected value of a normally distributed random variable, given the mean, variance and the number of samples taken? Entry: New method for solving linear equations over finite fields F_p, p prime Date: Fri Aug 10 09:50:40 EDT 2012 Fount this[1] on HN this morning. Maybe look into that a bit more, as I've recently found that part of my Sigma/Delta synth is going to be an algorithm with needs for noise. I believe Knuth has some topics on random algorithms in TAOCP. 
[1] http://rjlipton.wordpress.com/2012/08/09/a-new-way-to-solve-linear-equations Entry: Phases of 60 seconds Date: Tue Aug 14 09:45:18 EDT 2012 What's the simplest way to divide 60 seconds into a number of phases without performing division, i.e. only shifts? To do it exactly it's only possible to go up to 4/minute, by masking out everything except the 2 lowest bits, and comparing these to 0. Next up is 8 which will do: 0, 8, 16, 24, 32, 40, 48, 56, 60, ... so the last interval is 4 seconds instead of 8. Might not be so desirable. Entry: Meaning Date: Sat Aug 25 08:52:06 EDT 2012 Reply to Antti's article. - Ad-hoc meaning (Forth / Machine langauge / Lisp). The appeal here is definitely conceptual simplicity and expressive power. The big idea I think is feedback: should interpretation be strictly layered (ML / Haskell style, or even to some extent the Racket Scheme staged macros), or should it feed back onto itself: no distinction between data and code. This feedback loop seems to be essential to keeping things conceptually simple at the base level. - Types. While typed languages are in some sense less elegant due to their ad-hoc structure, these systems can be useful for generating a different kind of expressive power. E.g. dispatching based on function return type for Haskell style type classes. This kind of "precognition" is of a different nature than purely local meaning available in feedback systems. However, heterogenous type/meta systems impose a lot of problems. Most importantly maybe: simplicity of the core system is lost (e.g. again Haskell: type system is very complex compared to the lambda calculuse it "protects"). Basically, compared to the conceptual beauty of a feedback system (same language is base and meta), a layered type system has a different kind of freedom: meta level and language level do not need to correspond, and thus can very widely. ( Note that in Haskell there are essentially 3 levels: 1. Base lambda calculus, 2. the type system visible by the programmer, and 3. the implementation of all the different type system extensions which map abstract logic stuff to more simple constructs. To be fair, 3. needs to be counted; it seems to be necessary to keep piling extension onto extension to work around the limitations of the type system. This has to be done as part of the compiler, not as part of a program. It is not very modular! And this work cannot be done by mortals either... ) - On the practical side, after using both approaches fairly intensively I'm still not sure which one is better, but it's obvious they are different and lead to a different way of thinking. One advantage of strong type systems and impossibility of "language feedback" is that they seem to be easier to read. The ad-hoc nature of the "implicit meaning" that is often present only at the time the programmer writes an ad-hoc feedback structure is very hard to reconstruct from the code, and is often a "short term memory" organization in the programmer's head. ( Of course here "programmer" is me in my own experience. ) From my experience, writing very "semantically stacked" code in a dynamic/feedback language is a sure way of pissing off people that need to maintain your code after you're done with it! As for writing in Haskell vs. writing in Scheme. In Haskell I get a good feeling of "wow, finally" being able to express the structure of what I want to do. This often leads to a better understanding, but being required to do so can be a pain in the ass. 
In Scheme things go a lot faster but often I lose track because of some hidden inconsistencies or differences between what I think it does and what it is actually doing. The latter almost never happens in Haskell because its meaning is more rigid.

Entry: The Joy of Cats
Date: Sun Sep 2 17:42:06 CEST 2012

[1] http://katmat.math.uni-bremen.de/acc/acc.pdf

Entry: Stochastic vs. Deterministic
Date: Sun Sep 2 17:42:47 CEST 2012

Goal: Find a good mathematical model for representing a real-valued, discrete time "low pass" signal s[n] encoded in a fixed sample rate binary signal b[n]. This consists mainly of finding a morphism between operations on b[n] and real-valued continuous time signals, i.e. map the signals + a collection of meaningful operations.

Status: The main heuristic is that "proper decorrelation" is a good idea to build an interesting morphism.

  b[n] = s[n] + e[n]

b[n] and s[n] are related through some kind of linear smoothing / low-pass filtering operation. An important property is that s[n] is recoverable from b[n] such that

  R(b[n]) = s[n] + e'[n]   with |e'| << |e|

meaning that a recovery function R takes the binary sequence b and produces an approximation to s that is much better than b itself. Ideally |e'| = 0. Currently it is assumed that R is a linear low pass filter, which means that e is a high pass signal: all the noise is in the higher frequency bands. Representing b[n] \in {0,1} allows the use of the real multiplication operator to represent the boolean function AND. To make multiplication of two binary signals b1[n] and b2[n] meaningful, the correlation of their noise components should be meaningful:

  b1 = s1 + e1
  b2 = s2 + e2

  b1 b2 = s1 s2 + s1 e2 + s2 e1 + e1 e2
  b12   = s12 + e12

In some sense, the resulting signal should have the same properties, such that

  R(b1 b2) ~= s1 s2

Sticking to the linear filter reconstruction property, we need e12 to be high pass so it can be filtered out by R. This gives 3 conditions:

  R(s1 e2) ~= 0
  R(s2 e1) ~= 0
  R(e1 e2) ~= 0

of course still satisfying the reconstruction properties for the signals themselves, next to the reconstruction of the product above:

  R(e1) ~= 0
  R(e2) ~= 0

If R is linear, these equations are weighted infinite sums. The question is now: given arbitrary s1 and s2, is it possible to construct b1 = s1 + e1 and b2 = s2 + e2 such that the constraints are met? My guess is that it is not possible to do this exactly and thus this needs an error measure to quantify the approximation "~=".

Entry: Where is the low frequency noise in S/D approximation?
Date: Tue Sep 4 14:07:56 CEST 2012

( Context: understanding properties of the binary sequence produced by a Sigma/Delta modulator. )

Supposedly (following from linear approximation and white quantisation noise assumptions) the frequency envelope of the S/D approximation noise is F^{-1}, where F is the loop filter, e.g. lowpass one pole or integrator. Then I seem to have a paradox. Suppose we're approximating a constant signal s[n] = 2/3 with the periodic sequence b = 1,1,0,... This gives a periodic approximation error e = 1/3, 1/3, -2/3, ... (following b = s + e). This b is the output of a S/D modulator with F = 1/(z-1), a pure (summing) integrator with pole at DC: w_0 = 0, e^{j w_0} = z_0 = 1. Where is the low frequency component of e? The signal is periodic with period 3, meaning all frequency components are multiples of fs/3 which is high, meaning there are no low frequency components in e. Is this because linearized analysis is too crude an approximation?
This is a proper S/D signal? It is the output of a S/D modulator with F = 1 / z-1, a pure (summing) integrator with pole at DC: w_0 = 0, e^{j w_0} = z_0 = 1. Is it possible to construct an error signal that does have a low-frequency component? I'm guessing that this would be the case for irrational constant signals. Just as in the analysis above, if s[n] is constant, the frequency content of b and e are the same, so it is enough to limit the analysis to b. Picking F = (z-1)^-1 allows a similar simple accumulation model to generate the binary sequence. As irrational number, pick 1/sqrt(2). What is the sequence? This can be represented by a graph of y = D int(x/sqrt(2)). Each impulse is a 1. This sequence will be a-periodic, which means that it will have non-vanishing arbitrarily low frequency content: for any period p, no matter how large, we can never have 1/sqrt(2) - \sum_{k=1}^p b[k] = 0 To construct an approximation to a constant signal that has some some low frequency content in the error signal, we simply need to pick the period large enough. If s[n] = q/p with q= p s'' = s' - b * p -- N periods sdn n q p = take (fromInteger $ n * p) $ sd q p -- Print ps = concat . (map f) where f 0 = "." f _ = "X" -- Build a spectrum: wave forms for values from 0/p to p/p spec n p = concat $ map (++ "\n") lines where lines = map (\q -> ps $ sdn n q p) [0..p] s n p = putStr $ spec n p Example: 2 periods of all x/37 waveforms *Main> s 2 37 .......................................................................... ....................................X....................................X ..................X.................X..................X.................X ............X...........X...........X............X...........X...........X .........X........X........X........X.........X........X........X........X .......X......X.......X......X......X.......X......X.......X......X......X ......X.....X.....X.....X.....X.....X......X.....X.....X.....X.....X.....X .....X....X....X.....X....X....X....X.....X....X....X.....X....X....X....X ....X....X...X....X....X...X....X...X....X....X...X....X....X...X....X...X ....X...X...X...X...X...X...X...X...X....X...X...X...X...X...X...X...X...X ...X...X...X..X...X...X..X...X...X..X...X...X...X..X...X...X..X...X...X..X ...X..X...X..X..X...X..X..X...X..X..X...X..X...X..X..X...X..X..X...X..X..X ...X..X..X..X..X..X..X..X..X..X..X..X...X..X..X..X..X..X..X..X..X..X..X..X ..X..X..X..X..X..X.X..X..X..X..X..X.X..X..X..X..X..X..X.X..X..X..X..X..X.X ..X..X.X..X..X.X..X..X.X..X..X.X..X.X..X..X.X..X..X.X..X..X.X..X..X.X..X.X ..X.X..X.X..X.X..X.X..X.X..X.X..X.X.X..X.X..X.X..X.X..X.X..X.X..X.X..X.X.X ..X.X.X..X.X.X..X.X.X..X.X.X..X.X.X.X..X.X.X..X.X.X..X.X.X..X.X.X..X.X.X.X ..X.X.X.X.X..X.X.X.X.X.X..X.X.X.X.X.X..X.X.X.X.X..X.X.X.X.X.X..X.X.X.X.X.X ..X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X..X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X .X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.XX.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.XX .X.X.X.X.X.XX.X.X.X.X.X.XX.X.X.X.X.XX.X.X.X.X.X.XX.X.X.X.X.X.XX.X.X.X.X.XX .X.X.X.XX.X.X.XX.X.X.XX.X.X.XX.X.X.XX.X.X.X.XX.X.X.XX.X.X.XX.X.X.XX.X.X.XX .X.X.XX.X.XX.X.XX.X.XX.X.XX.X.XX.X.XX.X.X.XX.X.XX.X.XX.X.XX.X.XX.X.XX.X.XX .X.XX.X.XX.XX.X.XX.XX.X.XX.XX.X.XX.XX.X.XX.X.XX.XX.X.XX.XX.X.XX.XX.X.XX.XX .X.XX.XX.XX.XX.XX.X.XX.XX.XX.XX.XX.XX.X.XX.XX.XX.XX.XX.X.XX.XX.XX.XX.XX.XX .XX.XX.XX.XX.XX.XX.XX.XX.XX.XX.XX.XXX.XX.XX.XX.XX.XX.XX.XX.XX.XX.XX.XX.XXX .XX.XX.XXX.XX.XX.XXX.XX.XX.XXX.XX.XXX.XX.XX.XXX.XX.XX.XXX.XX.XX.XXX.XX.XXX .XX.XXX.XXX.XX.XXX.XXX.XX.XXX.XXX.XXX.XX.XXX.XXX.XX.XXX.XXX.XX.XXX.XXX.XXX 
.XXX.XXX.XXX.XXX.XXX.XXX.XXX.XXX.XXXX.XXX.XXX.XXX.XXX.XXX.XXX.XXX.XXX.XXXX .XXX.XXXX.XXX.XXXX.XXXX.XXX.XXXX.XXXX.XXX.XXXX.XXX.XXXX.XXXX.XXX.XXXX.XXXX .XXXX.XXXX.XXXX.XXXXX.XXXX.XXXX.XXXXX.XXXX.XXXX.XXXX.XXXXX.XXXX.XXXX.XXXXX .XXXXX.XXXXX.XXXXX.XXXXX.XXXXX.XXXXXX.XXXXX.XXXXX.XXXXX.XXXXX.XXXXX.XXXXXX .XXXXXX.XXXXXX.XXXXXXX.XXXXXX.XXXXXXX.XXXXXX.XXXXXX.XXXXXXX.XXXXXX.XXXXXXX .XXXXXXXX.XXXXXXXX.XXXXXXXX.XXXXXXXXX.XXXXXXXX.XXXXXXXX.XXXXXXXX.XXXXXXXXX .XXXXXXXXXXX.XXXXXXXXXXX.XXXXXXXXXXXX.XXXXXXXXXXX.XXXXXXXXXXX.XXXXXXXXXXXX .XXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXX .XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Entry: S/D conclusions Date: Tue Sep 4 22:13:26 CEST 2012 * From wave form pictures: nothing looks really special apart from low-denominator rational numbers. * From analysis: it's clear why for low-denominator rational numbers there is no low-frequency content, and why there is some LF content for high-denominator rational numbers, but not (yet) clear that this follows the frequency slope from linear analysis (== inverse of integrator == high pass noise). * The non-leaky 1st order accumulator is really a value -> frequency converter. It has non-vanishing autocorrelation. From this another suggestion for decorrelation can be suggested: integrator state increments/decrements with average 0. Entry: Plotting the approximation error Date: Tue Sep 4 22:55:11 CEST 2012 What to plot? Start with bit depth, e.g. d=16 bits. This gives p=2^d For each wave form assocated to q:1->2^d-1, plot the FFT on a log/log plot. This gives a density plot that gives an idea of the distribution of the error assuming uniform input amplitude distribution. In first approximation maybe a density plot isn't necessary. Just the average over all wave forms might be good enough? Though it would be nice to also have variance and shape of distribution. Entry: How to make a good density plot? Date: Tue Sep 4 23:05:00 CEST 2012 I know of only two ways: - Histogram: tuning is the interval resolution. Trade off: more space resolution gives less density resolution. - Kernel method: Each discrete event is spread out in space. Basic "tuning" parameter is the size of the kernel. Trade off: more space resolution 9smaller kernel size) gives a more bumpy density curve. I can think of some mods: - Adaptive kernel? (E.g. for financial data?) Decrease kernel size in high-density regions. It would be good to figure out a good way to do this in Haskell + Gnuplot. Probably needs the preprocessing (events -> curve) in outside of Gnuplot. Entry: Closed form expression for A1 modulator output + Fourier spectrum Date: Wed Sep 5 14:06:19 CEST 2012 An indication of the "noise" component in a S/D signal from a constant, q/p rational-valued signal, using a non-leaky 1st order integrator / accumulator (an "A1 modulator"), can be computed as the lowest harmonic of the Fourier expansion of the p-periodic binary output signal b[k]. \sum_{k=1}^p b[k] w^k with w=e^{i 2 \pi / p} In a single period b[1] to b[p] we have q nonzero values. This follows from the average q/p and b[i] \in {0,1}. The indices of these non-zero sequence elements are given by I = { ceil( k p / q ) | 1 <= k <= q } The state of an A1 modulator can be represented as the rotation angle of a regular polygon with p sides, where the input (the constant number q/p) represents the constant rotation to be performed at each time step. 
The output of the modulator is 1 at each time step that a new full revolution has been completed. To find the indices at which these revolutions complete, first approximate them fractionally, solving for n in q/p * n_f = k or n_f = k * p/q, reasoning that a full revolution is completed for each integer k. However, n cannot be fractional so if n_f is not an integer, the full revolution will be completed at the nearest later time index obtained by rounding n_f to the nearest larger integer n = ceil(n_f). This corresponds to the definition of I. ( Note that because q <= p making p/q >= 1, this equation never yields the same index more than once. ) In an example, q/p = 3/11 we have n = k*11/3 k 1 2 3 n_f 1*11/3, 2*11/3, 3*11/3 3.67, 7,33, 11 n = ceil(n_f) = 4, 8, 11 The output bit sequence is: 1 2 3 4 5 6 7 8 9 10 11 0 0 0 1 0 0 0 1 0 0 1 and the amplitude of the lowest harmonic in the Fourier analysis is w^4 + w^7 + w^11 != 0 w = e^{2 \pi / 11} which is almost zero (easy to see when drawing the 3 vectors on paper) but not quite. It would be zero if fractional indices were allowed, making the 3 vectors the sides of a regular triangle. w^{1 * 11/3} + w^{2 * 11/3} + w^{3 * 11/3} = 0 Conclusion: The existence of a closed form expression for the binary sequence (I) avoids having to perform a simulation to generate the bit sequence. -- Closed form expression for indices of non-zero sequence elements. nz' q p = map n [1..q] where n k = ceiling $ k * p / q -- Type conversion nz q p = nz' (i q) (i p) where i = fromIntegral Entry: Approximation of error analysis Date: Thu Sep 6 13:17:31 CEST 2012 \sum_k exp(i 2 \pi ceil(k p /q) / p) - exp(i 2 \pi k / q) The second term is zero when summed over k:1->q so the above gives an exact form for the first harmonic of the approximation. If p is large, the two angles are similar and a linear approximation can be used: exp(i a) - exp (i b) = exp(i b) * (exp (i (a - b)) - 1) ~= exp(i b) * i (a - b) This then gives: \sum_k exp(i 2 \pi k / q) * (i 2 \pi) * (1/p) * fract(k p / q) where fract(x) = ceil(x)-x Note this is another DFT 1st harmonic expression for period q. It's not easy to see how this can be approximated by a constant. The 1/p factor already gives the 1/p characteristic we're looking for, so the rest should somehow be The fract (k p / q) signal is also fairly regular. It has to be periodic with period q, so there should be a straightforward way to compute this. With r = p mod q, this is the same as fract (k r / q). I.e. q = 3, r = 2, this gives the sequence: 2/3, 1/3, 0. So it seems that depending on the input constant, all cases will be covered. Next: closed form (approximate) formula? Entry: Dynwav Date: Tue Oct 2 11:09:29 EDT 2012 The good thing about dynwav (generating wave tables for wave table playback) is that it decouples instantaneous phase/time from instantaneous frequency. Something which isn't possible on the analysis side, meaning information is lost in the process. The bad thing is that it is too flexible, so the problem to solve really is to make generation of spectral shapes simpler, and to tie it into the current playback pitch. One of the effect's I was thinking about is a smooth morph from unimodal to bi-modal spectra, to make it more voice-like. The most important technical problem is T/F allocation, i.e. size of the DFTs based on the pitch: there is no point using a large wave table for high-pitched sounds, as most of the higher pitches need to be zeroed out to prevent aliasing. 
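To make the T/F allocation point concrete, a small Haskell sketch (names invented, and real spectra would of course be complex-valued): the number of usable harmonics follows directly from the playback pitch, and everything above it gets zeroed before the IFFT.

-- Highest harmonic of fundamental f0 that stays below Nyquist at rate fs.
maxHarmonic :: Double -> Double -> Int
maxHarmonic fs f0 = floor (fs / (2 * f0))

-- Zero all bins above that harmonic (bin 0 = DC, bin k = k-th harmonic).
bandlimit :: Double -> Double -> [Double] -> [Double]
bandlimit fs f0 = zipWith keep [0 ..] where
  keep k a = if k <= maxHarmonic fs f0 then a else 0

E.g. at fs = 44100 and f0 = 1000 only 22 harmonics survive, so a 2048 bin table is mostly wasted.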
Bringing this to market, it might be good to keep the engine internal and not make the frontend too complicated. Just add plugins with synthesis algos that perform the spectrum update, so the focus is the engine.

  pitch -> CHUNKER -> out

Based on pitch, the chunker will use different chunking sizes. The idea is to keep the table size similar to the time update.

- chorus / flanger: almost for free as multiple readouts
- chords: meaning same-spectrum chords. These can be performed by sorting the pitches in ascending order, generating them in order, and pre-filtering the spectrum to avoid aliasing.

Entry: 2 forms of LFSRs
Date: Wed Nov 7 19:58:48 EST 2012

Fibonacci: shifts IN a bit based on the XOR of a pattern[1].

Galois: a direct implementation of multiplication by x to produce a sequence 1, x, x^2, ... modulo p(x): take the bit that shifts out and use it to conditionally XOR the lower coefficients of the polynomial with the state vector (used in Staapl).

The former is a Fibonacci LFSR, while the latter is a Galois LFSR. Both are easy in hardware, but the latter is easy in software due to the availability of a "conditional parallel XOR".

[1] http://en.wikipedia.org/wiki/Linear_feedback_shift_register

Entry: Oversampled signals
Date: Sun Nov 11 20:38:34 EST 2012

A while ago I read something about the 'd'-transform, which is essentially the 'z' transform where the unit operator is not a delay but a difference.

  d = z - 1

This encoding avoids the necessity for high precision coefficients, i.e. the sensitivity of the transfer function to the coefficients is reduced. Obviously, some kind of high precision is necessary to make room for the "DC component" such that the basic difference op is a naked subtract. For low-frequency signals, most of the action is around z=1, which means around d=0. Around the origin, the d domain looks like the analog s domain (where s is differentiation).

( I'd like to use this for oversampled "virtual analog" simulation, using the MMX instruction set. It has only a 16 x 16 -> 32 bit multiplication, but this would fit nicely. The reason for using fixed point is to be able to use cheap saturation for emulation of power-supply clipping and simple wrap operations for implementing phasors. It should also make porting to fixed point DSP simpler. )

So, basically, this would work mostly as an analog computer. Let's rebuild some of that intuition.

1. If 'd' is a difference / 's' is a derivative operator, why do analog circuits have integrators?

In engineering school they told me this is because differentiation is too noisy. While this might be true, it's not a good thing to tell a novice. The real reason is that the underlying model is a differential equation, i.e.

  d/dt (x) = A(x) + B(u)

Here x is a state vector, u is an input vector, d/dt is the derivative operator and A(.) and B(.) are vector->vector functions of the appropriate arity. If you integrate the whole thing you get an explicit expression of the x in terms of integrals of functions of x. For an analog circuit, that equation directly describes the voltages (or currents) at a specific point.

2. Is it OK to pretend that d == s ?

Yes, keeping in mind that the stability requirements are slightly different. For the human ear however, as long as the poles are close enough to d = 0, the difference in transfer function between the (oversampled) digital d operator and the analog s operator is very small. In other words, the d plane is just a shifted z plane, but looks very similar to the s plane for small d.
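( To make the d = z - 1 point concrete, a small Haskell sketch -- floats here, but the shape is what matters for fixed point: the state update is an explicit difference, and for poles near z = 1 the coefficient c stays small. )

-- One-pole lowpass in 'd' form: d(s) = c * (x - s), i.e. the state is a
-- discrete integrator driven by a scaled difference (the naked subtract).
lp1 :: Double -> Double -> [Double] -> [Double]
lp1 c s (x:xs) = s' : lp1 c s' xs where s' = s + c * (x - s)
lp1 _ _ [] = []

For an oversampled system c is roughly w0/fs, i.e. tiny, which is exactly where storing the coefficient of d instead of the coefficient of z pays off.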
Entry: Using bilinear texture sampling for anti-aliasing
Date: Thu Jan 3 17:29:42 EST 2013

This would work for anti-aliasing lines, but not for complex shapes; they need mipmapping.

EDIT: Indeed, works well. The thing to keep in mind here is to keep the aspect ratio. Mipmapping uses squares, so when you drop resolution on one axis, the same happens for the other one, causing visible gradients/blur.

Maybe for what I want to do, a relational model will be a lot easier to manage. Rendering is essentially off-line so there is no time constraint.

Entry: Relational anti-aliased rendering
Date: Fri Jan 4 11:39:46 EST 2013

The "stupid slow" solution to anti-aliased rendering of basic geometric shapes. The idea is to render (unions / intersections of) coordinate membership functions. It seems simplest to not assume anything about the primitive inequalities (borders) used for finding pixel values, but to simply do crude oversampling. To get to 256 greyscale levels one needs 16x16 oversampling (8x8 only gives 64 levels).

Entry: Synchronizing "simultaneous" distributed updates.
Date: Fri Jan 11 17:13:19 EST 2013

Suppose 2 data stores D and D' are linked by a non-instantaneous link which is used to keep the state in both stores the same, D == D'. However, it might happen that a data item d @ D gets updated at the same time as its twin d' @ D', both to different values. The finite time delay between the updates makes this problem a practical one. How does this usually get solved? Supposedly there is some kind of arbitration order imposed on the updates, either by site or in real time.

EDIT: I found a solution I like because of its simple symmetric implementation:

- store prev and cur values in both A and B sides.
- if (cur != prev) { send(cur); prev = cur; }

This works well for non-simultaneous updates. E.g. if a value in A changes, a message A->B is sent once. After B receives, it will send this value back to A. This has the problem of concurrent updates causing an oscillation if queue read and write are interleaved as follows:

  A: B->set(1)
  B: A->set(2)
  A: receive set(2)
  A: B->set(2)
  B: receive set(1)
  B: A->set(1)
  ...

To break this oscillation some kind of order needs to be introduced: set one of the two as master. E.g. A will always propagate on receive, but B will never. Note that one of the two needs to bounce the value, otherwise simultaneous value changes will not be resolved, and both sides will end up with different states.

Entry: Declarative line rendering
Date: Sun Jan 13 09:51:52 EST 2013

In OpenGL it is quite straightforward to use a 2-step rendering process. As opposed to using (straight-line) OpenGL primitives, this creates a bit more freedom.

1. Create a texture bitmap
2. Render the texture bitmap to screen

Creating texture bitmaps can be an offline process, so it is not so much resource constrained. The approach I'm using is to create anti-aliased bitmaps directly from a specification function shape :: coord -> bool by evaluating it over a supersampled grid and performing an anti-aliasing averaging step. In C such a specification function would be:

  bool circle(float x, float y) { return x*x + y*y < 1.0; }

This works well, especially since it is not limited to straight lines as is the direct style OpenGL drawing approach. However, just like in the direct style, it can be a bit of a pain because it requires specification of both "edges" of lines. I want to just have one equation for a line, and parameterize that with a thickness. How to do this?

- Abstract shapes in a function that just parameterizes thickness directly.
- Use some kind of derivative trick to find the curve's normal. Entry: Integrated wavetable Date: Wed Jan 23 03:17:40 CET 2013 Following the anti-aliased sawtooth approach, this technique of course generalizes to pre-integration / post differentiation [1][2]. I wonder if it's possible to solve the resolution problem by using some kind of multi-scale representation, where the difference components are stored "amplified", and differentiation is performed on this representation. Important conclusions: higher order is better, but sensitivity to polynomial interpolation order increases, and sensitivity to numerical precision increases at high frequencies, [1] http://dafx12.york.ac.uk/papers/dafx12_submission_69.pdf [2] md5://b5f03aed795cd98807b6af3648593c57 [3] entry://20110104-153108 Entry: Hoeldrich method for phase generation Date: Sun Jan 20 23:40:13 CET 2013 As found in the Pd code, d_osc.c phasor~ [1][2]. The trick is to reset part of the word to a know value after update. #define UNITBIT32 1572864. /* 3*2^19; bit 32 has place value 1 */ [1] https://github.com/libpd/libpd/blob/master/pure-data/src/d_osc.c [2] http://music.columbia.edu/pipermail/music-dsp/2004-November/061991.html Entry: Faust, or point-free DSP Date: Mon Jan 21 14:12:49 CET 2013 While I very much like the idea, my current opinion is that lambda is king, and if one wants a combinator approach, it should be written on top of lambda. Both named/unnamed data approaches are valid, but writing unnamed on top of named (lambda) is easier than the other way around. I also no longer see the point of writing a DSL that is not embedded in an "old" tried-and-true functional language (Haskell or Racket). Writing languages and compilers for DSL is hard enough without having to also re-implement the low level details like symbol management and input parsing. That said, it might be an interesting application to implement Faust on top of the stream language I'm writing here[2]. [1] http://en.wikipedia.org/wiki/FAUST_(programming_language) [2] http://zwizwa.be/darcs/meta/rai Entry: Quantum Information Date: Sun Feb 24 23:53:24 CET 2013 An interesting talk[1] about interpretation of QM. Slides here[6]. Merits some investigation, some strong claims made here :) Seems that most of it comes from [7] and [12]. "We" are not made of atoms, we are made of (classical) bits. "Correlations without correlata" -- David Mermin[2]. That last phrase is in [11] and in a 1998 publication by Mermin[12]. Then he goes a bit on a roll: The cassical world is not real. We are just information. We are our thoughts. We are a simulation running on a quantum computer. Some quotes skipped in the presentation: ... the particle-like behavior of quantum systems is an illusion created by the incomplete observation of a quantum (entangled) system with a macroscopic number of degrees of freedom. ... randomness is not an essential cornerstone of quantum measurement but rather an illusion created by it -- Nicholas Cerf and Chris Adami I could trace the first quote down to Cerf and Adami paper[7]. I did not find many things that cite it, so not sure if this is a fringe thing... One is here[8]. Quotes skipped in the presentation: ... the particle-like behavior of quantum systems is an illusion created by the incomplete observation of a quantum (entangled) system with a macroscopic number of degrees of freedom. ... 
randomness is not an essential cornerstone of quantum measurement but rather an illusion created by it -- Nicholas Cerf and Chris Adami He also mentions Mermin's book[3]: Boojums All the Way through. Some Arxiv papers[4] on QI. Some of these mention QBism[5]. I could trace the first quote down to Cerf and Adami paper[7]. I did not find many things that cite it, so not sure if this is a fringe thing... One is here[8]. That paper is not in Cerf's publication list (through [9]) but it is probably similar to a bunch from that period that are, like [10] (paywalled). [1] http://www.youtube.com/watch?v=dEaecUuEqfc [2] http://en.wikipedia.org/wiki/David_Mermin [3] http://www.cambridge.org/gb/knowledge/isbn/item1139045/?site_locale=en_GB [4] http://arxiv.org/find/quant-ph/1/au:+Mermin_N/0/1/0/all/0/1 [5] http://en.wikipedia.org/wiki/Quantum_Bayesianism [6] http://www.slideshare.net/UnitB166ER/the-quantum-conspiracy-what-popularizers-of-quantum-mechanics-dont-want-you-to-know-by-ron-garret [7] http://arxiv.org/abs/quant-ph/9605002 [8] http://www.sfu.ca/~pbastani/cmpt881.pdf [9] http://quic.ulb.ac.be/members/ncer [10] http://dl.acm.org/citation.cfm?id=300811 [11] http://chaos.swarthmore.edu/courses/phys134/papers/mohr1.pdf [12] http://arxiv.org/abs/quant-ph/9801057 Entry: Time-variant filters Date: Thu Feb 28 20:47:36 CET 2013 What are the effects of time-variance of parameters on linear filters? Recent pubs are mostly about non-linearities in the signal path, but time variance also creates effects that depend on the circuit topology. Entry: Exponentials Date: Thu Mar 7 09:29:07 EST 2013 I'm running into some interesting problems with the audio synth stuff I'm working on. At some points in the program, (complex) exponentials are necessary. Some observations: - exp / sin / cos from math.h, libm.so seem to be a bit slow. - straight line curves in the form of complex exponentials, e.g. for parameter interpolation, can be generated incrementally using just multiply add. - for "algorithm" parameter transformation such as initializing interpolating exponential curves, the accuracy is key. - for "human" parameter transformation, i.e. things like mapping a control value to a frequency value, the accuracy of exp probably doesn't need to be so high, and approximation could be used. - Some interesting comments on stackexchange[1]. - Maybe this can be split into 2 parts: linear evolution based on previous coefficient + computation of updated coefficient using some iteration step, based on some setpoint. Conclusion seems to be that yes, exp and sin are expensive. [1] http://math.stackexchange.com/questions/18445/fastest-way-to-calculate-ex-upto-arbitrary-number-of-decimals Entry: Filter transforms Date: Thu Mar 7 21:54:11 EST 2013 - Forward/backward diff - Bilinear[1] - Impulse invariance (exponential pole mapping) [2] - Matched-Z (exponential pole/zero mapping) [3] Forward/backward diff is simplest, but has stability issues for very resonant poles. Only works in the limit when poles/zeros go to s=0 Bilinear maps zeros at infinity to zeros at -1 while exponential essentially samples the impulse response. [1] http://en.wikipedia.org/wiki/Bilinear_transform [2] http://en.wikipedia.org/wiki/Impulse_invariance [3] http://en.wikipedia.org/wiki/Matched_Z-transform_method Entry: Conservative nonlinearities - hamiltonian systems? 
Date: Thu Mar 7 22:00:33 EST 2013 Nonlinearities are cool but - they can produce aliasing - non-conservative systems are hard to manage Entry: Exponentials: exp(spline(t)) Date: Fri Mar 8 11:12:36 EST 2013 Goal: compute the pole path of a 2nd order IIR filter as an exponential of a complex spline curve, approximating a parameter curve which is itself exponential or sinusoidal. Simplification: exact curve tracking is not so important (say 1-2% is ok), but instabilities should not accumulate. Two subproblems: - Kernel routine: exp ( a0 + a1 t + a2 t^2 ) for ai complex and t typically 0->63. - Spline parameter update at block boundaries, compensating for any inaccuracies of the kernel routine. a1 and a2 are probably << 1, so some approximation might be used. Questions: * amplitude accuracy is probably more important than phase. Use followed by normalization? * use linear or multiplicative path. problem with linear path == more attenuation. * are there any useful bounds on the delta over 64 samples? e.g. more than pi/2 in 64 samples is probably exceptional. it seems delta is going to be small enough to allow for approximation of the exponentials occurring in endpoint/increment operations. -> what is the most natural thing to do here? ignoring numerical errors, it's probably possible to get exact figures too. Practically: limit to a linear spline. This makes the update equations for the complex filter coefficient curve something like this (for magnitude 1, c=cos(dp), s=sin(dp)): x+ = cx + sy y+ = cy - sx For numerical stability, best ordered as increments: dx = (c-1)x + sy dy = (c-1)y - sx here, (c + i s) comes from (c' + i s')^{1/64} questions: - when c and s are taken from approximations, what does (c + i s)^64 look like, ideally and numerically? - can we make sure that errors do not introduce instability, i.e. poles should stay inside the unit circle, or, filters should be amplitude-limited. - should phasor and amplitude be computed separately? - subproblem: given current phase/amplitude, compute a linear curve to the setpoint. Subproblem: linear curve to setpoint - In: phasor p(0), desired phase, log amplitude - Out: p(n) = p(0)q^n Given the polynomial used to compute sin/cos-1/exp, this is just a nonlinear equation, which can probably be solved using Newton-Raphson. Path: - construct equation - construct derivative expression (AI) - construct NR expression Find q s.t. p q ^n = exp (P') = p' Here p' is computed using some exp() approximation that is good enough for "humanly" accurate parameter mapping. Trouble is the presence of q, otherwise the ^n could be absorbed by the exp. What about expressing this differently: q = (exp(P') / p) ^ {1/n} q is very close to 1, so it might be better to express q'(P',p) = (exp(P') / p) ^ {1/n} - 1 To simplify, forget about the correction that takes into account current p vs desired p (previous setpoint): apply it later. The focus should really be on the evaluation of the exponential. q01(P1) = exp((P1-P0) / n) Given q01, we have p1 = p0 q01^n, which can be compared to the desired value exp(P1). If we don't make any correction here, eventually we'll get off track: any systematic error in q_n,n+1 is going to accumulate. Correction: approximate (exp(P1)/p1)^{1/n}, and use it to scale q01. Up to approximation, this will cancel *all* the errors accumulated. The approximation could use the log of p1, i.e. exp((P1 - log(p1)) / n). The weak point in all of this is log(p1), since p1 has the largest deviation wrt. the base of any expansion = 1.
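Before going further, a minimal C sketch of the incremental rotation update written earlier in this entry. The 440 Hz / 44.1 kHz figures and the 64-sample block are just example values, and in the synth the (c,s) pair would come from an exp() approximation rather than libm:

#include <stdio.h>
#include <math.h>

int main(void) {
    double dp = 2.0 * M_PI * 440.0 / 44100.0;   /* example per-sample phase step */
    double c = cos(dp), s = sin(dp);
    double x = 1.0, y = 0.0;                    /* phasor state, starting at angle 0 */

    for (int k = 0; k < 64; k++) {
        /* incremental form: (c-1) is small, so the increments carry the precision */
        double dx = (c - 1.0) * x + s * y;
        double dy = (c - 1.0) * y - s * x;
        x += dx;
        y += dy;
    }
    /* after 64 steps the phasor has rotated by 64*dp (sign as in the update above);
       |p| should remain very close to 1 */
    printf("x=%f y=%f |p|=%f\n", x, y, hypot(x, y));
    return 0;
}
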
What about incrementally computing the estimate of exp(P1) by successive squaring? It could even be spread over time. Conclusion: - Core routine is exp((P_k+1-P_k) / n), which is a small number -> gives fast convergence. - There is systematic drift. Some absolute value feedback is necessary to compensate. Combine setpoint exp(P_k) with computed value p_k. The exp(P_k) has larger error, but might be accurate enough to perform compensation. - Stability is probably most important: approximation exp'(x) such that |exp'(x)| < |exp(x)| Entry: Exponentials: approximation error should not introduce instability Date: Fri Mar 8 13:24:40 EST 2013 approximation exp'(x) ~= exp(x) s.t. |exp'(x)| < |exp(x)| One question that bugs me is that exp(x) = exp(x/2) * exp(x/2). The smaller the argument x gets, the faster the series converges. At what point does this successive squaring get imprecise? I.e. exp(x/2^n) might be more precise, but the squaring loses information again. Entry: Exponentials: reset at border Date: Sat Mar 9 10:45:18 EST 2013 Compute q_k = exp((P_k+1 - P_k) / n) such that a complex exponential can be generated as p_k+1 = p_k * q_k^n Assuming p_k = exp(P_k) and if the approximation of exp() is good, p_k+1 will be pretty close to exp(P_k+1). Maybe it is good enough to just set p_k+1 to this value? This gives a simple algorithm where noise (click) is only dependent on the accuracy of exp(), assuming that the numerical problem due to successive multiplication by q is small enough. Simple enough to test. Entry: Exponentials: Reset + Nth root Date: Sat Mar 9 17:18:06 EST 2013 Problem with the previous reset algo is that ^exp(P1) and ^exp((P1-P0)/n) can have a strong difference in precision. Alternative method: Compute exp(P1) using a "human precise" approximation, i.e. something that is good enough as a real end point, then interpolate from exp(P0) through the Nth root of ^exp(P1) / ^exp(P0). This takes care of accumulating errors, but requires more work. However, this might be solved by using a Newton-Raphson step from an initial value of ^exp((P1-P0)/N). Entry: Exponentials: push drift compensation upward Date: Sat Mar 9 17:26:12 EST 2013 Some experiments show that for the values involved, an order 5-10 Taylor series is going to be quite a good approximation. The remaining problem then is drift. These functions are to be used in LFOs and envelopes in a sound synth, so there might be some trade-offs to be exploited, i.e. for an envelope, the end state is 0 increment, which should remain stable. Maybe it's even possible to compute where we'll end up after one cycle, to compensate for that externally? So, given ^exp(), an approximation of exp(), a set-point curve at block boundaries (N), and an initial state, it is possible to compute the expected deviation after one "cycle" of the control wave. This can probably be used externally to compensate the control wave form on a larger time scale. Entry: Exponentials: soft reset (lowpass) Date: Sat Mar 9 17:48:47 EST 2013 Alternative method: Instead of resetting, use a "soft reset", i.e. a lowpass filter that pulls the state into the right direction. This should be enough to prevent drift, without introducing too much dependence on the accuracy difference of the ^exp(P1) vs ^exp((P1-P0)/N) terms. Entry: Exponentials: compute exp(x) and exp(iy) separately? Date: Sat Mar 9 18:12:44 EST 2013 Maybe not such a bad idea, since both might need separate precision, and it would free up the dependency chain. This essentially computes exp,sin,cos.
Using just taylor expansion, stability requirements are easy to satisfy, as for real exp the approx is strictly smaller, while for sin/cos it is satisfied if we stop at a negative term and stay in the first quadrant. Entry: Exponentials: exp(x) vs exp(x/2)^2 Date: Sat Mar 9 18:23:54 EST 2013 For approximations of exp(x) as truncated Taylor series, this might be an interesting problem. The main question is: halving followed by squaring is two multiplications, which is about the same complexity as adding a term to the series (multiply + add). What is the difference in precision? Entry: Synth sampling Date: Sat Mar 9 18:25:53 EST 2013 It seems that sticking with at least 2x oversampling (wrt usable audio rate) is going to be a good thing, as it makes a lot of the math simpler. Entry: Normalization Date: Sat Mar 9 18:27:23 EST 2013 Given a,b s.t. a^2+b^2 ~= 1, is there a fast way to push this closer to the unit circle? Direction doesn't matter so much. ( The a and b are sin and cos from a truncated taylor expansion. It is probably possible to find out which of the two is systematically more precise. ) Exact normalization would be scaling by r = 1 / sqrt(a^2 + b^2). This needs to somehow be approximated. Since r will be close to one, the following taylor series could be used: 1 / (1 + x)^1/2 ~= 1 - 1/2 x ;; Approximate normalization for a^2 + b^2 close to 1 ;; Exact coef: r = (a^2 + b^2)^1/2, a -> r a, b -> r b ;; Since this is close to 1 we can use 1st order Taylor: ;; r = 1 / (1 - x)^1/2 ~= 1 + 1/2 x with x = 1 - (a^2 + b^2) ;; Maybe using a -> a + (r1 * a) with r1 = 1/2 x is more stable? ;; Don't think so, since x does not have many significant bits, it's ;; probably OK to hide it in the LSBs of r. (define normalize-2D (lambda (a b) (let* ((a2 (* a a)) (b2 (* b b)) (x (- 1 (+ a2 b2))) (r (+ 1 (* 0.5 x)))) (values (* r a) (* r b))))) Entry: Exponentials: pole interpolation: linear vs. exp(linear) ? Date: Sat Mar 9 19:22:54 EST 2013 Since all values are going to be quite small, it might be good enough to just use linear interpolation for the pole positions instead of multiplicative updates. The thing is: pole Q is not going to change much. All control effort will be for pole frequency which is most musically interesting. When using linear interpolation for cos/sin also, the curvature will start to play a role when frequency difference is large. This introduces modulation in the Q, and reduces the average Q. A trade-off is probably possible: - linear interpolation for pole Qs - circular interpolation for frequencies + normalization at border r += dr p_x += (c-1) p_x + s p_y p_y += (c-1) p_y - s p_x Entry: Exponentials: update Date: Tue Mar 12 17:29:26 EDT 2013 Given a good estimate of exp(x0), how to compute exp(x0+dx) ? This is easy: exp(x0) * exp(dx). Doing this many times gives the same geometric series as before: exp(x0) * exp(dx)^n Entry: Exponentials: stable update Date: Wed Mar 13 09:27:29 EDT 2013 I'm still not convinced that it's not possible to compute a stable update for frequency increments. Summary of before: * ^exp(( P1 - P0 ) / N) = q01 Is more precise than ^exp(P0) and ^exp(P1) such that ^exp(P0) * q^N and ^exp(P1) do not line up. * using only incremental updates q will cause drift. A "directed" method would solve this, something that is targeted at a set point, but does not have a growing approximation error. This would boil down to computing: ( exp(P1) / p0 ) ^ 1/N This can be simplified to P1 purely imaginary and p0 unit norm. 
The question seems to be: how to construct a series for this? Using autodiff[1] it's probably possible to automate it. But is a straight series good enough? It might have non-elementary operations. Let's try to massage it a bit first. exp (P1 / N) = \sum_n (iP1/N)^n / n! 1 / p0^1/N = 1 / (1 - (1 - p0^1/N)) = \sum_n (1 - p0^1/N)^n The resulting series is a convolution of these two. However, the term p0^1/N is not elementary. This seems like a detour.. Let's try a different way, by writing the error explicitly. It is known that p0 approximates exp(P0) or p0 = e + exp(P0). The e is not known, but could be computed. Can this be used? The error is completely contained in the difference between p0 and exp(P0). This brings us back to one of the previous attempts: compensate the error accumulated up to a point by spreading out the compensation over the next block, i.e. multiply the update coefficient by: (exp(P0) / p0) ^ 1/N This has the same form as the original problem. It doesn't look like it is going to be of much use.. Let's ignore the 1/N as it is probably possible to introduce it later, and go for the expansion of log() : q(P1,p0) = exp(P1 - log (p0)) = exp(P1 - log (1 - (1 - p0))) = sum_n,0 1/n! (P1 + sum_m,1 1/m (1 - p0) ^ m) ^ n I'm thinking that it might be time to start implementing convolution of sequences, and work numerically. EDIT: The result of the log is imaginary, so all real parts could be eliminated. EDIT: It doesn't look like this will work since the inner series has a constant term. Conclusion: bad intuition. The connection to "small values" is lost because P1 and p0 appear naked in the expressions. [1] http://zwizwa.be/darcs/meta/rai/ai-autodiff.rkt Entry: Composition of series expansion Date: Wed Mar 13 11:52:32 EDT 2013 Is it actually possible to write this in a finite way? E.g. exp(exp(x)) = sum_n 1/n! (sum_m 1/m! x^m) ^n what is the coefficient of 1, x, x^2, ... ? Hmm... I'm missing an important point here. The trouble is that this is doubly infinite, so when doing this numerically, it is necessary to add a termination criterion. Only in the first coefficient though, meaning f(0), as it will show up as an infinite sum. All other sums will be finite. Finding the contribution for x^n depends on all possible splits of the factorization of n, but is finite. Another problem: the infinite sum might not even converge! Actually, this is mentioned here[1] on composition g(f(X)): a point here is that this operation is only valid when f(X) has no constant term, so that the series for g(f(X)) converges in the topology of R[[X]]. In other words, each c_n depends on only a finite number of coefficients of f(X) and g(X). [1] http://en.wikipedia.org/wiki/Formal_power_series#Composition_of_series Entry: Direct computation of 2D series Date: Thu Mar 14 18:47:03 EDT 2013 These are all interesting problems, but not sure if it is useful to solve them. There might be a simpler way to compute the 2D Taylor series directly through differentials (autodiff). ( I'm thinking that diff is more work since there are more terms to compute -- the tensors with partial derivatives -- but maybe the same explosion happens for the function composition of sequences. A serious case of bad intuition here.. ) But... for 2 variables this isn't really such an explosion, is it? 0: f; 1: f_x f_y; 2: f_xx f_xy f_yy; 3: f_xxx f_xxy f_xyy f_yyy. Still not convinced this will work for exp((P1 - log(p0)) / N). Somehow not explicitly using the fact that P1 - log(p0) is small seems to be wrong, i.e. nothing special.
Somewhere, some cancelling is going on eventually, but in what way? Entry: Exponentials: final word Date: Fri Mar 15 10:23:26 EDT 2013 The update data dependencies are: (P0, P1, p0, N) -> (q01, p1) The approximation error is: exp(P1) - p1 = e which could also be expressed relative to p1 if it makes sense for compensation expressions: e = e' p1 Because q01 is fairly precise, it should be possible to do one of these: - ignore the drift, solve it at a higher level - hard-reset the drift once by computing exp(P1) using more terms - adding 1-st order compensation for q01 drift in q12 in terms of e. Any other approach explored in the previous entries seems to lose precision due to the inability to express that e is small. I.e. we don't get a series expansion in e, but in P1, p1, which is not helpful. So, time to stop the analysis paralysis: it's really a non-issue. Either find an approach that gives an expansion in e, or shut up and see what happens in practice. Entry: Exponentials: computing the log of |p| ~= 1 Date: Fri Mar 15 13:09:45 EDT 2013 Hmm... still something is missing. Given that P ~= log(p) and p=c+si with |p| = 1 so re(P)=0, what happens if this is plugged into the Taylor expansion? log(1 + x) = \sum_n=1 (-1)^{n+1} 1/n x^n P ~= (c-1 + s i) - 1/2 (c-1 + s i)^2 + 1/3 (c-1 + s i)^3 Taking only the imaginary coefficients: = s i - (c-1) s i + (c-1)^2 s i - 1/3 s^3 i From this it seems that computing log(1 + x) for x small is not such a problem. How to use this? It could be used to compute log (p1/p0), and use that to update the estimate of P1 = log(p1). I'm still missing a way to make all the relations concrete: - Using P1 directly avoids drift, but is imprecise - Using (P1-P0)/N has drift, but is more precise. These are two estimators. How to combine two estimators into one in an attempt to avoid drift? The thing is, there is an estimate already. To use this information, an update method should be used. Using N-R, log(P) is a solution of f(x) = 0 with: f(x) = exp(x) - P The N-R update step for this is u(x) = x - f(x)/f'(x) = x - (exp(x) - P) / exp(x) Maybe it pays to combine these, i.e. take exp(x) and x to be the estimates we have... To be continued.. Entry: Normalization: a-symmetric Date: Fri Mar 15 14:30:11 EDT 2013 See also [1]. I'm looking for a simpler, a-symmetric approach where only one of the coefficients is updated, e.g. c. Solving f(c) = 0 with f(c) = (c^2 + s^2) - 1, the N-R update is: u(c) = c - f(c) / f'(c) = c - (c^2 + s^2 - 1) / 2c = 1/2 (c + (1-s^2)/c) This is probably much more precise than the other method, but it uses a division operator. Also, it's symmetric in c and s, so if c->0 we can switch to the s form. The direct form is probably better. [1] entry://20130309-182723 Entry: Normalization: Fast inverse square root Date: Fri Mar 15 20:56:46 EDT 2013 Some nice bit twiddling tricks[1]. Made me wonder if I should use N-R for the normalization problem.. To compute 1/r where r^2 = c^2 + s^2 ~= 1, it might be good enough to just use the N-R update for f(x) = 1/x^2 - r^2 u(x) = 1/2 x * (3 - r^2 x^2) Since x will be close to one, this gives the series of estimates: x0 = 1 x1 = (3 - r^2) / 2 Here x1 is actually the same as the 1st order Taylor approx of 1 / (1 + x)^1/2 with 1 + x = r^2 from [2]. [1] http://en.wikipedia.org/wiki/Fast_inverse_square_root [2] entry://20130309-182723 Entry: Standard filters from pole positions. Date: Tue Mar 19 12:46:18 EDT 2013 How to derive the LP/HP/BP/BS filters from a 2-pole discrete state variable filter?
There seem to be a couple of degrees of freedom here that require a bit of extra information. I'm picking the "orthogonal" SVF because it seems easiest to control. Practically, I'm looking for B,C,D in s_k+1 = A s_k + B i_k o_k = C s_k + D i_k where A = [ p_x p_y ; -p_y p_x ], with (p_x, p_y) the pole location. With i,o 1-dimensional, this boils down to 2+2+1 parameters. The transfer function is: o = (C (zI - A)^-1 B + D) i Maybe it's best to start from the relation HP = 1 - LP, and focus on LP, then find out how to derive BP and BS. The latter 2 are not as important as the LP, HP pair. For LP, we need unit gain at z=1, or C (I - A)^-1 B + D = 1 Another way to go is to look at the 2D equation as if it were a 1D complex equation. It's probably simpler to think in terms of complex numbers here. Basically, a real 2-pole is the real part of a complex 1-pole. Entry: Complex 1-poles Date: Tue Mar 19 13:24:42 EDT 2013 Something that has puzzled me for a while: * The ideas of "low pass" (LP) and "high pass" (HP) filter are not meaningful for a discrete, complex filter. The ideas of LP, HP, BP, BS stem from real, continuous filters. They refer to the behavior of the transfer function at zero, infinity and a reference frequency f.

     |  0   f   inf
-----+--------------
  LP |  1   ?   0
  HP |  0   ?   1
  BP |  0   1   0
  BS |  1   0   1

This has to do with the symmetry of the transfer functions. For continuous filters, transfer functions are defined on the real line, which has 2 special points apart from a current reference point: zero and infinity. For a complex discrete filter, no special points exist, other than "not the point of interest". This is due to the symmetry of the transfer functions, which is apparent from them being defined on the complex unit circle (z-transform). Looking at the symmetry, there are only two "kinds" of discrete filters: "band pass" (BP) and "band stop" (BS).

      f=f0   f!=f0
 -----------------
  BS    0     >0
  BP    1     <1

* The ideas of LP/HP can be introduced into the discrete world by - Sticking to real signals. This creates 2 special points: DC, NY - Defining a mapping from continuous to discrete. Sticking to real signals creates two special points where the real axis intersects the unit circle: DC (the direct current) and NY (the Nyquist frequency). By themselves, DC and NY are 2 points that are not essentially different. However, they can be distinguished by relating discrete and continuous transfer functions. Impulse invariance: - DC <- 0 (but also 2NY, 4NY, ...) - NY not in any way special; infinity is not mapped. Bilinear transform: - DC <- 0 - NY <- infinity Once this mapping is in place, the ideas of HP and LP can be made to carry over. So it is important when talking about LP and HP filters in a discrete sense to also take into account how these ideas are defined, i.e. through impulse invariance or the bilinear transform. Using the impulse invariance transform, which is useful for musical applications, it might be useful to still define properties in terms of behaviour at infinity, even if that doesn't make sense in actuality, as filter transfer functions will alias. Entry: Impulse-invariant high-pass filter: what does it mean? Date: Tue Mar 19 14:04:48 EDT 2013 An interesting conceptual problem: it is possible to define the behavior of a low pass filter based on the impulse-invariant transform (pole mapping). The same doesn't work for a high-pass filter, since the high pass band will alias down into the stopband, killing the high-pass property. So, how should it be implemented? Maybe it only makes sense using a bilinear transform?
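To make the difference between the two mappings concrete, here is a small C sketch (the pole parameters are made-up example values, not taken from any of the filters above). It maps one continuous resonant pole to the z plane with the exponential (impulse-invariant / matched-Z) map and with the bilinear transform; only the bilinear map also sends s = infinity to z = -1, which is why the high-pass behaviour survives there and aliases away in the exponential map.

#include <stdio.h>
#include <complex.h>
#include <math.h>

int main(void) {
    double w = 0.3, zeta = 0.1;                                   /* example omega (rad/sample), damping */
    double complex s = w * (-zeta + I * sqrt(1.0 - zeta * zeta)); /* continuous-time pole */

    double complex z_exp = cexp(s);                /* impulse invariance / matched-Z: z = exp(s) */
    double complex z_bil = (2.0 + s) / (2.0 - s);  /* bilinear transform, T = 1 */

    printf("exp map:      |z| = %f  arg(z) = %f\n", cabs(z_exp), carg(z_exp));
    printf("bilinear map: |z| = %f  arg(z) = %f\n", cabs(z_bil), carg(z_bil));
    return 0;
}
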
Entry: CT proto 2nd order filters. Date: Tue Mar 19 15:17:14 EDT 2013 Thinking about filters, re-building intuition.. These are all 2nd order filters with p a complex pole and p' its complex conjugate. * Band-pass filter (BPF). Gain is zero at DC and infinity. To have zero DC gain a zero at 0 is necessary. To have zero gain at infinity this should be the only zero. s / [(s - p)(s - p')] * High pass filter (HPF). Derivative of BPF. Limits to 1 when s->inf. s^2 / [(s - p)(s - p')] High pass filters are closed under series composition (product). * Low pass filter (LPF). Integral of BPF. Limits to 1 when s->0. pp' / [(s - p)(s - p')] = 1 / [(1 - s/p)(1 - s/p')] An LPF is also an all-pole filter, meaning it has no (finite) zeros. Low pass filters are closed under series composition (product). * All pass filter (APF). Gain=1, only phase shift. (s + p)(s + p') / [(s - p)(s - p')] All pass filters are closed under series composition (product). * Sum of one-pole filters (S1F). Partial fraction expansion has no zeros. Not sure if this is really so useful, but it does tend to show up from time to time, as they are the only ones closed under parallel composition (sum). (s - Re(p)) / [(s - p)(s - p')] = (1/2) / (s - p) + (1/2) / (s - p') Entry: Partial fraction forms Date: Tue Mar 19 16:49:07 EDT 2013 http://www.wolframalpha.com/widgets/view.jsp?id=ec4a062bb304f88c2ba0b631d7acabbc LP(s) = pp' / [(s - p)(s - p')] = a / (s - p) + b / (s - p') => (a + b)s = 0 => b = -a => -ap' - bp = pp' => a (p - p') = pp', or a = pp' / (p - p') = |p|^2 / (2j Im(p)) The wave form in the response is thus 90 degrees shifted from a normal exponential, of which the real part is a cosine and the imaginary part a sine. I.e. the response is a sine wave, gradually rising to follow the pulse. It doesn't jump immediately, as would a high-pass filter. HP(s) = s^2 / [(s - p)(s - p')] = 1 + p^2 / [2j Im(p) (s-p)] - p'^2 / [2j Im(p) (s-p')] This has a direct path + a phase-shifted sinusoid. Entry: CT -> DT Date: Tue Mar 19 17:03:47 EDT 2013 Here[1] it is mentioned that impulse invariance works for LP and matched-Z[2] works for BP,BS,HP. Bilinear works for anything, as it is 1->1. [1] http://dsp.stackexchange.com/questions/109/are-there-alternatives-to-the-bilinear-transform [2] http://en.wikipedia.org/wiki/Matched_Z-transform_method Entry: Bilinear vs. impulse invariance Date: Tue Mar 19 17:16:35 EDT 2013 What would be the best approach for time-variant filters? Some things that are important are: the ease of mapping frequency and Q parameters to discrete, and the interpolation between different set points. Entry: What does an LP/HP/BP response look like? Date: Tue Mar 19 17:37:00 EDT 2013 Is there any kind of visual quality that can distinguish impulse responses of LP/HP/BP? They are all sums of damped exponentials, so where is the difference? Qualitatively, one of the key points seems to be how phase and damping interact. - The DC response is determined by the average, so when most of the response is above zero, the response will be low-pass. - If the response doesn't jump from 0 immediately, this shows inertia, a property of low-pass behavior. Probably step responses are a lot easier to interpret since they are a bit more intuitive to grasp. Entry: Cheaper filter? Date: Tue Mar 19 22:43:19 EDT 2013 ( EDITED 2013-5-7 - Original was wrong. It completely missed the forward/backward thing and used a forward difference transformation.
) According to [1], the direct digital approximation of the SVF is stable for low frequencies, even at q=0 (Q=inf). In Stilson's PhD [3] this is called the Chamberlin filter. Note that this is _not_ a trivial transformation of the CT equations. It has one forward and one backward difference to replace the 2 integrators. See also [2]. In this notation, the update equations are _sequential_, where the update for s2 is used in the update for s1. This in contrast to the usual formulation as simultaneous updates, i.e. state space form. s2' = s2 + f s1 : LP s1' = s1 - f (q s1 + s2' + i) : HP Interpreting these equations as simultaneous, i.e. using s2 instead of s2' in the second equation, gives a forward difference approximation and makes the filter unstable. The parameters f and q are related to their continuous time counterparts as f = 2 sin ( pi F_c / F_s ) q = 1 / Q Even in the presence of oversampling, this approach is a lot simpler than the direct pole computation and interpolation approach from the last couple of posts. [1] http://www.earlevel.com/main/2003/03/02/the-digital-state-variable-filter/ [2] entry://20130421-101744 [3] entry://20130401-012657 [4] entry://20130408-005953 Entry: Newton-Raphson method Date: Wed Mar 20 13:36:49 EDT 2013 Type: tex Once, to never forget, the derivation. Given a point $x_i$ called the \emph{initial estimate}, and a function $f(x)$, we can make an approximation $x_u$ for $x_0$ s.t. $f(x_0)=0$ by constructing the $1$-st order approximation to $f$ at $x_i$. The equation of this line, translated to $t = x - x_i$, is $$l(t) = f(x_i) + t f'(x_i).$$ Solving for $t_u$ s.t. $l(t_u) = 0$ gives $t_u = - f(x_i)/f'(x_i)$, or $$x_u = u(x_i) = x_i - {f(x_i) \over f'(x_i)}.$$ The function $u$ can now be iteratively applied to refine the estimate. Entry: Applying the N-R method for function evaluation Date: Wed Mar 20 13:50:58 EDT 2013 Type: tex To use N-R for function evaluation of $f(x)$ requires an extra step, which is to construct a function $F$ such that $F(f(x)) = 0$. Is there a systematic way to do this? As an example, take $f(x) = 1/\sqrt{x}$. Construct a function $F(q)$ such that $F(1/\sqrt{x}) = 0$. The straightforward approach would give something like $$F_1(q) = q - 1/\sqrt{x}.$$ Computing the N-R update from $F_1$ is rather useless $$u_1(q) = q - \frac{q - {1/\sqrt{x}}}{1} = 1/\sqrt{x}.$$ The trick seems to be to take this trivially obtained but useless $F_1$ and change it in a way that turns the dependency on $x$ into elementary functions, without removing the zero of $F_1$ we are interested in. The following two steps seem appropriate: remove the square root $$F_2(q) = (q - 1/\sqrt{x})(q +1/\sqrt{x}) = q^2 - 1/x$$ and remove the division operation $$F_3(q) = xq^2 - 1$$ Which of the two is best? There is no way to decide until computing the N-R update steps. It turns out that both lead to the same function $$u_2(q) = u_3(q) = \frac{1}{2}(q + \frac{1}{xq}).$$ Now $u_2(q)$ still contains a division, which is not optimal. There are other ways. In [1] the function $F_4(q) = 1/q^2 - x$ is used, which leads to $$u_4(q) = \frac{1}{2}(3q - xq^3).$$ The heuristic to distill from this is that if there is a division involved, it might be best to express $F(q)$ in terms of $1/q$. The derivative will have a decreasing negative power. The negative power is then cancelled by $F'(q)$ appearing in a denominator in the update formula. That trick should then also work for evaluating $f(x) = 1/x$. Indeed, with $F_5(q) = 1/q - x$ we get $u_5(q) = q(2 - xq)$.
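As a quick numerical check of the two updates derived above, a small C sketch: $u_4$ should converge to $1/\sqrt{x}$ and $u_5$ to $1/x$, both started from $q=1$. The value of $x$ is an arbitrary example close to $1$, as in the normalization use case.

\begin{verbatim}
#include <stdio.h>
#include <math.h>

static double u4(double q, double x) { return 0.5 * q * (3.0 - x * q * q); } /* -> 1/sqrt(x) */
static double u5(double q, double x) { return q * (2.0 - x * q); }           /* -> 1/x       */

int main(void) {
    double x = 1.05;            /* example input, |x - 1| small */
    double q4 = 1.0, q5 = 1.0;  /* initial estimates            */
    for (int i = 0; i < 4; i++) {
        q4 = u4(q4, x);
        q5 = u5(q5, x);
        printf("iter %d: q4 = %.12f (exact %.12f)  q5 = %.12f (exact %.12f)\n",
               i + 1, q4, 1.0 / sqrt(x), q5, 1.0 / x);
    }
    return 0;
}
\end{verbatim}
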
%[1] http://en.wikipedia.org/wiki/Fast_inverse_square_root Entry: Saturation update Date: Wed Mar 20 14:43:54 EDT 2013 Type: tex A function that's interesting for computing smooth saturation is $$f(x) = \frac{x}{\sqrt{1 + x^2}}.$$ Using the $1/q$ trick from the previous post, a N-R method can be used to evaluate this from $$F(q) = \frac{1}{q^2} - \frac{1+x^2}{x^2}$$ yielding $$u(q) = \frac{1}{2}q(3 - \frac{q^2 (1+x^2)}{x^2}).$$ From this, a disadvantage seems to be that the $1/q$ method doesn't work for $x \to 0$. Hmm\ldots No free lunch. Entry: The Harmonic Oscillator and the Laplace Transform Date: Wed Mar 20 20:02:59 EDT 2013 Type: tex The state space form of the harmonic oscillator consists of the following system of differential equations $$ \begin{array}{rcl} \dot{q_1}(t) &=& - \omega q_2(t) \\ \dot{q_2}(t) &=& \omega q_1(t) \\ \end{array} $$ This can be solved by Laplace-transforming the differential equations to a set of equations $$ \begin{array}{rcl} s Q_1(s) - q_1(0) &=& - \omega Q_2(s) \\ s Q_2(s) - q_2(0) &=& \omega Q_1(s) \\ \end{array} $$ Solving this for $Q_1(s), Q_2(s)$ gives a solution in the Laplace domain, which can be converted back to the time domain. The solutions look like $$Q_i(s) = \frac{n_i(s)}{s^2 + \omega^2}$$ where the numerators $n_i(s)$ are first order polynomials in terms of the initial conditions $q_i(0)$. The interesting part is the denominator $s^2 + \omega^2 = (s-j\omega)(s+j\omega)$ which determines the \emph{shape} of the result. Splitting the $Q_i(s)$ in partial fractions then gives the sum of first order rational functions proportional to $(s-j\omega)^{-1}$ and $(s+j\omega)^{-1}$, which are Laplace transforms of $e^{j\omega t}$ and $e^{-j\omega t}$ respectively. The denominator can also be obtained as $\det(Is-A)$ where $A$ is the system matrix $$A = \matrix{0 & -\omega \\ \omega & 0}.$$ Entry: The State Variable Filter Date: Wed Mar 20 21:58:08 EDT 2013 Type: tex The state space form of the SVF [1][2][3] consists of the following system of differential equations $$ \begin{array}{rcl} \dot{s_1}(t) &=& - \omega [ 2 \zeta s_1(t) + s_2(t) + i(t) ] \\ \dot{s_2}(t) &=& \omega s_1(t) \\ \end{array} $$ corresponding to a system matrix $$A = \matrix{ - 2 \zeta \omega & -\omega \\ \omega & 0}.$$ Here $2\zeta=1/Q$ with $Q$ the usual pole--based definition of \emph{quality factor}, and $\omega = 2\pi f$ the angular frequency. Personally, I prefer to use the parameter $\zeta$ over $Q$ as it makes the expression of the poles a bit easier to read. The pole equation $$p(s) = \det(A-Is) = s^2 + 2 \zeta \omega s + \omega^2 = 0$$ has two solutions $$s_\pm = \omega (- \zeta \pm \sqrt{\zeta^2 - 1}).$$ If $\zeta < 1$ the poles are complex conjugate with complex angle only dependent on $\zeta$, i.e. unit norm phase component $e^{i\theta} = \zeta+i\sqrt{1-\zeta^2}$ where $\cos\theta = \zeta$. % [1] http://www.earlevel.com/main/2003/03/02/the-digital-state-variable-filter/ % [2] entry://20130319-224319 % [3] http://en.wikipedia.org/wiki/State_variable_filter Entry: SVF discretization + compensation Date: Wed Mar 20 19:18:04 EDT 2013 Type: tex EDIT 2013-5-7: I'm leaving this here for reference, but make sure to note that using the forward difference transform on the CT SVF filter is not a useful approach. See [8]. The Chamberlin filter used a forward and a backward difference, which has different properties than what I've been using in the next couple of SVF posts, which use only forward differences.
The CT system described in [3] can be approximated naively [1][2] in the discrete domain using the substitution $s=z-1$, which corresponds to the forward difference transform\footnote{This is not the whole story. Other discretizations are possible. See [7].}. This leads to the update equation $$ \begin{array}{rcl} s_1[k+1] &=& s_1[k] - \omega (2 \zeta s_1[k] + s_2[k] + i[k]) \\ s_2[k+1] &=& s_2[k] + \omega s_1[k] \\ \end{array} $$ corresponding to a discrete feedback matrix $$A = \matrix{ 1 - 2 \zeta \omega& -\omega \\ \omega & 1}.$$ The poles determined by $\det(A-Iz)=0$ can be derived directly from the poles of the continuous system through the substitution $z=s+1$ or $$z_\pm = 1 + s_\pm = 1 + \omega (- \zeta \pm \sqrt{\zeta^2 - 1})$$ This is no longer a constant--$Q$ filter. For any $\zeta$, when $\omega$ gets large, eventually the poles will cross the unit circle and cause instabilities. The question is then, how to compensate this? Several options are possible. Running the filter multiple times as suggested in [1] will reduce approximation errors. Is it possible to use a squaring approach as opposed to mere linear oversampling, such as used in the scale reduction technique in [5]? Yes, but due to damping it requires a full $2 \times 2$ matrix multiplication as opposed to simplified complex phasor multiplication. Instead of $A(\omega,\zeta)$ from above, compute the system matrix $$A(\omega',\zeta')^{2^n}$$ using successive squaring, where $\omega' = \omega / 2^n$ and $\zeta'$ is adjusted accordingly. There might be some interesting ways to reduce operations by preserving the symmetry of the intermediate squaring results. Another approach is to construct polynomial approximations that compensate $\omega$ and $\zeta$ to lessen the nonlinearities and, maybe more importantly, keep the filter stable if $\omega$ is varied, i.e. approximate the constant--$Q$ behaviour of the original analog filter. The question is how to express this property in the discrete domain, as it only makes sense in reference to the poles of the original system, where it refers to the property that the pole angle does not depend on $f$. We can use the impulse invariant transform[4] to map the poles $z_\pm$ of the discrete system to the continuous system poles $s_i$ through the relation $z_i = \exp(s_i)$. The property we're interested in can then be expressed as $$\frac{d}{df} \arg(\log(z_\pm(f,q(f)))) = 0,$$ where $\arg(.)$ computes the angle of a complex number. This is a first order nonlinear differential equation in $q(f)$. It is possible to construct different equations that express the same constraint, such as $$\frac{d}{df}\frac{\arg(z_\pm)}{\log(|z_\pm|)} = 0.$$ What would be a good way to approximate this? To solve the equation numerically and use it to fit an approximation, or to fill in a parameterized curve and turn it into an error minimization problem. Anyway, it's pretty certain that anything that is workable will be a polynomial in $f$. It's probably also a good guess to set the linear coefficient to zero, since there is no compensation necessary near $z=1$. Let's start with computing the magnitude of the pole. It seems there is enough to gain by compensating that first.
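As a starting point, a small sketch of that magnitude. For $\zeta < 1$ the forward-difference poles are $z_\pm = 1 - \zeta\omega \pm j\omega\sqrt{1-\zeta^2}$, so $|z_\pm|^2 = 1 - 2\zeta\omega + \omega^2$ and the filter leaves the unit circle once $\omega > 2\zeta$. The parameter values in the C sketch below are arbitrary examples.

\begin{verbatim}
#include <stdio.h>
#include <math.h>

int main(void) {
    double zeta = 0.05;                          /* example damping, Q = 10 */
    for (double w = 0.02; w < 0.205; w += 0.02) {
        double mag = sqrt(1.0 - 2.0 * zeta * w + w * w);
        printf("w = %.2f  |z| = %.5f  %s\n", w, mag,
               mag < 1.0 ? "stable" : "UNSTABLE");
    }
    return 0;
}
\end{verbatim}
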
(TODO) % [1] http://www.earlevel.com/main/2003/03/02/the-digital-state-variable-filter/ % [2] entry://20130319-224319 % [3] entry://20130320-215808 % [4] http://en.wikipedia.org/wiki/Impulse_invariance % [5] entry://20130326-173542 % [6] entry://20130325-094100 % [7] https://ccrma.stanford.edu/~stilti/papers/Welcome.html % [8] entry://20130421-101744 Entry: Polynomial bumps and saturation functions Date: Thu Mar 21 16:00:05 EDT 2013 Type: tex A saturation function $f(x)$ defined on an interval $I=[-1,1]$ should have the properties $f'(x) \geq 0$ (monotonous), $f(x) = -f(-x)$ or $f'(x) = f'(-x)$ (symmetric) and $f'(0) = 1$ (small signal identical). Such a function reaches its maximum $+m$ and minimum $-m$ at the interval end points $+1$ and $-1$. Let's call this $m$ the \emph{squash}, the maximum gain reduction possible under the constraints above. Let's call the derivative $f'(x)$ a \emph{bump}, following its shape. For the application, a function is more useful as $m$ gets smaller, or equivalently, when scaled as $F(x) = f(m x)/m$, the useful domain gets larger. The simplest polynomial that satisfies these constraints can be constructed from its derivative specification, using just two zeros $$f_1'(x) = (1-x) (1+x) = 1 - x^2$$ which leads to $$f_1(x) = x - \frac{x^3}{3}.$$ Here $m_1=2/3$. This can be generalized to higher order bumps by just adding more zeros at $\pm 1$. The family of bumps is given by $$f_n'(x) = (1 - x^2)^n.$$ At first glance, this seems like a useful family, but it is not entirely clear how to implement it efficiently. Integrating the binomial expansion of $f_n'(x)$ is straightforward, but not very elegant. A second way to generate bumps is to make the derivative maximally flat at $0$, using multiple zeros at $0$ for the second derivative, which makes the saturation function maximally straight at $0$. This constraint is satisfied by the functions $$g'_n(x) = 1 - x^{2^n}$$ which have simple zeros at $\pm 1$ besides the factors with complex zeros $1+x^2, \ldots, 1+x^{2^{n-1}}$. While the integral $g_n$ has a simple closed form, it is not useful, as the squash actually increases as the order increases. A third way is by simply squashing multiple times using a simple squash function. It is simpler to express by starting with a scaled $f_1$ such that $-1,0,+1$ are fixed points. A sequence of functions can then be constructed as $$h_1(x) = \frac{1}{2}(3x-x^3) \text{ and } h_{n+1}(x) = h_1(h_n(x)).$$ Due to the scaling, the gain at $0$ increases with $n$ as $(3/2)^n$. To obtain a gain $1$ function, this scaling factor can just be divided out again at the end of the iteration chain. After some doodling and plotting, it looks like the $h_i$ are very similar to the $f_i$, as $h_i$ gives rise to polynomials that are also increasingly flat at $\pm 1$. This can be seen as follows. $h_1$ around the fixed point $1$ looks like $1-(x-1)^2$ up to a constant factor, which is a double zero after translation to $(1,1)$. Iterating this turns the double zero into a quadruple zero: at each iteration, the number of zeros at the fixed point doubles. However, $h_n$ and $f_n$ are not the same sequence. The order of $h_n$ is $3^n$, so that of $h'_n$ is $3^n-1$, which grows in a different way than the order of $f'_n$ which is $4^n$. Outside of our range of interest we have $h_1(2) = -1$, so some flattening will occur at $x=\pm2$, as its surrounding is mapped to a stable fixed point.
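A minimal C sketch of this third construction, with the accumulated gain divided out at the end as described so that the slope at $0$ is $1$. The iteration count $n=3$ is an arbitrary example.

\begin{verbatim}
#include <stdio.h>
#include <math.h>

static double h1(double x) { return 0.5 * (3.0 * x - x * x * x); }

/* n-fold iteration of h1 on [-1,1], output scaled by (2/3)^n so f'(0) = 1 */
static double squash(double x, int n) {
    double y = x;
    for (int i = 0; i < n; i++) y = h1(y);
    return y * pow(2.0 / 3.0, n);
}

int main(void) {
    int n = 3;                                   /* example iteration count */
    for (double x = -1.0; x <= 1.001; x += 0.25)
        printf("x = %+.2f  f(x) = %+.5f\n", x, squash(x, n));
    return 0;
}
\end{verbatim}
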
To avoid these other fixed points, with the aim of getting more useful zeros, we could start the iteration scheme with $k_2(x) = c f_2(x)$, which is monotonous as $f'_2(x) = (1-x^2)^2\geq 0$. After integrating, $$f_2(x) = x - \frac{2}{3} x^3 + \frac{1}{5}x^5.$$ This gives $c = f_2(1)^{-1} = 15/8$. The flatness order of the fixed points at $\pm 1$ is $3^n$, while the polynomial order grows as $5^n$. A fourth way is using a truncated Taylor expansion of an elementary function. Often, when a saturating analytic function is necessary, $\arctan(x)$ is used, which corresponds to the integral of the bump $(1+x^2)^{-1}$. This doesn't seem to be so useful here as every term needs to be evaluated, i.e. there is no fast $O(\log(n))$ evaluation, and the function squashes the real line instead of a compact interval. Entry: Constant-Q pole curves Date: Mon Mar 25 09:41:00 EDT 2013 Type: tex The constant-$Q$ constraint of a $2$nd order CT filter is equivalent to a constant phase constraint on the signal poles. The parameter $Q$ is usually defined from a $2$nd order transfer function, which has a pole polynomial $$s^2 + \frac{\omega}{Q}s + \omega^2 = (s-s_+) (s-s_-).$$ For $Q>\frac{1}{2}$, and using the more convenient parameter $\zeta = \frac{1}{2Q}$ with $\zeta < 1$, this corresponds to the poles $$s_\pm = \omega \left[ \frac{-1}{2Q} \pm j\sqrt{1 - \frac{1}{4Q^2}} \right] = \omega (-\zeta \pm j \sqrt{1-\zeta^2}) = \omega \phi_\pm.$$ These have $\| \phi_\pm \| = 1$, so we can write $\phi_\pm = \exp(\pm j\theta)$, where $$\theta = \arctan \sqrt{4Q^2-1}.$$ Mapping the idea of constant-$Q$ to the discrete domain can be done using the impulse-invariant transform $z_k=\exp(s_k T)$, where we set $T=1$ for simplicity. Given $\phi_\pm$, curves of constant $Q$ are parameterized by $\omega(t)$, which corresponds to the discrete poles $$z_\pm = \exp[\omega(t)\phi_\pm].$$ Let's drop the $\pm$ subscript, since it is clear we're dealing with complex conjugate poles. For practical purposes in a digital synth, where control signals usually have a lower bandwidth, it is useful to approximate pole curves in a piecewise linear fashion, i.e. to approximate $\omega(t)$ as linear curves over discrete points or $\omega[k] = \omega_0 + \omega_1 k$, which creates the pole curve $$z[k] = \exp [\phi(\omega_0 + \omega_1 k)] = a_0 a_1^k$$ with $a_0 = \exp(\phi \omega_0)$ and $a_1 = \exp(\phi \omega_1)$. The good news here is that a constant-$Q$ path in the discrete domain corresponds to a pole path that can be updated in an efficient way, using successive multiplication by $a_1$. What remains is to find a filter topology with explicit mention of the signal poles. Entry: Constant-Q filter topologies Date: Mon Mar 25 10:54:52 EDT 2013 The question remains, which is more convenient? - A naive SVF + correction for non-linearity - a "correct" pole-position using constant-Q linear interpolation The answer is probably "both" as it depends on what the purpose is, i.e. in what control structure they are embedded. EDIT: 2013-5-7. It's probably best to go for the SVF. Entry: Construct a low-pass filter from a discrete pole pair. Date: Mon Mar 25 11:01:58 EDT 2013 Type: tex Given a set of complex conjugate signal poles $z_\pm$, how to construct a low pass filter from it? Two elements are important: how to derive a low--pass from a real $1$-pole filter and how to construct a real $2$-pole filter from a complex $1$-pole.
A real low-pass filter with input $i_k$ and state/output $x_k$, built from a single real pole $z_0$, has the update equation $$ x_{k+1} = z_0 x_k + (1 - z_0) i_k.$$ This can be interpreted as the filter gradually \emph{mixing in} the input signal $i_k$ by \emph{forgetting} part of the state $x_k$ in the update to the next state $x_{k+1}$. When $i_k$ is constant, $x_k$ will start to approach it in the limit $k \to \infty.$ The $z$--transform of the update equation is $zX = z_0 X + (1 - z_0) I$ or $$\frac{X}{I} = \frac{1-z_0}{z-z_0},$$ which clearly shows that the response is $1$ for $z=1$. This can be directly generalized to a filter with a single complex pole, by replacing $z_0$ with one of the complex poles $z_\pm$, e.g. $z_+ = c + j s$. For $i_k$ constant, $x_k$ will eventually approach it. The update equation $x_{k+1} + j y_{k+1} = z_+ (x_k + j y_k) + (1 - z_+) i_k$ written in matrix form for the real and imaginary components is $$ \begin{array}{rcl} \matrix{x_{k+1} \\ y_{k+1}} = \matrix{c & -s \\ s & c} \matrix{x_k \\ y_k} + \matrix{1-c \\ -s} i_k \end{array}.$$ Note that calling this a low--pass filter is stretching the definition a bit. The only thing we constrain is the response at DC, which should be $1$. The next step is to look only at the real part of the state/output sequence $x_k + j y_k$. This does not change the response for DC. The effect however is to sum the output with that of a second filter based on the complex conjugate signal pole $z_-$. A real filter with response $1$ at DC is by definition a low-pass filter. It is as if the matrix equation can be interpreted in two ways: as a complex one--pole with real input and gain 1 at DC, or as a real two--pole low--pass when considering only $x_k$. Note that for most poles of interest, $c$ will be close to $1$, so it might be numerically more interesting to parameterize the implementation in terms of $1-c$. Entry: Exponential curve interpolation Date: Mon Mar 25 13:08:31 EDT 2013 To come back to the exponential curve interpolation, the question is really how much difference there is in computing a_0, a_1 and a_N = a_0*a_1^N from [1]. I.e. it doesn't matter that a_0 and a_N computed from direct exponentials are somewhat off, what is important is that the direct computation of the approximation of a_N is fairly close to a_0*a_1^N, where a_0 and a_1 are also approximations. [1] entry://20130325-094100 Entry: Broad-range, weak approximation to 2^x Date: Tue Mar 26 09:34:01 EDT 2013 Type: tex I need an exponential mapping $2^x$ that is accurate on the large scale but can have some local deviations as long as they are relatively smooth. A candidate is to make it exact for integer powers and then require continuity of the derivative. \section{Piecewise} Computing a weak approximation to $2^x$ can be split in two parts: floating point modulo, which splits $x = n + \epsilon$ with $n$ an integer and $\epsilon \in [0,1]$, and computing a polynomial approximation for the range of $\epsilon$. Let's focus on the latter first. A good guess is probably to make sure that the value and first derivative of the polynomial $f(x)$ agree with those of $2^x$ (which has derivative $\log(2) 2^x$) at $0$ and $1$. It's probably possible to drop one of those constraints and just require that the derivatives match at the stitch points, or $2 f'(0) = f'(1)$. This gives respectively $4$ and $3$ constraints for a $3$rd and $2$nd order polynomial $f_3$ and $f_2$.
Solving the constraints for the latter gives $$f_2(x) = 1 + \frac{2}{3}x + \frac{1}{3}x^2.$$ It has $f'(0) = \frac{2}{3}.$ From the plot this seems to be accurate enough for human interfacing, i.e. log pots. The straightforward way to solve the floating point modulo problem is to use a $\tt{float}$ to $\tt{int}$ conversion. Once $n$ is obtained, $2^n$ can be computed in a loop in $O(\log(n))$ time. \section{Limited Range} The loop in the previous approach can be problematic, e.g. for SIMD evaluation. A different way to construct an approximation is to keep in mind that function iteration can sometimes help to construct higher order polynomials efficiently. Based on the relation $$2^x = (2^{\frac{x}{2^n}})^{2^n},$$ we can take the inner exponentiation to be the approximation $f_2(x)$ from above and use an iteration function $u(x)=x^2$ to get the estimate $$f_n(x) = u^n[f_2(\frac{x}{2^n})],$$ which works for $x \in [0,2^n]$. For the application, $5$ orders of magnitude will suffice, which roughly corresponds to the \emph{audible} frequency range $20$Hz-$20$kHz together with the \emph{haptic} range $0.2$Hz-$20$Hz. It's a nice coincidence that in this range we find $2^{2^4} = 65536$, approximately $10^{4.8}$. The ratio $f_2(x/2^4)^{2^4} / 2^x$ over the range $[0,2^4]$ ranges between $0.97$ and $1.05$, which is pretty good. Entry: Successive squaring for complex exponentials Date: Tue Mar 26 17:35:42 EDT 2013 Type: tex \section{Buildup} The previous post[1] makes one wonder if the same trick works for complex exponentials. Also hinted at in [2]. Note that the idea doesn't change if we change the equation to $$e^x = (e^{\frac{x}{2^n}})^{2^n}.$$ What we're interested in is a pure imaginary exponent $e^{j\omega}$, since the real exponent is already solved in [1], so the unit--norm constraint could be used to perform some more approximations. For the application, the range of interest is $\omega \in [0,\pi]$. Computing the iteration is straightforward. The problem is in the initial estimate. Over the whole range, we'd like to have accuracy concentrated in the lower frequency ranges. This can be established by making the approximation error maximally flat at $0$, corresponding to a truncated Taylor series. Also we'd like to err to one side, keeping the magnitude $| f(x) | \leq 1$. For example, with $4$ iterations as before, the range for the inner approximation would be $[0,\frac{\pi}{2^4}]$, which gives the first catch: the reduction in argument range is not double exponential as it is for the real exponential function. I.e. multiplication of complex numbers gives an \emph{additive} phase effect, as opposed to a multiplicative magnitude effect. But\ldots maybe that is not a problem? Just a factor $16$ might be enough to bring most of the useful frequency range into a precise enough region. Convergence is very fast nearing $0$. \section{Algo} After some plotting, I'm getting very good results for $[-\pi,\pi]$ with the second order approximations $$s(x)=x \text{ and } c(x)=1-\frac{x^2}{2}$$ followed by a single N-R normalization step[3], which is a multiplication of $c(x)$ and $s(x)$ by $r(x) = (3 - (c(x)^2 + s(x)^2)) / 2$, followed by two complex squarings, compensated with the appropriate $2^2=4$ times input range reduction. This is quite remarkable. The resulting polynomial is of order $24$.
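For reference, a direct C transcription of this recipe (a sketch only; the test angles in main are arbitrary, and nothing here is tuned beyond what is described above):

\begin{verbatim}
#include <stdio.h>
#include <math.h>

/* 2nd order seed, one N-R normalization, two complex squarings,
   with the matching 1/4 range reduction; approximates (cos t, sin t). */
static void approx_expj(double t, double *co, double *si) {
    double x = t * 0.25;                        /* range reduction for 2 squarings */
    double c = 1.0 - 0.5 * x * x;               /* 2nd order cos seed */
    double s = x;                               /* 2nd order sin seed */
    double r = 0.5 * (3.0 - (c * c + s * s));   /* single N-R normalization step */
    c *= r; s *= r;
    for (int i = 0; i < 2; i++) {               /* two complex squarings: angle x -> 4x */
        double c2 = c * c - s * s;
        double s2 = 2.0 * c * s;
        c = c2; s = s2;
    }
    *co = c; *si = s;
}

int main(void) {
    for (double t = 0.0; t <= M_PI + 1e-9; t += M_PI / 4.0) {
        double c, s;
        approx_expj(t, &c, &s);
        printf("t = %.4f  approx (%+.5f, %+.5f)  exact (%+.5f, %+.5f)\n",
               t, c, s, cos(t), sin(t));
    }
    return 0;
}
\end{verbatim}
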
% [1] entry://20130326-093401 % [2] entry://20130309-182354 % [3] entry://20130315-205646 Entry: Imaginary logarithm Date: Tue Mar 26 18:17:11 EDT 2013 Type: tex Using halving, followed by re-normalization, it is possible to construct a $2^n$th root of a unit norm complex number, which when taken far enough allows the computation of a logarithm. Using a single N-R step normalization, this seems to work fairly well for $\theta < \pi/4$. Entry: Exponentials: drift compensation. Date: Wed Mar 27 00:36:36 EDT 2013 Type: tex What I missed last time is that the angle between $p_0 = e^{j\phi_0}$ and $p_1 = e^{j\phi_1}$ can be computed from $p_{01} = p_1 \bar{p_0}$. The division is just multiplication with the complex conjugate since the norm is $1$, and since the value is very close to $1$, the angle can just be read as the imaginary part of $p_{01}$. This cheap logarithm can be used for the computation of phasor paths, together with the successive approximation algorithm from the last post. Entry: Phasor path Date: Wed Mar 27 00:44:59 EDT 2013 Init: p_0 Desired at end of period: f_1 p_1 = ^exp(-j f_1) # approx, but make sure unit norm p_01 = p_1 * p_0' # complex conjugate, to get angle difference p_01^(1/(2^n)) # p_01 is expected to be small, so this should be fairly accurate Computing the log can be split into 2 parts: first part with normalization, second part as division. Eventually, the algorithm will converge, as long as there is no positive feedback. I.e. we should *underestimate* the magnitude of p_01^(1/(2^n)). The approximation of the complex exponential can be a simple second order approximation, together with normalization and successive squaring. If accurate frequencies are necessary, they could be computed using another feedback loop, i.e. a PLL. A second level is possible here too: since the normalization step is part of the computation, it might be added to the compensation. I.e. given an approximation of exp(-jw) that produces unit norm estimates, construct a map w'->w to compensate for the phase non-linearity. Entry: Rational squash functions Date: Wed Mar 27 09:05:46 EDT 2013 Type: tex Using the reasonable approximation of $2^x$ from [1], and making the range symmetric around $0$, e.g. by starting the iteration using $g_2(x) = f_2(x+1/2)$, it is possible to construct a squash function defined on a limited range [2] from the function $$s(x) = \frac{2^x - 1}{2^x + 1},$$ with $s'(0) = \frac{1}{2}\log(2)$. However, $s'(x)$ reaches $0$ only at $\pm \infty$, so not much can be said about the derivative using the approximation of $2^x$. I got this function from [3], noticing that the plot of $R_1$ in yellow looked odd, as a rational function cannot have a different finite limit at $+\infty$ and $-\infty$. Then I realized the $x$ axis is logarithmic, with the left side of the graph approaching $0$ from above. This corresponds to the function $s(x)$. EDIT 2013-5-7: Actually, this is the Hyperbolic Tangent function. % Another approach is to use the N-R approach of the normalization function? % [1] entry://20130326-093401 % [2] entry://20130321-160005 % [3] http://en.wikipedia.org/wiki/Chebyshev_rational_function Entry: ... Date: Thu Mar 28 00:42:35 EDT 2013 Type: tex Direct synthesis of waveform integrals followed by differentiation, as can be done for a sawtooth wave by synthesizing piecewise parabolas, seems to work well for limiting the aliasing. Some questions. Does this work for a square wave, by differentiating a triangle?
And, is it possible to modify the technique to implement anti--aliased static nonlinear saturation, which for high gain factors suffers from similar problems? I don't see how to directly synthesize the time integral of such a nonlinearity. For a non-linearity $\sigma(x)$, operating on a sequence $y_n = \sigma(x_n)$, this would boil down to generating the integral directly, or $$z_{n+1} = y_n + z_n = \sigma(x_n) + \sigma(x_{n-1}) + \ldots$$ which doesn't seem to make much sense. Entry: Static nonlinearities for discrete systems. Date: Thu Mar 28 01:04:34 EDT 2013 Type: tex The effect of a static nonlinearity $\sigma$ on a signal $x_n$ should be seen as the anti--aliased sampling of $\sigma(x(t))$ where $x(t)$ is the reconstruction of $x_n$. Typically this is done using oversampling. A naive example with a linear interpolating oversampling step, and a $\frac{1}{4}[1,2,1]$ subsampling mask is $$\frac{1}{4} [\sigma(x_{n-1}) + 2 \sigma(\frac{x_{n-1} + x_n}{2}) + \sigma(x_n)].$$ Isn't it better to design this directly as a \emph{dynamic} nonlinearity, or as a static nonlinearity that takes $2$ inputs but is a black box, such that some internal sharing might be exposed? From listening, I have a vague suspicion that this is only important for hard--corner nonlinearities, for which the integral might actually be computable in a piecewise fashion using curve intersection. Intuitively, what happens is that when the signal makes big, square--wave jumps, the end points are actually very important, as this is where phase differences create amplitude-modulated pulse-like distortions, which are quite annoying. Entry: Iteration and polynomial interpolation Date: Sun Mar 31 16:18:59 EDT 2013 Thinking about recent ad--hoc experiments with polynomial iteration. However, there is a classical poly approx method that does successive refinement, adding one approximation point at a time. This would be Newton interpolation. It's not really what I had in mind though.. Overview from "Inleiding tot de numerieke wiskunde" - Adhemar Bultheel. - Linear equation - Vandermonde matrix - Lagrange: sum of base polynomials, equal to f_n at x=x_n and 0 at x=x_i, i!=n - Newton: successive approximation, one data point per step + specialization for equidistant, ... - Hermite: also derivatives Entry: Stilson's PhD meta Date: Mon Apr 1 01:26:57 EDT 2013 I started reading Stilson's PhD thesis again a couple of days ago. An interesting book, with a good general smell. I first noticed the finished PhD here[1], a little over two years ago. Some references are the 4-multiply equalizer[3] and the Sawtooth hack[4]. An idea that seems ripe for me to explore again: the several ways of looking at different CT/DT relations. Also mentioned here[2]. The question I have now is: what is the point? Why bother using the Time-scale calculus[5]? Some other things: - Weighted least squares for polynomial-parameterized FIR filters [1] entry://20110104-045522 [2] entry://20110104-110954 [3] entry://20110105-070637 [4] entry://20110104-153108 [5] http://en.wikipedia.org/wiki/Time_scale_calculus Entry: Root locus Date: Mon Apr 1 23:53:39 EDT 2013 Just a random thought: computing root locus paths directly is difficult since it requires the factorization of a polynomial at each point. However, when a single root is known, working the other way around might be a lot easier: compute how coefficients move when we nudge the poles, then integrate this according to the desired direction of the coefficients.
This could probably be used in some kind of parameter optimization procedure! Compute a function (functional) of the pole path, then solve or minimize. [1] http://en.wikipedia.org/wiki/Inverse_functions_and_differentiation Entry: Autodiff partial differentials Date: Tue Apr 2 00:09:36 EDT 2013 Following Dan's Haskell implementation, it might be better to define the autodiff in RAI in a similar way, by allowing a list of infinitesimals instead of just one, to allow evaluation of partial derivatives. [1] http://blog.sigfpe.com/2006/09/practical-synthetic-differential.html Entry: Natural gravitation points Date: Tue Apr 2 00:29:10 EDT 2013 - autodiff, smooth infinitesimal analysis & synthetic differential geometry - geometric algebra - conservative, chaotic dynamics - difference calculus % [1] entry://20130320-191804 Entry: Backwards vs. Forward difference Date: Mon Apr 8 00:59:53 EDT 2013 Type: tex Two naive approximations for discretizing the differential equation $$ \dot{y} = x$$ corresponding to the Laplace domain equation $s y = x$, representing the integrator $y=s^{-1}x$, are the \emph{backward difference} approximation $$y[k] - y[k-1] = x[k]$$ corresponding to the Z--transform representation $(1-z^{-1})y = x$ and the \emph{forward difference} approximation $$y[k+1] - y[k] = x[k]$$ corresponding to $(z-1)y = x$. From the sign in $k\pm 1$ it is clear where the naming scheme comes from. Respectively, these discrete approximations to differential equations correspond to the backward and forward transforms $$T_b(s) = 1-z^{-1}$$ and $$T_f(s) = z-1,$$ which relate the $s$ and $z$ domains in two different ways. Following directly from the transform mappings and normalization to causal form [3], the discrete integrators corresponding to the continuous integrator $i(s) = s^{-1}$ for the backward and forward transforms are $$i_b(z) = \frac{1}{1-z^{-1}}$$ and $$i_f(z) = \frac{z^{-1}}{1-z^{-1}} = z^{-1} i_b(z).$$ The backward difference has a delay--free path so it cannot always be used in \emph{structural} transformation, i.e. the direct replacement of integrators in analog filter topologies. The forward transform has a decoupling delay. The forward transform is the basis of the calculus of finite differences [2][4]. % [1] https://ccrma.stanford.edu/~stilti/papers/Welcome.html % [2] entry://20110104-110954 % [3] entry://20130415-172439 % [4] http://en.wikipedia.org/wiki/Finite_difference#Calculus_of_finite_differences Entry: Circle filter Date: Mon Apr 15 11:10:56 EDT 2013 Type: tex From Stilson's PhD, I'm a bit surprised to find a constant bandwidth filter with circles as root loci for the frequency component\footnote{See [3].}. However it does not have straight lines for the bandwidth parameters. This seems odd, let's have a look at it. The pole equation of the forward $(z-r)^{-1}$ and backward $z(z-r)^{-1}$ leaky integrators in series, fed back through a $k$ gain, is $$(z-r)^2 + zk = 0.$$ The coefficient of $z^2$ is $1$, which means the constant term $r^2$ is the product of the roots. This product is independent of $k$, so the root locus for $r$ constant is indeed a circle. This analysis makes it clear why we have both a forward \emph{and} backward integrator. The presence of the extra delay keeps the $k$ coefficient away from the quadratic and constant terms, which determine the product of the poles and thus their magnitude in case they are complex conjugate.
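A quick numerical confirmation of that observation: the roots of $(z-r)^2 + kz = z^2 + (k-2r)z + r^2 = 0$ always have product $r^2$, so for complex conjugate roots their magnitude stays $r$ as $k$ varies. The values of $r$ and $k$ in the C sketch below are arbitrary examples.

\begin{verbatim}
#include <stdio.h>
#include <complex.h>
#include <math.h>

int main(void) {
    double r = 0.95;                              /* example pole radius */
    for (double k = 0.05; k <= 0.8; k += 0.25) {
        /* z^2 + (k - 2r) z + r^2 = 0 */
        double b = k - 2.0 * r;
        double complex d = csqrt(b * b - 4.0 * r * r + 0.0 * I);
        double complex z1 = 0.5 * (-b + d);
        double complex z2 = 0.5 * (-b - d);
        printf("k = %.2f  |z1| = %.6f  |z2| = %.6f  |z1*z2| = %.6f\n",
               k, cabs(z1), cabs(z2), cabs(z1 * z2));
    }
    return 0;
}
\end{verbatim}
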
% [1] https://ccrma.stanford.edu/~stilti/papers/Welcome.html % [2] md5://86f4ad7cfe947186e03a5d0c20ad3e65 % [3] entry://20130421-101744 Entry: Realizable transfer functions Date: Mon Apr 15 17:24:39 EDT 2013 Type: tex Instead of writing a filter transfer function as a quotient of two polynomials in $z$ like $$\frac{1}{z-z_0},$$ in most DSP literature one writes $$\frac{z^{-1}}{1-z_0z^{-1}}$$ to make it clearer what the \emph{realization} of the filter looks like in terms of implementable delays $z^{-1}$, as opposed to non--implementable predictors $z$. The form $F(z) = F_i(z^{-1}) / [1 - z^{-1} F_o(z^{-1})]$ with $F_i$, $F_o$ polynomials, is a representation of a \emph{direct form} realization of a causal IIR filter[1]. The negative exponents ultimately come from the definition of the Z--transform\footnote{ When working with discrete transfer functions, I find the use of $z$ as a forward shift a little annoying. I've seen literature where the sign of the exponent in the $z$--transform is reversed, to yield easier-to-read formulas. Basically, we're dealing with \emph{causal} filters $99 \%$ of the time, so why not take the convention that causal sequences correspond to positive powers of $z$, so we can get rid of all those $.^{-1}$? Even though I've been bugged by this for a while[2], for the sake of clarity I'll stick to the convention found in most of the DSP literature, as this seems to be quite firmly established. The point of this post is to lift my confusion about the conventional notation. } of a sequence $a_n$ as $$A(z) = \sum_{n = -\infty}^{\infty} a_n z^{-n},$$ which for a causal impulse response has only negative powers in $z$. % [1] http://en.wikipedia.org/wiki/Digital_filter#Direct_Form_II % [2] entry://20110111-155812 Entry: Bultheel - Linear Prediction Date: Mon Apr 15 20:08:20 EDT 2013 Type: tex This[1][2] is one of my old-time favorites. It has a very wide perspective on the problem of linear prediction. % [1] ftp://129.132.148.131/hg/EMIS/journals/BMMSS/Bulletin/bul941/BULTHEEL.PDF % [2] md5://dc1a061d4aaef16a7b53bf2753f10905 Entry: Ladder vs. Lattice? Date: Tue Apr 16 02:02:46 EDT 2013 EDIT: See later this week. Lattice has to do with AR modeling. What is a (normalized) ladder filter? It shows up here[1]. Just staring at some pictures, something tells me the lattice vs. ladder distinction has to do with hyperbolic vs. orthogonal rotations... EDIT: Yes, more or less. A ladder filter can be derived directly from a waveguide, while a lattice filter is derived from the Levinson-Durbin algorithm for linear prediction. Both arrive at more or less the same point, with the former being energy-normalized over the sections and the other one not. They have the same transfer function for the allpass. Moonen's course notes might have some more info[2]. From Bultheel's linear prediction article[4] it is clear that the hyperbolic form comes from moving from an energy equation for traveling waves moving in opposite directions, specified at different points in space, to a form that is parameterized in space only. Basically, these are equations that can be interpreted in two directions. The ladder is a phase shifter: given an allpass filter $h_n(z)$, adding an orthogonal ladder junction (itself without delay) parameterized by angle $\alpha$ will give a new allpass filter $$h_{n+1}(z) = z^{-1} \frac{s + h_n(z)}{1 + s h_n(z)}$$ with $s = \sin \alpha.$ I.e. given a phase shaper, it will shape it some more. Without the delay this doesn't do much: such operations form a monoid (actually a group).
However, adding the delay will add a full phase wrap on each section. ?? Now the relation between ladder and lattice is to observe that the recursion relation above can also be implemented more directly ?? [1] http://thesounddesign.com/MIO/EQ-Coefficients.pdf [2] http://homes.esat.kuleuven.be/~moonen/asp_course.html [3] https://ccrma.stanford.edu/~jos/pasp/Conventional_Ladder_Filters.html [4] ftp://129.132.148.131/hg/EMIS/journals/BMMSS/Bulletin/bul941/BULTHEEL.PDF Entry: Equalizers Date: Tue Apr 16 02:23:43 EDT 2013 Interesting read[1]. It mentions the Massie paper also referenced in Stilson's PhD, which implements an EQ section using an allpass filter that is mixed with the original signal for boost or cut. [1] http://thesounddesign.com/MIO/EQ-Coefficients.pdf Entry: Waveguide, Ladder, Lattice Date: Wed Apr 17 16:22:20 EDT 2013 Type: tex This derivation tries to make sense of discrete waveguide, ladder and lattice filters, starting from the ladder filter as an intuitive physical model. Since the terminology is not completely clear to me, I'll add ``FIR'' and ``IIR'' to disambiguate the direction of the signal flow. The waveguide is a direct model of left and right traveling waves combined with scattering junctions. We will show that a waveguide model with $z^{-\frac{1}{2}}$ delay sections has the same transfer function as a ladder filter, that a ladder filter is an allpass filter, that a ladder section can be related to a lattice section by inverting one of the input/output relations, and that the input/output relation of a feedforward lattice filter can be computed based on this relation. An important observation is to separate the concept of a filter realization (an implementation block diagram) from abstract filter transform domain equations. Essentially, an equation does not have a direction, while a filter realization is a directional update equation directly related to a dataflow network. As a notational device, transform domain equations or transfer functions in a specific form are used to hint at a specific realization. The main trick in the derivation below is to perform algebraic manipulations on equations in the $z$ domain, and then relate them to directional update equations in the discrete time $k$ domain. A \emph{wave junction} is defined as an equation $$(r_o, l_o) = J (r_i, l_i),$$ relating four entities: incoming right and left traveling waves $r_i$, $l_i$, and outgoing right and left traveling waves $r_o$ and $l_o$. For a lossless junction, the linear 2--dimensional operator $J$ is required to be energy--preserving (orthogonal). $J$ can be represented by the matrix $$J_\theta = \matrix{\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta}.$$ We'll use the abbreviations $c_n=\cos\theta_n$ and $s_n=\sin\theta_n$. In a waveguide model, this 4--port relation is applied multiple times, connecting together different sections that are assumed to be spatially separated such that left and right wave travel time between sections corresponds to half the unit delay. We define junction $n$ to connect sections $n$ and $n+1$, where the rightmost section $0$ is a termination 2--port. The junction equation at $n$ is then $$\matrix{r_n \\ l_{n+1}} = z^{-\frac{1}{2}} J_{\theta_n} \matrix{r_{n+1} \\ l_n}.$$ A change of time reference for the right side of each junction, expressed by the substitution $r'_{n+1} = z^{-\frac{1}{2}} r_{n+1}$ and $l'_{n+1} = z^{-\frac{1}{2}} l_{n+1}$, allows the symmetric half--unit delay to be absorbed into a full delay, yielding a ladder filter topology.
This is equivalent to, and possibly easier to understand by, drawing the diagrams and pushing each rightward half--delay through the right side network until it shows up to combine with the leftward half--delay. The transformed equation is then $$\matrix{r_n \\ l'_{n+1}} = J_{\theta_n} \matrix{r'_{n+1} \\ z^{-1} l_n},$$ which can be performed at each junction, shifting the time base of each section by $1/2$. Below we work only in this time--shifted coordinate system, dropping the primes in the notation. To summarize, a symmetric waveguide junction is equivalent (isomorphic) to an asymmetric ladder filter junction. The difference is one of time coordinates of the nodes $r_n$ and $l_n$. This yields two advantages. The ladder filter nodes $r_n$ are not state variables, but instantaneous computation nodes, so they do not require memory when a filter is realized. These computation nodes can be reversed without introducing un--implementable inverse delays. This allows the lattice filter to be related to a pair of FIR filters. So a ladder filter topology is more useful in practice, but a waveguide filter might be simpler to relate to a more intuitive physical structure. From this point on our equations talk only about the ladder filter \emph{structure}. However, we will be able to use the waveguide \emph{analogy} due to the isomorphism. A ladder filter needs to be terminated at one end, effectively turning a collection of 4--port relations (the energy--preserving junctions + delays) into a single 2--port relation, i.e. a filter. We'll terminate it at junction $0$ with an abstract transfer function $h_0(z)$, or $l_0(z) = h_0(z) r_0(z)$, meaning the right--traveling wave $r_0$ gets shaped by $h_0$ to become the left--traveling wave $l_0$. This allows us to write a recursion relation for the transfer function of each section by successively applying $J_{\theta_n}$ to yield a nesting of transfer functions $$\begin{array}{ccc} h_{n+1}(z) = f_n(z^{-1} h_n(z)), & f_n(x) = \frac{s_n + x}{1 + s_n x}, & s_n = \sin \theta_n. \end{array}$$ Note that the transfer function does not depend on $\cos\theta_n$, which is important later when we relate it to lattice filters. Because $f_n$ maps the unit circle to itself, if $h_n$ is an allpass filter, $h_{n+1}$ will also be allpass. Note there is some freedom in where to place the $z^{-1}$ unit delay. The alternative is to perform $f_n$ first and multiply by $z^{-1}$ afterwards\footnote{If there is no delay, the result of the recursion is a composition of Moebius transforms, which form a group.}. To relate this recursion relation to FIR lattice filters, the equations need to be re--arranged to $$(r_{n+1}, l_{n+1}) = \frac{1}{c_n} J'_{\theta_n} (r_n, z^{-1} l_n),$$ with $$J'_{\theta_n} = \matrix{1 & s_n \\ s_n & 1}.$$ This is an algebraic transformation. Adding a \emph{directional} interpretation to this relates the ladder structure to the FIR lattice structure. We set the input of the FIR lattice filter to be $l_0 = r_0$. This gives two output nodes $l_N$ and $r_N$ for $N$ sections. Since the lattice structure is feedforward, both outputs $f_l = l_N / l_0$ and $f_r = r_N / r_0$ need to be FIR filters, i.e. polynomials. An interesting point here is that the allpass transfer characteristic $$\frac{r_N(z)}{l_N(z)}$$ of the ladder filter can be used to obtain the transfer function of both FIR outputs. It is just a relation between nodes, and we are still working with the same equations! The only thing that changed is the interpretation of the equations as a different realization.
What this means is that $f_r(z) / f_l(z)$ is allpass. From that it follows that $f_r(z) = z^{-N} f_l(z^{-1})$: one is the coefficient--reversed version of the other. These are sometimes called the forward and backward predictors in the context of \emph{linear prediction}. \section{Conclusions} The allpass--terminated digital waveguide with energy--preserving junctions and half--unit section delay implements the same allpass transfer function as a ladder filter. The forward lattice filter is a re--interpretation of the ladder filter equations, with one of the signal directions reversed. Entry: Inverse Lattice Date: Wed Apr 17 20:58:19 EDT 2013 So, what is the inverse lattice in [1]? It's related, but not the same as the normalized ladder. It appears it is not normalized, but cosines do show up in one direction due to the multiple paths. EDIT: Cosines show up because the junctions are still energy-preserving (unit norm eigenvalues) but no longer orthogonal. A non--normalized lattice allpass implements the same transfer function as a normalized ladder, but will have undergone a coordinate transformation for the state variables, i.e. the signals traveling through the sections and delays. It seems that the inverse lattice is really just the lattice filter with the delay--free chain reversed. There is no 1/c normalization, but this only has an effect on the allpole output, not the allpass, as in the forward lattice, both forward and backward predictors are scaled in the same way, and they serve as numerator and denominator of the allpass. The important thing to realize is that the equations harbor several filter realizations: forward/backward FIR, reconstructing allpole, allpass, both with and without the 1/c normalization. % [1] ftp://129.132.148.131/hg/EMIS/journals/BMMSS/Bulletin/bul941/BULTHEEL.PDF Entry: Lattice, Ladder and Waveguide Date: Thu Apr 18 13:36:56 EDT 2013 Since all these concepts are related, it is hard to pick out the most important one without being subjective. We might choose the central idea to be the construction of a family of allpass filters from a single recursion relation that corresponds directly to the structural operation of adding a left/right signal flow section to an existing allpole filter. The normalized ladder filter can be interpreted as a time--shifted version of a lossless digital waveguide. This analogy can be used to add a physical interpretation to the resulting structure. Once an allpass filter is obtained from the iterative construction process, the allpole filter corresponding to the denominator of the allpass is accessible directly in the filter structure, and the two FIR filters that make up the numerator and denominator of the allpass are accessible in the time--reversed ladder filter. From the time--reversed normalized ladder filter, the lattice filter can be constructed by dropping the normalizing multiplication between different sections. This structure is the same as the one that can be derived from the Levinson-Durbin algorithm for solving the Yule-Walker equations occurring in the problems of autoregressive modeling and linear prediction. To realize an arbitrary minimum phase FIR, allpole or stable allpass, any of the structures can be used. As a remaining question, why would one use a normalized ladder filter instead of an IIR lattice filter (non-normalized ladder) in the construction of an allpass filter?
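To make the construction concrete, here is a small sketch in coefficient form (my own illustration; the reflection coefficients are hypothetical and the termination is assumed to be h_0 = 1). Ignoring the 1/c_n normalization, which cancels in the allpass ratio, each section performs A_{n+1}(z) = A_n(z) + s_n z^-1 B_n(z), where B_n is the coefficient-reversed A_n (the backward predictor), and the resulting allpass is h_N(z) = B_N(z)/A_N(z).

#include <stdio.h>

#define N 3

int main(void) {
    double s[N]   = {0.5, -0.3, 0.2};  /* hypothetical reflection coefficients s_n = sin(theta_n) */
    double a[N+1] = {1.0};             /* A_0(z) = 1 */
    for (int n = 0; n < N; n++) {
        double next[N+1];
        for (int i = 0; i <= n+1; i++) {
            double ai = (i <= n) ? a[i]     : 0.0;  /* coefficient of A_n      */
            double bi = (i >= 1) ? a[n+1-i] : 0.0;  /* coefficient of z^-1 B_n */
            next[i] = ai + s[n] * bi;
        }
        for (int i = 0; i <= n+1; i++) a[i] = next[i];
    }
    /* a[] now holds A_N; its reverse is B_N.  The allpole and allpass
       realizations share this denominator. */
    for (int i = 0; i <= N; i++) printf("a[%d] = % f\n", i, a[i]);
    return 0;
}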
Entry: Levinson-Durbin algorithm Date: Fri Apr 19 13:14:35 EDT 2013 The important part is the definition of the partial correlation coefficients as the residu--normalized prediction error of the next autocorrelation coefficient. Entry: Transposition of realizations Date: Fri Apr 19 13:15:58 EDT 2013 To compute the transposition of a filter realization: - Reverse arrows - Exchange summation points "+" with contact points "o" - Swap input and output ( Note that a summation point with fanout is actually 2 nodes: a summuation node connected to a contact point node. ) What is the algebraic counterpart to this? (see below) A way to prove this is probably to first devise a way to construct arbitrary networks from primitives, prove it for the primitives and prove that the construction rules leave the property intact. The property is satisfied by these primitives: - unit delay - multiplicative scaling The property is preserved by these compositions: - serial composition (A B) - parallel composition (o A/B +) - feedback composition (+ A/B' o) ( There is probably a link to the Haskell ArrowLoop rules! ) Creating loops is like parallel composition, only one of the two networks is reversed, the input is a summation node and the output is a contact point. From trying some examples, it seems that this is a complete set of operations to construct an arbitrary network from primitives. Given an arbitrary connected graph, the following algorithm can be used to construct it in terms of the primitive operations. 1. Pick a primitive edge in the graph to start the construction 2. For all primitive edges between the 2 nodes, apply parallel or feedback composition to construct a subgraph. 3. If all nodes are covered, we're done. If not, continue. 4. Pick a primitive edge that connects the current subgraph to a node outside of the subgraph, using serial composition. 5. Apply parallel or feedback composition for all nodes in the subgraph that connect to the newly added node. Goto 5. While it follows from construction, still, this transposition business is rather remarkable. It's not visible in "flat" algebraic form of a transfer function, since there is no sharing. I wonder if it is visible in state-space form. Yes. It follows directly from it! The transfer function of a standard form state space model is C (Iz - A)^{-1} B + D. For a SISO system this is scalar value, so taking the transpose of it doesn't change its value. This yields B^T (Iz - A^T)^{-1} C^T + D^T, which corresponds to the transpose of the system matrix. Transposing the system matrix corresponds to performing the network transposition operation. Entry: All-pole filter designer GUI Date: Fri Apr 19 14:17:26 EDT 2013 Given: some graphical represenation of a frequency response, e.g. displayed as a log-log plot in a GUI. In order for the user to be able to incrementally edit the graph, two parameters should be settable: a gain increment at a specific frequency (e.g. mouse drag point) and the bandwidth of the gain increment. Suppose that the "smoothing" operation can somehow be defined. Given a power spectrum, it can be mapped to an all-pole model using the Schur algorithm after converting the power spectrum to an autocorrelation sequence. When the resulting AR model is plotted again as a power spectrum, this by itself is a smoothing operation. What about the following: allow the user to pick the filter order. This will set the "smoothing amount". Then tugging at different places is just changing 1 frequency bin. 
The effect of this on the autocorrelation can be computed directly by adding a sinusoidal component, so an IFFT step is actually not necessary. Instead of adding a pure sine, which might be too "sharp", it might be possible to update the autocorrelation directly with the autocorrelation of a damped exponential. Given an N (even) order model, the first N/2 damped exponentials added will be matched exactly. Impulse at DC is overall gain, not a parametric EQ. However, a notch filter can be designed as the inverse of a peak filter. Additionally, it might be interesting to look at interpolation between different transfer functions by setting linear interpolation of reflection coefficients, or different kinds of non-linear interpolation where some reflection coefficients are interpolated at a different rate. Entry: IRT - impulse response truncation & Gibbs Date: Fri Apr 19 20:16:15 EDT 2013 There the Dirichlet function shows up (not the sinc function!), which is the in-phase sum of N complex exponentials. A.k.a. discrete summation formula. The integral of the Dirichlet function comes back in the explanation of the Gibbs phenomenon for IRT filters. Due to this, IRT filters can never exceed a certain pass/stop-band spec. Entry: White minimal amplitude, periodic Date: Fri Apr 19 20:24:12 EDT 2013 What does a periodic signal with unit magnitude spectrum but random phase look like? How to make this signal minimal amplitude, i.e. how to pump energy into a system while keeping the amplitude minimal and the spectrum white. ( I found some links w.r.t radar pulses. These look like chirps. ) Entry: Efficiency tradeoff for block-based FFT FIR implementation Date: Fri Apr 19 21:03:49 EDT 2013 What is the optimal overlap / size for FIR implementation? Entry: Stilson's Root Locus Adventure Date: Sat Apr 20 12:48:59 EDT 2013 I can see how it gives some insight. Building linear and circular by doctoring root locus equations does look like an interesting approach. However, the subject itself seems rather unstructured, i.e. a lot of ad-hoc stuff, and additionally, the idea of "constant-Q" controls seem quite unnatural for discrete time filters. Entry: Logarithmic oversampling Date: Sat Apr 20 13:04:54 EDT 2013 Type: tex One way of approximating analog filters is to use \emph{structural} transformations that keep the analog domain's \emph{separation of controls} property intact around $z=1$, but will degrade quickly when the poles move to higher frequencies. One can work against this by simply using oversampling. However, oversampling is linear in complexity, meaning that for oversampling rate $N$, a feedback matrix $A$ would be multiplied $N$ times. It might be possible to compute $A^N$ more efficiently using successive squaring when $N=2^n$. However, this works only for low system orders or for structured matrices, since dense matrix multiplication complexity is $O(n^3)$. E.g. it works beatifully for orthogonal $2$D matrices since there are only two parameters to keep track of, but that might be the limit. Alternatively, we can compute $r(c+is)$ for each pole pare approximately from the low frequency parameter equation. Entry: SVF and Symplectic Update Date: Sun Apr 21 10:17:44 EDT 2013 Type: tex Something I read a while ago, about constructing an approximation to a conservative system by performing the difference equation update in a \emph{ping--pong} fashion, i.e. using speed to update position and then the new position to update speed, instead of performing the updates in parallel. 
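A minimal sketch of such a ping--pong update for the lossless case (my own illustration; the tuning relation $a = 2\sin(\pi f/f_s)$ is an assumption, not from the lost reference):

#include <stdio.h>
#include <math.h>

int main(void) {
    double fs = 48000.0, f = 1000.0;
    double a  = 2.0 * sin(M_PI * f / fs);
    double x1 = 1.0, x2 = 0.0;          /* initial state */
    for (int n = 0; n < 48000; n++) {
        x1 -= a * x2;                   /* first triangular update         */
        x2 += a * x1;                   /* second update uses the *new* x1 */
    }
    /* x1, x2 oscillate at frequency f; the conserved quantity is a slightly
       skewed quadratic form, so x1^2 + x2^2 stays bounded near 1 but is not
       exactly constant. */
    printf("x1=%f x2=%f  x1^2+x2^2=%f\n", x1, x2, x1*x1 + x2*x2);
    return 0;
}

Updating $x_1$ and $x_2$ in parallel instead (using the old $x_1$ in the second line) corresponds to forward Euler, which slowly spirals outward.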
Supposedly, this preserves the \emph{symplectic structure}. I lost the reference, so this is an attempt to reconstruct it. This might be related to the approximations to 2nd order CT filters mentioned in Stilson's PhD thesis[1], where the two integrators are replaced with a backward and a forward difference, respectively. A similar thing happens here: the update equations are no longer parallel, since $x[k+1]$ (first integrator, BP output) is used in the computation of $y[k+1]$ (second integrator, LP output). It's probably best to first define what \emph{symplectic structure} means, and then to see how it can be preserved in the analog to digital conversion. What I remember is that it is important to be able to factor the update into two successive triangular updates. This is exactly what happens in the mixed FW/BW SVF discretization. At full resonance $k=0$ (we're talking about lossless systems only), in the update $$ \begin{array}{ccl} z x_1 &=& x_1 - a (k x_1 + x_2 - i) \\ z x_2 &=& x_2 + a z x_1 \\ \end{array},$$ we have two consecutive operations, which hide behind the presence of $z x_1$ in the second equation (as opposed to $x_1$). This succession of updates can be represented by two triangular matrices: $$ \begin{array}{ccc} A = L U, & L = \matrix{1 & 0 \\ a & 1}, & U = \matrix{1 &-a \\ 0 & 1} \end{array}.$$ Note that this matrix is not orthogonal, i.e. $A^TA \neq I$. However, it is symplectic. A 2D discrete system with feedback matrix $A$ is symplectic if $A^T J A = J$ with $$J=\matrix{0 & 1 \\ -1 & 0}.$$ This property is satisfied by the conservative forward/backward SVF. Note that $J^2 = -I$. Because $\det A = 1$ and $A \in R^{2 \times 2}$, if the poles are complex conjugate they have to be of magnitude $1$, so the system is still stable. So, is a symplectic system always stable? In general, a matrix $A \in R^{2N \times 2N}$ is symplectic if $A^T J_N A = J_N$ where $$J_N = \matrix{0 & I_N \\ -I_N & 0}$$ and $I_N$ is the $N \times N$ identity matrix. From this we have $\det A = 1$. So it is conservative in the sense that phase--space volume is preserved, but that doesn't mean the individual 2D subsystems should be energy--preserving. In [2], p282 it is explained how for each eigenvalue $\lambda$, the inverse $\lambda^{-1}$ is also an eigenvalue. An example of a 4D symplectic matrix can be constructed from a 2D symplectic matrix $A_1$ (i.e. $A_1^T J_1 A_1 = J_1$) by constructing the matrix $$A'_2 = \matrix{ r A_1 & 0 \\ 0 & r^{-1} A_1 }.$$ Straightforward computation shows that $A'_2^T J'_2 A'_2 = J'_2$ where $$J'_2 = \matrix{0 & J_1 \\ -J_1 & 0},$$ which is a permutation of $J_2$, meaning $J'_2 = P^T J_2 P$. From this it follows that $A_2 = P^T A'_2 P$ is symplectic. Since $P$ is orthogonal, $A_2$ and $A'_2$ have the same eigenvalues. The eigenvalues of $A_1$ are unit norm complex conjugates, which makes the eigenvalues of $A_2$ complex conjugate pairs of magnitude $r$ and $r^{-1}$ respectively.
Feeding an arbitrary value to $A^n$ will produce some norm distortion, but it will not "blow up" if $n$ becomes large.}. For a real matrix $A$, $\det A = 1$ doesn't necessarily mean that $A$ is orthogonal, i.e. $A^T A = I$. However, if $A \in R^{2 \times 2}$ with complex eigenvalues, there aren't enough degrees of freedom for the eigenvalues to be anything other than complex conjugate with magnitude 1. This means the system with feedback matrix $A$ is stable. The poles are $$c' \pm i\sqrt{1-c'^2}, \text{ with } c' = \frac{1 + c^2}{2}.$$ It might be interesting to look at the eigenvectors too. Entry: SVF decay Date: Sun Apr 21 14:01:42 EDT 2013 How to separate out the SVF decay parameter? The symplectic form of the SVF discretization has the nice property that stability is preserved, however this only says something about conservative systems. How to separate out the non-conservative part? ( Does this actually make sense? ) Entry: Symplectic update. What to learn? Date: Sun Apr 21 14:09:14 EDT 2013 Type: tex The important idea is that the ping-pong update can transform a CT conservative system into a DT conservative system. For the 2D system, this gives us a simple way to synthesize a stable oscillation using only two multiplications. For higher-order systems it doesn't seem all that useful, since such a system is just a product of independent 2D systems. It seems that the usefulness of this approach would shine more for \emph{nonlinear} conservative systems, which can be unstable in a local linear approximation, but are always conservative, such that an unstable pole with magnitude $r$ is always compensated by a stable pole with magnitude $r^{-1}$. Also, for systems that do not have a direct correspondence to physical position/velocity models, such as the Moog 4--pole, the connection probably isn't there. Entry: Symplectic integrator Date: Wed Apr 24 16:33:00 EDT 2013 Once you find the right Google term... [1] http://www.av8n.com/physics/symplectic-integrator.htm [2] http://en.wikipedia.org/wiki/Symplectic_integrator [3] http://en.wikipedia.org/wiki/Symplectic_Euler_method Entry: Equalizers Date: Wed Apr 24 17:07:54 EDT 2013 Peak/notch: 2nd order allpass, 0 -> 180 -> 360 Shelving: 1st order allpass? 0 -> 180 Entry: All-pole Date: Mon Apr 29 23:10:22 EDT 2013 There is something I do not understand about allpole filters. Somehow it seems that the most important property of an all-pole is that it is the inverse of an all-zero. An all-pole response is still a sum of complex exponentials, but somehow they do not cancel out. How exactly is this manifested? Entry: Stretched exponentials: scaling attack & decay times Date: Fri May 3 14:44:43 EDT 2013 Type: tex The basic idea is to provide a scale that is mostly log--like, but can express the extremities $0$ and $\infty$. How to map this time scale $T = [0,\infty]$ to a control parameter range $C=[0,1]$ in a meaningful way? A mapping that satisfies the interval boundaries is $$c(t) = \frac{t}{a + t}.$$ This maps $c(0)=0$ and $c(\infty)=1$. The inverse is $$t(c) = \frac{a c}{1-c}.$$ The parameter $a$ can be determined by constraining the middle of the scale: $t(1/2) = a$. Another important parameter is how fast the dial will move to $0$ and $\infty$. Both are extremes that are part of $T$, but for say $50\%$ of the $T$ range we'd like to have decay rates that change mostly exponentially in $c$.
To expose the symmetry in $t(c)$, let's introduce a change of variable $d = 2c -1.$ This gives $$t(d) = a\frac{1+d}{1-d}.$$ This scale is logarithmically symmetric around $a$, or $t(d)/a = (t(-d)/a)^{-1}$, meaning that in the middle range it behaves mostly exponentially, while tending to $0$ and $\infty$ in the two extremes\footnote{ This hints at the input--scaled variant $\frac{1+ax}{1-ax}$ being a good approximation for $e^x$, which is the case when $a=\frac{1}{2}$.}. The slope of $t(d)$ relative to $a$ is fixed. At $d=0$, the function approximates $a\exp(2d)$. For mapping meaningful parameters, this might be a bit too flat. Successive squaring of $t_0(d) = (1+d)/(1-d)$ can solve this: raising $t_0$ to the $n$--th power gives a slope of $\exp(2nd)$ at $d=0$, and the curve becomes $$t_n(d) = a\, t_0(d)^{n}.$$ In practice it seems that a single squaring ($n=2$) works well. This gives a reasonably flat log response for $3$ decades, about $70\%$ of the scale, leaving the rest for the extreme range. If necessary, the extremities can be avoided by prescaling $d$. With one squaring, prescaling by $0.99$ limits the output range to about $9$ decades. Mapping this to pole radius requires an extra step. Let's use the natural $1/e$ decay to relate decay time $t$ (measured in samples) to pole $p$ as $p^t = 1/e$ or $$p = \exp(- \frac{1}{t}).$$ This approximation needs to be accurate for $t \gg 1$, and extend correctly to $p=0$ at $t=0$. The first degree Taylor expansion is $1 - 1/t$. Modifying this slightly to give $$p' = 1 - \frac{1}{t+1}$$ yields the wanted behavior at $t=0$ without changing the large $t$ behavior too much. For numerical reasons the update equations will use the positive quantity $$q' = \frac{1}{t+1},$$ where $p' = 1 - q'$. Composing the two mappings gives $$q_n(d) = \frac{1}{1+a(\frac{1+d}{1-d})^n},$$ where $a$ gives the mid--scale time constant in samples. I'm using $n=2$, but some knob twiddling makes me think that maybe $n=1$ is better. What is important is to get the mid--scale value correct. I.e. what is a prototypical note's attack and decay rate? To find the warping % [1] entry://20130321-160005 Entry: Derivative of log Date: Mon May 6 00:55:38 EDT 2013 Type: tex One of the things I never really understood properly is $$D \log(x) = \frac{1}{x}.$$ This essentially comes from two properties: derivatives of inverses are related through multiplicative inverses, and the exponential function is an eigenfunction of the derivative operator. The expression for the derivative of the inverse, $Df^{-1}$, can be obtained from the chain rule as $$D (f \circ f^{-1}) = [(D f) \circ f^{-1}] D f^{-1}.$$ Since this also expresses the derivative of the identity function, we have $$D f^{-1} = \frac{1}{(D f)\circ f^{-1}}.$$ This relation is obvious for a linear function, e.g. $y=ax \Rightarrow x=a^{-1}y.$ Also, following the composition of the 3 operations $f^{-1}$, $Df$ and $x \to 1/x$ on a plot makes it fairly obvious that what is going on is just obtaining the tangent line and expressing it in the proper coordinate system. The expression for $D\log$ follows straight from the rule for the derivative of the inverse, applied to $\exp$: $$D \log(x) = \frac{1}{\exp [ \log (x)] } = \frac{1}{x}.$$ Entry: Gain knob Date: Tue May 7 13:40:14 EDT 2013 Type: tex The stretched exponential can also be used for gain knobs, where you would want near-exponential behavior for the high gains, but linear behavior for the low gains, to be able to reach $-\infty$dB at the left.
The prototype can be derived from $[(1+x)/(1-x)]^n$, taking only $x \in [-1, a]$, where $a \in [0,1].$ A typical dB control knob has $(-\infty, 0, 15)$. With $g$ the right--half gain ($15\text{dB}\approx 5.6$) the relation between $g$ and $a$ is $$g \frac{1+\frac{a-1}{2}}{1-\frac{a-1}{2}} = \frac{1+a}{1-a}$$ which can be simplified to $$g(a) = \frac{3-a}{1-a}$$ and $$a(g) = \frac{3-g}{1-g}.$$ What's left is normalization, which is straightforward to do. Map desired input range to $[0,a]$ and use $(1+a)/(1-a)$ to normalize the output. It is remarkable that $g(a)$ is a self--inverse or \emph{involution}. Entry: Curious involution Date: Tue May 7 16:18:04 EDT 2013 Type: tex The function $$f(x) = \frac{x - a}{x - 1}$$ is an involution for $a \neq 1$, meaning $f(f(x)) = x$. Can this be put to practical use? Where does it come from? In general, functions $$\frac{ax - b}{cx - d}$$ with $ad - bc \neq 0$ are a closed set of bijections of the extended complex plane called \emph{fractional linear transformations} or \emph{Moebius transforms}[1]. The function $f$ maps $1\to \infty$, $\infty \to 1$,$0 \to a$ and $a \to 0$. This gives $4$ points relations for $f^2$: $1 \to 1$, $\infty\to\infty$, $0\to0$ and $a\to a$. Knowing that $f^2$ is also an FLT leaves $f^2(x) = x$ as the only possible solution. % [1] http://en.wikipedia.org/wiki/M%C3%B6bius_transformation Entry: Spinors Date: Tue May 7 16:57:06 EDT 2013 Maybe it's time to tackle the question in [2]: Why aren't Spinors used in signal processing? [1] http://en.wikipedia.org/wiki/Spinor [2] entry://20100428-224141 Entry: SVF revistited Date: Tue May 7 19:11:15 EDT 2013 Some interesting notes[1]. One is the introduction of a band-limited nonlinearity by a power detector. [1] http://www.cs.washington.edu/education/courses/cse490s/11au/Readings/Digital_Sound_Generation_2.pdf Entry: FDN Reverberation Date: Sun May 12 18:41:58 EDT 2013 [1] http://www.music.miami.edu/programs/mue/mue2003/research/jfrenette/toc.html Entry: Environmental synth Date: Fri May 17 01:26:27 EDT 2013 [1] http://www.charlesverron.com/content/papers/2010_ieee_verron_TASLP_draft.pdf Entry: Simulating reverb tail Date: Wed May 22 10:41:16 EDT 2013 How to simulate the effect of a reverb tail in an oscillator? Entry: Distance in high-dimensional vector spaces Date: Thu Jun 6 14:46:17 EDT 2013 From [2]: ... under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. Is this related to the idea that the volume of a high dimensional sphere[3] is small compared to its bounding cube? I don't see the connection.. The basic problem seems to be that high-dimensional discrete-dimensional spaces are "very connected" in a way that is not immediately clear on an intuitive sense. Our usual 1,2,3-dimensional view is very particular, and doesn't generalize to larger numbers. [1] http://spin.atomicobject.com/2013/05/06/k-nearest-neighbor-racket/ [2] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1422 [3] http://en.wikipedia.org/wiki/N-sphere#n-ball Entry: Potential wells / nondeterministic chaos Date: Fri Sep 27 18:57:24 EDT 2013 Type: tex In a 2D potential well (e.g. a ``real'' well), does a system with irregular bumps cause chaotic behavior? I would think so, because there can be directional divergence, which leads to amplification of difference in initial conditions. How to make this explicit? 
Though I would expect there to be symmetry in the solutions if there is symmetry in the potential well. Looks like this would be about breaking of such symmetry. Entry: Making things with maths - Steven Wittens Date: Wed Nov 6 14:08:11 EST 2013 - CSS 3D - js1k demo[2] - Navier-Stokes parallelized on GPU - how does he do the animated presentations? (Bret Victor's + mathbox.js[5]) - glsl shader curves from projective numbers & rational functions - nice visual demo of quadratic/cubic bezier curve - folding space / hybrid fractals - zoidberg curve[3] - stacked value noise (spatial 1/f^n noise) - verlet integration (preservation of the symplectic form on phase space) - mandelbulber[4] - GLSL sandbox [1] http://www.youtube.com/watch?v=Zkx1aKv2z8o [2] http://js1k.com/ [3] http://www.wolframalpha.com/input/?i=zoidberg+curve [4] http://mandelbulber.com [5] http://acko.net/blog/making-mathbox/ Entry: Power-efficient variable-rate signal processing Date: Fri Dec 13 14:46:42 EST 2013 Say with a chip like the GA144. Does it make sense to write DSP algorithms that gracefully degrade with "silence"? Entry: Image Processing Primer Date: Mon Dec 16 18:21:43 EST 2013 [1] http://engineeringnotes.net/Notes/EE/ImageProcessing.pdf Entry: The absurdity of categories (histograms) Date: Mon Dec 16 18:49:15 EST 2013 Example: household expense categories. Too coarse: no information. Too fine: no information. There seems to be no intrinsic way to decide what level of detail is good enough. However, there is definitely some level of detail that makes "most sense" in some handwavy way. Where exactly does the sweet spot come from? Is understanding nothing more than applying appropriate coarseness to minimize some kind of connective / organizational property? Entry: Transistor model Date: Fri Dec 20 18:44:41 EST 2013 I want a good enough transistor model to run a circuit simulation in real time on a recent PC. What would be a good way to do this? Just use fixed step, high oversampling? Entry: Impedance Date: Fri Dec 20 22:12:03 EST 2013 Talked about this before: there's an interesting difference between "unitless" difference equations and those that operate on two variables like voltage and current. There is a name for this. I keep looking for it. It was mentioned in my 2nd year course on systems theory. Entry: I am a Strange Loop Date: Thu Jan 9 22:32:48 EST 2014 Reading Douglas Hofstadter's sequel to Goedel, Escher, Bach. Notes: - Video feedback: where do the large-scale (1 second) stable patterns come from? Similar to audio mixer feedback: 1/2 second patterns. Entry: Where > and >= are the same. Date: Tue Jan 28 10:17:19 CET 2014 Everyone tells you that you can't compare floats for equality. This is an interesting problem in fact. Equality doesn't really make sense when your only functions are smooth functions, which is often the case. Non-smooth functions are often implementation artifacts, such as mapping (wrapping) the real line onto a circle. True "thresholding" often needs smooth squashing functions without discontinuities. ( TODO: Find out how this relates to smooth infinitesimal analysis[1]. ) [1] http://en.wikipedia.org/wiki/Smooth_infinitesimal_analysis From rai/prim.h:

/* Note that this is based on float to int truncation, so is not the
   same as the floor() function from math.h. For negative integers,
   the map is n -> n - 1. In practical use in numerical algorithms
   this difference is OK. Think of it this way: in the presence of
   noise, there is no real difference between > and >=.
*/
#define _ float
#define i_ int

INLINE i_ p_ifloor(_ a) {
    i_ i_truncate = (i_)a;
    i_ i_negative = a < 0;
    i_ i_floor    = i_truncate - i_negative;
    return i_floor;
}
INLINE _ p_floor(_ a) {
    _ f_floor = (_)p_ifloor(a);
    return f_floor;
}

From rai/stream-lib.rkt:

;; Map phase value to the [0,1] representation interval.
;;
;; Note that floor(n) for n integer and n < 0 maps n -> n - 1.
;; This means that wrap01 for n integer and n < 0 maps n -> 1.
;;
;; This is OK for "phase" values, i.e. those that are at some point
;; inserted into cos(2*pi*phase) or sin(2*pi*phase), since all
;; integers map to a single point.
(define (wrap01 x) (- x (floor x)))

Entry: Integers don't exist Date: Tue Jan 28 10:38:19 CET 2014 Time for some wacko intuitive math speculation.. I've always found the tension between integers and smooth functions quite interesting. Integers are "meta", in that they appear in the theory _about_ smooth functions, but not in their domain. The real world is a smooth domain, while the _study_ of the real world is a discrete domain. Integers have no reality! An interesting rule of thumb: whenever you run into integers while doing signal processing, you're making a jump to "interpretation", i.e. talking about discrete patterns in the behaviour of smooth maps. In DSP or AI (domains that abstract measurements into interpretations), integers and discrete structures are sometimes called "features". Floating point math is best interpreted as an (approximate) representation of smooth functions over a smooth domain. The smoothness maps well to the idea of rounding errors in floating point: errors will not amplify as much, and equality of numbers has no precise meaning, while equality or equivalence of smooth maps is well-defined. When thinking about signal processing, integers should always be understood in relation to topology. I wonder if this idea of a link between floats and smooth domains can be made precise. Entry: Exponential IDAC (EIDAC) Date: Sat May 10 19:07:20 EDT 2014 Type: tex A sinking current can be constructed using an NPN BJT, which can pull the voltage all the way down to 0.1V above ground. The collector current is $I_c = I_s \exp(V_{BE} / V_T)$ with $V_T = kT/q$ the thermal voltage, approximately 26mV at room temperature. This relation can be used to make an \emph{exponential} converter, which is useful for controlling analog audio synthesis and processing circuits. A typical range for audio frequencies is 4 decades, which corresponds to a $V_{BE}$ difference of $V_T \log(10000)$, about 250mV. For a 2N3904 NPN at 1mA, I measure $V_{BE}=630mV$. It is possible to remove $T$ from the equation by using two thermally coupled transistors to generate the high and low reference voltages for a DAC. These transistors are biased at respectively the maximum current $I_+$ and the minimum current $I_-$. The linear DAC will then result in a geometrically interpolated current $$I / I_- = (I_+ / I_-)^x \text{ with } x \in [0,1].$$ Temperature compensation is probably a good idea. The sensitivity is $$dI = -I(V_{BE} / V_T^2) dV_T = -I \frac{V_{BE}}{V_T} \frac{dT}{T},$$ which is better written in relative form $$dI/I = -\frac{V_{BE}}{V_T}dT/T.$$ Looking at ballpark figures $V_T=26mV$ at $T=300K$, $V_{BE}=630mV$ at $I=1mA$, we have $V_{BE}/V_T \approx 24$, or a relative current error temperature sensitivity of about $0.08/K$. This is a thermometer! If $I_- = I_s \exp(V_-/V_T)$ is derived from $I_+ = I_s \exp(V_+/V_T)$ through setting $V_- = r V_+$, then is $I_-$ still temperature-dependent?
We have $$I_+ / I_- = \exp((1-r)V_+/V_T) = \exp(V_+/V_T)^{1-r},$$ so if $I_+$ is regulated constant, the ratio $V_+/V_T$ is regulated constant, so $I_+ / I_-$ is constant as well. This allows sending $V_+$ as a reference voltage to a DAC referenced to ground. Then using a non-inverting summing amp to average the DAC output weighted $1-r$ with $V_+$ weighted $r$ gives $rV_+$. For a 2N3904 at $1mA$, $r=2/3$ gives an $I_+/I_-$ scaling of $\exp(630/(3\times26)) \approx 3000$. Entry: Add a little frequency or a little period? Date: Sun Jul 6 18:50:31 EDT 2014 Type: tex Detuning by adding a little frequency is the same as subtracting a little period. With $p = \frac{1}{f}$ we have $$\frac{df}{f} = \frac{d\frac{1}{p}}{\frac{1}{p}} = -\frac{dp}{p}.$$ Entry: Composable or Separable? Date: Sat Sep 6 20:10:31 CEST 2014 When you say Composable, you really mean Separable. I.e. we think of whole solutions first (top down), then wish for a way to chop the problem into non- or little-interacting parts. A composable construct is then the building material. Entry: Settable and Non-Interfering Signal Functions for FRP - Daniel Winograd-Cort Date: Sat Sep 6 17:13:09 CEST 2014 Take an arbitrary signal function and turn it into a resettable signal function by replacing all delays with resettable delays. [1] https://www.youtube.com/watch?v=zgNRM8tZguY Entry: Manifold music Date: Sun Oct 19 19:14:45 EDT 2014 So: 1. one box with rotary encoders. 2. one non-commutative manifold. Use the knobs to navigate the manifold using local rotations; map the manifold coordinates to a musical generator. Entry: Circuit simulation Date: Mon Oct 20 07:03:27 EDT 2014 Let's take a closer look at what is involved. Some initial concerns: - How to turn a network into procedural code? I.e. not all components are "directional". Most are relational. - Opamp: probably won't work for set-point analysis (relaxation) if not reactive. Start with some simple components: - Ideal opamp: 4 terminals -> 8 degrees of freedom, 3 equations: I- = 0, I+ = 0, Vo - Vg = A (V+ - V-). This is a scalar relation. Lift it to an operator equation to also support reactive components. - Ideal capacitor: 2 terminals -> 4 degrees of freedom, 2 equations: I+ = - I-, I+ = C d(V+ - V-)/dt - Ideal resistor: 2 terminals -> 4 d.o.f., 2 equations: I+ = - I-, R I+ = V+ - V- To solve: - Collect nodes and equations. - If #eq = #unk, build diff matrix - eliminate algebraic parts? Question: how to handle algebraic parts, i.e. eliminate variables to prepare for solving the diff equation. Is this necessary? Can it all be done in one (stupid) move? They can be done in ping-pong: 1. Pick initial conditions for all diff state variables. 2. Solve the algebraic equation for the current setpoint. 3. Update diff state variables, goto 2. More questions: - What to do for non-linear relations? How to solve a set of nonlinear algebraic equations? - What should the front end for this look like? The non-linear equations probably need to be solved iteratively. I'm guessing that the iteration step for the time direction and that for the "space" direction can somehow be combined. Is it possible to transform the nonlinearities themselves into just time nonlinearities? Just to make the whole equation more uniform. Let's try for a diode. - Ideal diode: I+ = - I-, I+ = Io e^[(V+ - V-)/VT] Maybe introducing an extra variable here would work? Another observation: currently, the only nonlinearity I'm interested in is the exponential. This has a special update property. But that's maybe not relevant.
Explore this first: transform an algebraic nonlinearity into an approximate temporal one. Essentially: an exponential can be computed as the static state of a system. Hmm... I'm thinking this might not work due to symmetry issues. The idea is to "open up" a nonlinear relation by exposing the non-linearity. Let's try again. y = f(x0) + f'(x0) (x - x0) + O[(x-x0)^2] The idea is then to drop the O(.^2) term to compute an update for x and y using the old values of f(x0) and f'(x0), and then recompute these coefficients. Though I'm still thinking about modeling a diode as a diode + series coil and parallel capacitor. If the values are in the correct range, this would give the same low frequency behavior as an ideal diode, but allow the nonlinearity to be isolated from the terminals.

          /---||---\
          |        |
---OOOO---o--|>|---o---

Starting this as an open circuit: L current = 0, C voltage = 0. ( Maybe a cause of problems when approximating circuits with finite things is that when not band-limited, circuits can contain an infinite amount of information? ) Entry: Circuit sim, a-symmetry of static NL transformation Date: Tue Oct 21 09:35:13 EDT 2014 Basic idea: add inertia to a static nonlinearity to make simulation simpler.

            Ic>
        /---||---\
        |   C    |
  L     |        |
V1 ---OOOO---o--|>|---o--- V0
    I>       Vx   Id>

I = Id + Ic
L DI = V1 - V0
Ic = C DVx
Id = f(Vx - V0)

6 variables, 4 equations -> 2 deg freedom. Compare to 3, 1 -> 2 (V1,V0,I) when L,C absent. So this transformation shields the diode from voltage transients on V1,V0. BUT This introduces a-symmetry: the resulting 2-terminal can no longer be driven by a current source. Is that a problem? And why is this a-symmetry choice so natural? ( There's a deeper philosophical question: why this a-symmetry in thinking about impedance? V and I are dual. Why do we prefer voltage sources? ) When making one side of the V/I dual, one needs to ask which is the other. What is the V/I dual circuit of this? Is it relevant? How does this relate to the I=f(V) vs. V=f^-1(I) duality for the nonlinear element? Start with the linear approximation of Id = Is e^(Vx/VT): Id/Is = e^(Vx0/VT) (1 + [Vx-Vx0]/VT) Here Vx0 is the value from the previous time step. This approximation is based on the assumption that Vx0 ~ Vx over the range of the update, which for a diode is the right thing. This also gives a retrofitted reason for a-symmetry: at small time scales, a diode behaves as a voltage source. PROBLEM SETTING: Solving a circuit differential equation can be separated into "space" and "time" steps. - The space step satisfies static relations at a specific time, based on the current state of energy storage devices (modeled as current or voltage sources). - The time step approximately computes the time evolution of the energy storage state based on the static update calculated in the previous step. In this approach, the time update step is just function evaluation, i.e. it is "local", while the spatial step involves the solution of a system of equations, i.e. it is "global". Nonlinearities that occur in the time (local) step are easy to handle, but those that occur in the spatial (global) step require a separate solution strategy: from linear to non-linear sets of equations. The idea then is to transform non-linearities from the spatial to the temporal domain by surrounding them with energy storage devices, essentially "low-pass filtering" the approximation errors. This is done in such a way that the functioning of the system is not compromised, i.e.
the time constants added would be present in the circuit anyway in parasitic amounts. CONCLUSION: ( intuitive.. quantify this ) - a nonlinearity can be "low-pass shielded", but this introduces a-symmetry: is inertia added to voltage or current? - express the static non-linearity in terms of the "slowed down" variable. ( in the case above it would be Vx ) Entry: V/I Duality Date: Tue Oct 21 09:52:18 EDT 2014 Is there a way to represent voltage and current in a single number? The problem then is the idea of a node. A node can have a voltage, but not a current. Is there a way to capture the serial/parallel duality as well? I.e. what is a circuit modulo the V/I duality?[1] [1] http://en.wikipedia.org/wiki/Quotient_space Entry: A slowed-down diode Date: Tue Oct 21 10:29:33 EDT 2014 Starting from the analog circuit:

            Ic>
        /---||---\
        |   C    |
  L     |        |
V1 ---OOOO---o--|>|---o--- V0
    I>       Vx   Id>

I = Id + Ic
L DI = V1 - V0
Ic = C DVx
Id = f(Vx - V0)

The transformation from analog to digital is completely captured in the equation: Id/Is = e^(Vx0/VT) (1 + [Vx-Vx0]/VT) Now, what about eliminating L and C, and trusting that the global circuit has enough inertia to make this approximation work when applied naively? Is this just a very roundabout way of approximating every static nonlinearity with a linearized version around the previous set point? YES. The key element is to equate the iteration process necessary for solving a non-linear equation with the temporal one necessary to solve a differential equation. So the point is: the previous derivation with the L/C filter localizes the effect. In practice, this is probably not necessary if the circuit is slow enough. Entry: Circuits: is connection essential? Date: Tue Oct 21 10:46:34 EDT 2014 Trying to get a good intuition about the bridge between circuits (impedance) and feed-forward difference equations. Supposing the space->time approximation to linearize the connection matrix is appropriate, is the "network quality" essential? Basically, at each time step there is the solution of a system of linear equations. Is the "global nature" of this equation essential to the behavior? In other words: can such a system be factored to localize (pipeline!) dependencies? The idea of "pushing things into the past" is quite central when thinking about distributed computation. In practice: the connection is a matrix whose coefficients might depend on past behaviour (making them essentially constants). What about factoring this matrix? I.e. for linear systems it's possible to decouple. For nonlinear ones there might be an essential "global effect". Nonlinearity globalizes. Entry: Spinor Fourier Transform Date: Tue Oct 21 14:40:06 EDT 2014 Our approach relies on the three following considerations: - mathematically speaking, defining a Fourier transform requires to deal with group actions; - vectors of the acquisition space can be considered as generalized numbers when embedded in a Clifford algebra; - the tangent space of the image surface appears to be a natural parameter of the transform we define by means of so-called spin characters. [1] http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6507537&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel7%2F4200690%2F6558795%2F06507537.pdf%3Farnumber%3D6507537 Entry: About static nonlinearities and aliasing Date: Sun Oct 26 23:12:28 EDT 2014 Johannes' blep clipper can possibly be generalized by polynomial modeling of signals combined with polynomial distortion. Evaluating those polynomials at say 8 points and then combining them using a low-pass filter.
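A minimal sketch of the naive 2x oversampled version from the "Static nonlinearities for discrete systems" note above, i.e. the 1/4 [1,2,1] example (the choice of tanh as the saturator is mine, purely for illustration):

#include <math.h>

static double sigma(double x) { return tanh(x); }  /* any static saturator */

/* One output sample of the anti-aliased nonlinearity.
   x1 = x[n-1], x0 = x[n]: upsample 2x by linear interpolation,
   apply sigma, subsample with the 1/4 [1,2,1] mask. */
double nl_os2(double x1, double x0)
{
    double mid = 0.5 * (x1 + x0);
    return 0.25 * (sigma(x1) + 2.0 * sigma(mid) + sigma(x0));
}

Replacing the linear interpolation and the [1,2,1] mask with higher order polynomial models of the input and output signals is one way to read the generalization suggested above.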
Entry: Conjugate gradient Date: Tue Nov 4 22:02:38 EST 2014 Basic idea: - 1-D minimum is easy - compute 1-D minimum across n orthogonal directions - here orthogonal is actually conjugate = orthogonal in metric (ellipsoid) defined by convex optimization Formulated as convex optimization problem -> this is why it only works with symmetric, positive definite matrices. Entry: Coming together Date: Fri Nov 7 04:35:42 EST 2014 Some weird synchronicity Looking into circuit theory lately. From a restless night of meandering ended up at John Baez' talks about network theory. Forth and dup/swap/drop in John's diagrams[1] at around 14:00 Got a little jolt after seeing the word 'Fourth' on the slide. Then it goes on explaining the "xor trick" for swapping two variables, only using negation instead of xor. Then it gets weird: bends as relations? How would that work in Forth? And how do classical mechanics and nonlinear circuit theory relate? [1] http://math.ucr.edu/home/baez/networks_oxford/#I Entry: Symplectic Integration Date: Sun Dec 14 08:57:31 EST 2014 Type: tex THIS NEEDS EDIT So what does it all mean? These are notes from/for an email discussion with Kragen[1]. Before going into my notes again: the point of SI as I remember is to preserve some symmetry / invariants when translating a continuous system to a discrete one. The way I understand it is that symplectic structure is a generalization of the idea of complex numbers. See $J^2 = -I$ in [3]. Complex numbers are intimately related to 2D rotations / circles. A linear conservative system is a product of circles (torus). Glossing over details: The time evolution of a system can be expressed as the exponential expressed in terms of the (generalized) complex unit, a hamiltonian and time. Essentially this is a generalization of the 1-D case: $y(t) = y_0 e(-j w t)$, where $w$ is related to the square root of the energy of the system (= hamiltonian). Now if you move from linear to nonlinear, the symplectic structure will be only local - i.e. a property of derivatives. This is where symplectic manifolds come in. So symplectic integration is a way to preserve that local symplectic structure in the hope to keep closer to the real solution when you do discretely integrate a nonlinear equation through successive linearization. I'd like to find a better way to relate all these things. I find it very intriguing but don't have a full picture in my head.. EDIT: To check - Poisson bracket - Canonical transformations % [1] http://twitter.com/kragen % [3] http://zwizwa.be/-/math/20130421-101744 Entry: LU, Cholesky and Hamiltonians Date: Tue Dec 16 09:48:00 EST 2014 Type: tex Since the LU decomposition of a simple example symplectic euler system[1] has a lot of structure, can this be translated to structure of matrices? The structure is $(I+E^T)(I-E)$, where $E$ is nilpotent $E^2=0$. % [1] entry://20130421-101744 Entry: Hamiltonian matrix Date: Tue Dec 16 10:26:44 EST 2014 [1] http://en.wikipedia.org/wiki/Hamiltonian_matrix Entry: Simplest symplectic manifolds Date: Tue Dec 16 10:33:40 EST 2014 Type: tex Let's take a constructive approach to get a better idea of how the spaces (manifolds) and Hamiltonians $H$ relate. The simplest space would be a 2D plane $(q,p)$, where $q$ denotes position and $p$ denotes momentum. Ignoring constants, this is the phase space of a free point particle traveling on a line $H(q,p)=p^2$, and the harmonic oscillator $H(q,p)= q^2 + p^2$. In these simple contexts, what does $dp \wedge dq = 0$ mean? 
A more interesting question might be: why is the symplectic Euler discretization stable? And why is it slightly skewed? Entry: computation is topology Date: Mon Dec 29 22:02:27 EST 2014 Not sure how to make that explicit, but if you look even at just the practicals: most effort in electronics and programming is about wiring things together. Somehow, coming from a DSP background, I got misled by the whole "FLOPS" approach: how many arithmetic operations can the thing do per second. Entry: PLL stability Date: Sat Mar 7 19:13:29 EST 2015 Trying to wrap my head around stability issues in a PLL design. The question that comes to mind is: how can it ever synchronize if the initial phase variation is so large that the filter can't "catch" it? I.e. I understand that a PLL is stable once it is locked, as it is close to its linear approximation, but how can it lock in the first place? How good does the initial guess need to be? It seems that the simplest PLL has 2 design variables: low pass cutoff and loop gain. My case has another one: frequency offset. It seems that offset can be a problem if it gets too far away from the locking range. From [1] it seems all filters are integrators. [1] http://cp.literature.agilent.com/litweb/pdf/ads2008/dgpll/ads2008/PLL_DesignGuide_Reference.html [2] https://electronics.stackexchange.com/questions/86059/what-is-phase-lock-looppll-lock-range-capture-range Entry: DPW Date: Mon May 25 11:51:47 CEST 2015 Google for "Differentiated Parabolic Waveform": 1/p * diff((p-1/2)^2) There is an implementation in RAI: look for saw-d2 in https://github.com/zwizwa/rai/blob/master/synth-lib.rkt Entry: deep learning Date: Sun Jul 5 22:57:36 EDT 2015 [1] https://charlesmartin14.wordpress.com/2015/03/25/why-does-deep-learning-work/ [2] https://charlesmartin14.wordpress.com/2015/04/01/why-deep-learning-works-ii-the-renormalization-group/e-t [3] https://www.quantamagazine.org/20141204-a-common-logic-to-seeing-cats-and-cosmos/ Entry: Commutation Date: Sun Aug 2 18:56:32 EDT 2015 See in Haskell, meta/siso: e.g. commutation between representation and list: A list of representations of e :: l (r e) is not the same as a representation of a list of e :: r (l e). Entry: category theory / algebraic topology Date: Wed Sep 16 20:51:26 EDT 2015 https://lobste.rs/s/6gfcsd/category_theory_a_gentle_introduction Entry: Complex derivative hack Date: Sun Oct 4 16:57:50 EDT 2015 https://codewords.recurse.com/issues/four/hack-the-derivative Entry: Bisecting intervals with gaps Date: Fri Nov 20 12:09:18 EST 2015 Find the maximum value of a strictly increasing sequence on a circle (the last element in a circular buffer). This works by successive bisection refinement, picking an initial interval by arbitrarily picking a point: (M > L) ? {M,R} : {L,M}

3 4 5 6 7 0 1 2 3
L       M       R

How to adjust this to be able to bisect ranges with gaps in them?

0 1 2 3 4 5 6 7 8 9   index
3 4 5 . 6 7 0 1 2 3   value
L                 R

The locations of the gaps are not known beforehand. Reading is expensive. What to do when index 3 is read? Guess: - switch to decision rule: M >= L ?
Entry: CT and declarative programming
Date: Mon Jan 11 16:00:33 CET 2016

http://bartoszmilewski.com/2015/04/15/category-theory-and-declarative-programming/

Entry: Hamiltonian signals
Date: Sat Jan 23 20:27:19 EST 2016

- Chaotic
- Symplectic update through first-order ODEs derived from a Hamiltonian
- Also use the Hamiltonian for re-normalization using the constant energy constraint
- For oscillators: tune simulation parameters based on a frequency estimator / regulator.

How to generate interesting systems?

Swinging Atwood's machine:
https://www.youtube.com/watch?v=3ajr5Fb2i1g
https://en.wikipedia.org/wiki/Swinging_Atwood's_machine

The pendulum's reactive centrifugal force counteracts the counterweight's weight. The force tends to get quite high, corresponding to the high angular velocity (slingshot), and a smooth "bounce" on the counterweight. The ODEs have singularities for points passing through r=0. I wonder if it is possible to modify the geometry to remove those singularities, e.g. instead of the 1/r^2 in the Hamiltonian, use 1/(x + r^2).

Phase space is 4D, but due to energy preservation, the points are constrained to a 3D manifold.

For the 3-body problem, as e.g. in these two fixed masses: https://www.youtube.com/watch?v=uN3IwAD1hdQ it is possible to guarantee boundedness based on total energy. Because the singularities are points, the chance that a randomly chosen initial condition passes through a singularity is zero. However, due to finite approximation it is not. Regularization is probably a good idea. ( Regularize as a bounce? Or smooth out the Hamiltonian. ) For integrated simulation, the singularity actually doesn't matter as the infinite force only happens for an instant. What actually happens when a point passes through the singularity?

Large-scale n-body: https://www.youtube.com/watch?v=byI9yhITDsM

Entry: Billiard ball chaos
Date: Sat Jan 23 21:40:56 EST 2016

See [1]. Chaos is possible if the triangle (polygon?) does not tile the plane (or tiles the plane chaotically?). The geometry generated by a non-tessellating polygon is "piece-wise curved". What's a proper term for that? Some other related [2][3].

[1] https://www.youtube.com/watch?v=3ajr5Fb2i1g
[2] https://plus.maths.org/content/chaos-billiard-table
[3] https://en.wikipedia.org/wiki/Ergodicity

Entry: Colliding point masses.
Date: Sat Jan 23 22:58:12 EST 2016

What happens in the equations of motion if two point masses collide? They will move past each other in a straight line, and will have only an infinitely small moment of infinite force. How to represent that without the singularity? A way to look at this is to parameterize the phase space trajectory in terms of a regularization of the potential energy:

  1 / (a + r^2)

and then take the limit of that trajectory as a->0. ( Can tricks be used as in infinitesimal analysis? )

Related? http://mathoverflow.net/questions/97391/how-to-deal-with-the-singular-reduction-of-the-hamiltonian-n-body-problem

Wait..

Entry: base64
Date: Thu Feb 25 23:39:09 EST 2016

6 bits per character (2*3) vs. 8 bits per byte (2*2*2); multiplying by 4 and 3 respectively gives 24 bits either way, so 4 characters encode 3 bytes. How is padding handled? https://en.wikipedia.org/wiki/Base64#Padding

Entry: Cascaded integrator-comb filter
Date: Wed May 18 18:41:08 EDT 2016

Replaces the standard interpolation/reconstruction filter with integrator/differentiator (comb) pairs, effectively creating a moving average (rectangular) FIR filter.

1-stage: rectangular
2-stage: triangular
3-stage: 2nd order bump
...

[1] https://en.wikipedia.org/wiki/Cascaded_integrator%E2%80%93comb_filter
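A minimal single-stage sketch of the above (my own illustration, not taken from the linked page): one integrator running at the input rate and one comb (difference over one output period) at the decimated rate give a length-R moving sum, i.e. the rectangular response; cascading stages gives the triangular and higher-order shapes.

/* Single-stage CIC decimator sketch, decimation factor R.
   The integrator runs at the input rate, the comb at the output rate.
   The output is the moving sum of the last R inputs (rectangular FIR).
   Unsigned wrap-around of the accumulator is harmless as long as
   R * max|x| fits the modular arithmetic. */
#include <stdint.h>
#include <stdio.h>

#define R 4  /* decimation factor */

struct cic1 { uint32_t acc, prev; };

/* Consume R input samples, produce one output sample. */
static uint32_t cic1_update(struct cic1 *s, const uint32_t *x) {
    for (int i = 0; i < R; i++) s->acc += x[i];   /* integrator at input rate */
    uint32_t y = s->acc - s->prev;                /* comb at output rate      */
    s->prev = s->acc;
    return y;                                     /* moving sum of R inputs   */
}

int main(void) {
    struct cic1 s = {0, 0};
    uint32_t x[8] = {1, 1, 1, 1, 2, 2, 2, 2};
    for (int i = 0; i < 8; i += R)
        printf("%u\n", cic1_update(&s, x + i));   /* prints 4, then 8 */
    return 0;
}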
Entry: Linear Logic - Session Types
Date: Sat Aug 6 22:16:09 EDT 2016

http://homepages.inf.ed.ac.uk/wadler/topics/linear-logic.html
https://www.youtube.com/watch?v=IOiZatlZtGU (Q&A)
https://www.infoq.com/presentations/category-theory-propositions-principle

CCC: we can treat functions as data

Entry: What is it like to understand advanced mathematics?
Date: Sun Aug 14 20:17:50 EDT 2016

https://www.quora.com/What-is-it-like-to-understand-advanced-mathematics/answers/873950?srid=p6KQ&share=1

Entry: In-place matrix transposition
Date: Mon Aug 29 12:49:44 EDT 2016

https://en.wikipedia.org/wiki/In-place_matrix_transposition

Entry: Packetized TDM as coordinate transformations
Date: Mon Aug 29 23:46:13 EDT 2016

For lack of a better name, packetized TDM meaning: to transmit a number of channels in block form, e.g. N samples from channel 0, followed by N from channel 1. Just a practical format I ran into, but this is to hint at the general idea of chunking, which tends to happen in nested form and can get confusing. Therefore it makes sense to look at TDM as coordinate transformations.

An important thing to realize is that this form of TDM can be seen as a 2-step process: the introduction of an extra dimension followed by the removal of another:

1) packetize: splitting t into a 2-dimensional (p,i) coordinate
2) multiplex: flattening (c,p) into a 1-dimensional p' coordinate

  (c,t)    // channel (finite), time (infinite)
  <->
  (c,p,i)  // channel (finite), packet (infinite), index (finite)
  <->
  (p',i)   // flattened packet+channel (infinite), index (finite)

  p  = t div N
  i  = t mod N
  p' = p * N + i
  c  = p mod N
  t  = p * N + i
  p  = (p' - i) div N

Entry: Category Theory lectures - Bartosz Milewski
Date: Fri Sep 2 19:12:51 EDT 2016

1.1 https://www.youtube.com/watch?v=I8LbkfSSR58
Compares Kmett's way of generating libraries from CT to C++ TMP deriving from functional programming ideas: implementation in one language (concrete) derived from ideas in the higher-level mathematical structure.

1.2 https://www.youtube.com/watch?v=p54Hd7AmVFU
Everything about a category is encoded in the composition table of the morphisms.

2.1 https://www.youtube.com/watch?v=O2lZkr-aAqk
2.2 https://www.youtube.com/watch?v=NcT7CGPICzo

Entry: Running mean/variance
Date: Mon Jan 2 18:02:56 EST 2017

  2s^2 = (a - m)^2 + (b - m)^2 = a^2 + b^2 - 2m^2,  with 2m = a + b

General case:

  s^2 = (1/n) sum x_i^2 - ((1/n) sum x_i)^2

(A small C sketch of this follows below, after the Derivatives entry.)

Entry: Merging
Date: Tue Apr 11 21:35:39 EDT 2017

Problem:
- Given two time series, match them up, knowing they are not exactly aligned.

Entry: woven balls
Date: Fri May 19 00:31:16 CEST 2017

Cover a sphere with a random path. What's the pattern behind a woven ball, and why does it look random but also spherical? Is it a variant of a dodecahedron? ( Context: parameter exploration in high-dimensional spaces )

http://mathtourist.blogspot.be/2013/06/pentagons-and-game-balls.html
http://www.instructables.com/id/10-strip-woven-ball/

Seems these are multiple strips, not a single one.

Entry: Derivatives
Date: Fri Sep 8 22:21:28 EDT 2017

The derivative is an operator that maps a function to its (pointwise) linear approximation. I'm looking at applying the chain rule to functions over trees. What does it mean for a function on a tree to be linear? Linearity in algebraic terms is: f(a x) = a f(x) and f(x + y) = f(x) + f(y). What does multiplication mean on trees? Are these zippers the only interpretation that works, or is there a different way that works with differences of trees?
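Picking up the running mean/variance entry just above: a minimal one-pass sketch (my own illustration) that keeps running sums of x and x^2. Note that this textbook form can lose precision when the mean is large compared to the spread; Welford's update is the numerically safer variant.

/* One-pass mean/variance from running sums, as in the formula above:
   s^2 = (1/n) sum x_i^2 - ((1/n) sum x_i)^2
   Caveat: subtracting two large, nearly equal numbers loses precision. */
#include <stdio.h>

struct running { double n, sum, sum2; };

static void running_add(struct running *r, double x) {
    r->n += 1; r->sum += x; r->sum2 += x * x;
}
static double running_mean(const struct running *r) { return r->sum / r->n; }
static double running_var(const struct running *r) {
    double m = running_mean(r);
    return r->sum2 / r->n - m * m;
}

int main(void) {
    struct running r = {0, 0, 0};
    double xs[] = {1, 2, 3, 4};
    for (int i = 0; i < 4; i++) running_add(&r, xs[i]);
    printf("mean=%g var=%g\n", running_mean(&r), running_var(&r)); /* 2.5, 1.25 */
    return 0;
}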
Entry: On intuition
Date: Fri Sep 8 22:24:11 EDT 2017

I've got good intuition (my non-symbolic mind sees hints of things), and I can do the mechanical symbolic manipulation if I have a starting point. However, making the bridge is sometimes really difficult, because the "language" spoken by the intuition is really out there and is _very_ tolerant of missing details. Maybe that is what intelligence is: seeing patterns without seeing the entire mechanics?

Entry: duality
Date: Sat Nov 18 01:50:35 EST 2017

De Morgan, and categorical duality. And/or vs. products/sums.

https://en.wikipedia.org/wiki/Coproduct

Entry: De Morgan vs. product/coproduct
Date: Thu Dec 28 18:12:55 EST 2017

De Morgan's duality is straightforward. When expressed in ^,v,~:

  ~(~A v ~B) = A ^ B
  ~(~A ^ ~B) = A v B

The duality operation is to flip all polarities, and exchange v and ^.

Why is product/sum duality so convoluted in functional programming notation? A good starting point for looking at this is category theory's products and coproducts, e.g. from here: https://en.wikipedia.org/wiki/Coproduct

Why are products and coproducts "dramatically different"? In general, why are categorical duals so different?

Entry: duality: product and coproduct
Date: Fri Dec 29 11:49:41 EST 2017

Duality "same-ness". Since duality observations are used a lot, and products and coproducts are the perfect example of this due to their relevance in algebraic types, it might be a good starting point to gain some intuition.

Start with sets:
p: Cartesian product
c: disjoint union

Those definitions are clear. What is not clear is the sameness. Simplified to Boolean logic (De Morgan's laws) this becomes clear, but even for algebraic types, the difference between sums and products is stark. So why is this same-ness at all?

https://en.wikipedia.org/wiki/Product_(category_theory)
https://en.wikipedia.org/wiki/Coproduct

Going from the wikipedia pages. W.r.t. duality: an important distinction is about arrows arriving at and leaving from objects. That is likely where the "dramatic difference" comes from. Let's juxtapose:

  Given f1: Y->X1, f2: Y->X2, there exists f: Y->(X1,X2)
  Given f1: X1->Y, f2: X2->Y, there exists f: (X1+X2)->Y

Here, the tupling is completely obvious, but the sum is not. Wait a minute: is this about the existence of f1 or f2 in the case of the coproduct? No. Both f1 and f2 exist. Also, X1 and X2 live in the same category. Otherwise diagrams would be meaningless. So the coproduct is closed.

That last one is important. In the category of sets, both Cartesian product and disjoint union are closed binary operations connecting two (possibly distinct) objects to a third one. The existence of products and sums is a property of the category. Given the "constructors" for product and sum, the existence of projections, injections and composite arrows is assumed. So really, it doesn't say much. It only talks about how the components relate to the composite.

Taking a->b to mean logical implication (a' v b), this duality is the same as that introduced by negation in Boolean logic:

  y->x1,   y->x2    =>  y -> (x1 ^ x2)
  x1'->y', x2'->y'  =>  (x1' v x2') -> y'

Write an example using sets, starting with the objects X1={1,2}, X2={2,3}:

  Product = {(1,2),(1,3),(2,2),(2,3)}
  Union   = {1,2,3}  (strictly, the disjoint union keeps both copies of 2 by tagging: {1_L, 2_L, 2_R, 3_R})

These are really different animals. An illustration of how many concrete details are thrown away in CT.

Entry: Agda, coproducts
Date: Fri Dec 29 13:48:44 EST 2017

Coming back to proofs in Agda. Why does it feel so asymmetric to prove things using case analysis (disjunction) as opposed to tupling (conjunction)?
I still don't have a good answer as to why disjunction and conjunction feel so different, while somehow "being made of the same stuff".

Entry: sparsity
Date: Fri Jan 5 02:42:34 EST 2018

https://lists.samba.org/archive/rsync/2014-June/029495.html

Intersection/union triggered an association: this is S-D encoding. Basic idea: 2% of neurons are on. This is a normalization. That's exactly what reminds me of the SD stuff: logical operations might have some other algebraic meaning if they've got enough decorrelation and are far enough from saturation (full on or off). Spectra, colors, distributions.

Sparse distributed representations (SDR): https://www.youtube.com/watch?v=6ufPpZDmPKA

Overlap "means something", because random overlaps are very rare. Density is low, such that the chance of overlap becomes low, which is a key property (overlap chance is approx s^2 for small s). Conclusions to be made in case s is small:
- existence of overlap is not likely due to chance
- it's ok to match a subset to test for a pattern
- noise can be mitigated

Btw, matches on overlap sets are Bloom filters.

Entry: The important part is concepts
Date: Mon Feb 12 23:51:25 CET 2018

Without those, there are no connections. Put differently, finding a good way to look at something is most of the work. Once established, connections often become obvious.

Entry: Sums and Products
Date: Thu Mar 15 17:14:33 EDT 2018

Back to the duality thing. While products feel natural (tuples), sums are harder to represent, i.e. they need tags. Where does the asymmetry come from? Is this related to code vs. data? Products are different locations in data memory, and sums are different locations in code memory (i.e. branch based on tag).

Entry: Sums and Products..
Date: Sun May 27 15:26:58 EDT 2018

Maybe this is just syntax? if' will do case analysis that looks like it is a product. The thing is maybe that different cases have different matching, and that you would use sums for analysis and products for construction? I.e. they are dual that way? I'm not really sure where this is going or if it even makes any sense..

Entry: Brouwer
Date: Fri Jun 15 02:23:15 EDT 2018

Homotopy Type Theory introduction lecture -- Robert Harper
https://www.youtube.com/watch?v=ISq1xi-mGk8&list=PLo61mJSqK5cXHsnYUAXNea31BCj1u-zln

Brouwer: proofs are mathematical objects. A different idea than just formal proofs (Goedel's theorem is about formal proof being strictly limiting, compared to the proof "outside"). Difference between the HoTT way of thinking about proofs and just proofs in set theory encoded in Coq. Synthetic (abstract) vs. Analytic (nuts & bolts, Bourbaki tedious).

Lecture 2: Entailment (Logical consequence)
Mapping (entailment) becomes internalized (implication). Entailment and implication are not the same thing:
https://philosophy.stackexchange.com/questions/12816/difference-between-implication-conditional-and-logical-entailment

Entry: Intuitionistic logic
Date: Sat Jun 16 22:22:51 EDT 2018

Before getting into HoTT, it's probably best to do a little more constructive logic and category theory first.
https://en.wikipedia.org/wiki/Intuitionistic_logic

Entry: correctness proofs
Date: Thu Jun 21 11:54:16 EDT 2018

It's all about case analysis, e.g. sums, which cause multiple code paths. Products are somewhat automatically handled by data structures. (or free? I don't dare to use that word any more).
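To make that code/data observation concrete, a tiny C sketch (my own illustration, not from any of the linked material): a product is a struct whose fields sit side by side in data, a sum is a tagged union whose consumer has to branch, splitting the code path.

#include <stdio.h>

/* Product: both components are always present; access is field selection. */
struct pair { int x; double y; };

/* Sum: exactly one component is present; the tag says which one,
   and every consumer has to do case analysis on it. */
struct either {
    enum { LEFT, RIGHT } tag;
    union { int left; double right; } as;
};

static void show(struct either e) {
    switch (e.tag) {                 /* the "multiple code paths" */
    case LEFT:  printf("left  %d\n", e.as.left);  break;
    case RIGHT: printf("right %g\n", e.as.right); break;
    }
}

int main(void) {
    struct pair p = { 1, 2.5 };
    printf("pair %d %g\n", p.x, p.y);            /* no branching needed */
    show((struct either){ .tag = LEFT,  .as = { .left  = 1   } });
    show((struct either){ .tag = RIGHT, .as = { .right = 2.5 } });
    return 0;
}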
This ties into some itching intuition I have that while products and sums are essentially the same (duals), they are very different in a practical sense when it comes down to proof and program structure. Why is this? Is there some insight to gain here, or is this fever dream stuff?

Entry: Constructive analysis
Date: Tue Jun 26 15:42:16 EDT 2018

https://en.wikipedia.org/wiki/Constructive_analysis

IVT: rephrased as an approximation algorithm, essentially. This does seem like a more appropriate way to talk about things when computers are involved.

Entry: intuitionistic type theory
Date: Fri Jul 5 18:26:55 EDT 2018

Going over the wikipedia page https://en.wikipedia.org/wiki/Intuitionistic_type_theory

What I miss in that page is a bit more of a direction. Clearly I'm expecting something other than what is exposed... I get the 3 primitive types: Bottom, Unit and Bool. But the type constructors are a little strange to me, especially why the Sigma is defined in terms of products. Also, it says that dependent types model predicate logic. How?

Entry: Kalman filter
Date: Fri Sep 14 21:39:33 EDT 2018

Objective: reconstruct the state sequence given measurements and system parameters.
https://en.wikipedia.org/wiki/Kalman_filter

Entry: Gravity peak detection
Date: Tue Sep 18 12:05:48 EDT 2018

I was thinking about how not to add arbitrary segmentation algorithms for quick and dirty peak detection. What about using a form of "gravity", where peaks will first coalesce, and then a peak detection can be used? The problem is that peaks will move towards each other, so maybe not what I'm looking for. I guess that this is just always going to be a little ad-hoc, where there is a combination of numeric clustering, followed by a discrete peak selection phase.

Entry: clustering is not well defined
Date: Mon Nov 26 14:36:03 EST 2018

https://twitter.com/MarkKriegsman/status/1067125849315524618
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set

Links into the intuition to keep things continuous, or at least "density-encoded".

Entry: Dobble Beach
Date: Sat Jan 12 16:44:06 CET 2019

6 images per card, 31 (30?) cards, and any 2 cards share exactly 1 image. How is this possible?

EDIT: discrete projective geometry

Entry: Universal coalgebra: a theory of systems
Date: Fri Feb 22 12:13:10 EST 2019

https://www.sciencedirect.com/science/article/pii/S0304397500000566
https://en.wikipedia.org/wiki/Coinduction
https://en.wikipedia.org/wiki/Corecursion

Entry: Transaction matching
Date: Thu Feb 28 20:12:17 EST 2019

Link transactions between accounts, but allow for a little slack in date. How to express this problem precisely?

Say M is the main account, and S is a sub-account that needs to be fully explained. There will be a set of solutions to this.
- If the set is empty, something is wrong with the data.
- If there are multiple solutions, pick one based on:
  - Metadata distance? This might be hard
  - Date distance
  - Random association
  - Combination

Expressed as a join, how can the uniqueness characteristic be expressed? It is a global property, so I don't really see how.

A sequential greedy algorithm:
- for each transaction in S, find a transaction in M that has the same amount by sequentially searching pre/post in time, and remove the corresponding transaction from M

This seems to be easy enough to implement. Imperative in-place update is easiest. The main ledger should be ordered to make searching easy.
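A minimal sketch of that greedy matcher (my own code; the struct layout and the plain linear scan standing in for the ordered pre/post search are illustration-only assumptions): for each sub-account transaction, take the unused main-ledger transaction with the same amount and the smallest date distance, and mark it used.

/* Greedy matcher sketch.  Dates are abstract integers, amounts in cents. */
#include <stdio.h>
#include <stdlib.h>

struct txn { int date; long amount_cents; int used; };

/* Return the index in m[0..nm) of an unused transaction with the same
   amount and the smallest |date difference| within max_slack, or -1. */
static int match_one(const struct txn *s, struct txn *m, int nm, int max_slack) {
    int best = -1, best_d = max_slack + 1;
    for (int i = 0; i < nm; i++) {
        if (m[i].used || m[i].amount_cents != s->amount_cents) continue;
        int d = abs(m[i].date - s->date);
        if (d < best_d) { best_d = d; best = i; }
    }
    if (best >= 0) m[best].used = 1;     /* in-place removal from M */
    return best;
}

int main(void) {
    struct txn main_ledger[] = {
        {100, -500, 0}, {101, 2500, 0}, {105, 2500, 0}, {110, -500, 0}
    };
    struct txn sub_ledger[] = { {103, 2500, 0}, {109, -500, 0} };
    for (int i = 0; i < 2; i++) {
        int j = match_one(&sub_ledger[i], main_ledger, 4, 7);
        printf("sub %d -> main %d\n", i, j);   /* -> main 1, main 3 */
    }
    return 0;
}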
More general problems: https://brilliant.org/wiki/matching-algorithms/

Entry: Bezier Curves
Date: Tue Mar 26 22:40:55 EDT 2019

Simpler to understand as iterated linear interpolation:
https://en.wikipedia.org/wiki/De_Casteljau%27s_algorithm

Entry: Embedding manifolds
Date: Thu Apr 4 14:44:58 EDT 2019

Application domain: I have a system where the central measurement is the phase difference between two oscillators, and I need to continuously track this difference, keeping track of "wraparound" based on a well-known initial point. The main insight is to keep the representation of phase "embedded" in the representation of a quadrature signal, as opposed to flattening it out into a single number that cannot represent the idea of rollover in a continuous way.

I think there is a generic field here: e.g. representation of Lie groups as sets of matrices. Can this be generalized? I.e. there are more cases where discontinuities and ambiguities are a consequence of the representation, and not intrinsic to the problem. Riemann surfaces come to mind as well.

http://mathworld.wolfram.com/RiemannSurface.html
https://en.wikipedia.org/wiki/Riemann_surface

Entry: autodiff and recursive eval
Date: Thu Apr 11 22:33:55 EDT 2019

Exponentials can be evaluated by update equations. Does an autodiff form of such a program look different from a computed form?

Entry: Mapping fractions to integers
Date: Mon May 6 21:27:37 EDT 2019

A straightforward way would be to use the prime factorization, and "interleave" them. If p[n] is the sequence of primes, split it into odd/even and use them to build a new number.

Entry: The importance of reframing
Date: Sun May 12 17:59:51 EDT 2019

I wish somebody had explained this to me earlier in my life: easy problems are those where it is always clear what the next step would be, and where each individual step leads to more clarity. The sequential nature of discovery is something very hard to appreciate if you've not experienced it. Hard problems are those where it is not clear what the next step would be. It often requires meta-insight: why is there no insight coming? How can you think of yourself as a tool that is not used properly?

Entry: Different sync encoding
Date: Mon Aug 26 23:16:58 EDT 2019

Pick one special byte to act as packet boundary. This leaves 255 values. An n-in-m encoding would satisfy 255^m >= 256^n. Basically, encode in base 255 instead of base 256. The MIDI 7-in-8 encoding is a special case of that, leaving 128 control characters (e.g. with the high bit set).

Is there a natural chunk size to do that? Say that we only use 64 codes. This is 0.5 bits per byte that are missing. Can this be 15 in 16? Or can it be 255 in 256? Too tired and wacky-brained to do this intuitive shit..

At the bit level, bit stuffing can be used. This can also be done in UART bytes. If the last bit is 0, this was a 0xFF byte, and the entire stream should be shifted by one. That would be more efficient, but requires a more complex decoder.

Entry: CRC
Date: Wed Oct 23 09:20:49 EDT 2019

EDIT: Principle is right, but the implementation details require some additional assumptions. See end.

CRC is modulo reduction using a generator polynomial, i.e. long division. However, how does a bit stream actually map to the input polynomial? And why does accumulator chaining work? Let's try to perform a division and fill in the gaps.

x^3 + x + 1 is 1011. Dividing 10101010 by 1011:

  10101010 | 1011
  10110000 |-------
  -------- | 10011
     11010
     10110
     -----
      1100
      1011
      ----
       111

Only the remainder is useful.
How to turn this into something that takes a single bit at a time? Note that only the first 4 bits of the dividend are necessary at any step. That is the key to turning this into a state machine.

To visualize, spell out the zero subtractions explicitly:

  10101010 | 1011
  10110000 |-------
  -------- | 10011
   0011010
   0000000
   -------
    011010
    000000
    ------
     11010
     10110
     -----
      1100
      1011
      ----
       111

Then mask out the bits that do not need to be known at each step, feeding in a bit when it is necessary:

  1010.... | 1011
  10110000 |-------
  --------
   0011...
   0000000
   -------
    0110..
    000000
    ------
     1101.
     10110
     -----
      1100
      1011
      ----
       111

The trick here is that there is no carry! This allows us to carry out a step without knowing the downstream bits. This would not work so smoothly with an addition operation that carries over into lower significant bits.

To restore symmetry, add zero bits:

  ........
  00000000
  --------
  1.......
  00000000
  --------
  10......
  00000000
  --------
  101.....
  00000000
  --------
  1010.... | 1011
  10110000 |-------
  -------- | 000010011
   0011...
   0000000
   -------
    0110..
    000000
    ------
     1101.
     10110
     -----
      1100
      1011
      ----
       111

This property allows us to just keep rolling the algorithm. So the "accumulator" is essentially the first 4 digits of the intermediate, which will end up as the full modulo at the end.

This all seems plausible. So let's implement it and see if it holds up. There are essentially only two operations:
- division reduction step
- shift in a new bit

The polynomial is typically represented as bits without the top bit, which is always 1. In our case, the poly residue is 011. Suppose the accumulator is 1010. The algorithm step is then:
- store the top bit
- shift in the new bit
- xor with the residue if the top bit was 1

Now I just need to know what the bit order convention is. It seems that shifting in MSB first makes most sense. Let's check this with a crc8 plucked from the internet.

EDIT: This mentions padding at the end. https://en.wikipedia.org/wiki/Cyclic_redundancy_check Still not quite clear. I'm doing something else. https://rosettacode.org/wiki/CRC-32

I think I have the definition wrong. From the wikipedia page: "This is first padded with zeros corresponding to the bit length n of the CRC. This is done so that the resulting code word is in systematic form." https://en.wikipedia.org/wiki/Systematic_code "A systematic code is any error-correcting code in which the input data is embedded in the encoded output." I don't understand this. But what I am doing is padding with zeros at the start?

So I give up. Interpret and generalize a working implementation instead:
https://stackoverflow.com/questions/21001659/crc32-algorithm-implementation-in-c-without-a-look-up-table-and-with-a-public-li

So CRC32 and CRC32B are not the same:
https://stackoverflow.com/questions/15861058/what-is-the-difference-between-crc32-and-crc32b
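A minimal check of the bit-at-a-time step described above (my own sketch; MSB first, no appended zero padding, so it computes the plain message remainder that the long division gives): 10101010 mod 1011 should come out as 111. A real CRC additionally feeds n zero bits after the message (the padding the wiki page mentions), and standards add initial value, final xor and bit-reflection conventions on top of that.

/* Bitwise GF(2) remainder, MSB first, no appended zero padding:
   computes message mod poly, matching the long division above.
   reg holds the n low bits (n = degree of poly); residue = poly
   without its top bit (here poly 1011 -> residue 011, n = 3). */
#include <stdint.h>
#include <stdio.h>

static uint32_t gf2_mod(uint32_t msg, int msg_bits,
                        uint32_t residue, int n) {
    uint32_t reg = 0, mask = (1u << n) - 1;
    for (int i = msg_bits - 1; i >= 0; i--) {
        uint32_t top = (reg >> (n - 1)) & 1;          /* store the top bit   */
        reg = ((reg << 1) | ((msg >> i) & 1)) & mask; /* shift in a new bit  */
        if (top) reg ^= residue;                      /* reduce if it was 1  */
    }
    return reg;
}

int main(void) {
    /* 10101010 mod 1011 -> expect 111 (0x7) */
    printf("%x\n", gf2_mod(0xAA, 8, 0x3, 3));
    return 0;
}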
Entry: Financial transactions and density
Date: Sat Nov 16 21:40:10 EST 2019

Some posts in the past around 20120910. The main idea:
- predict future based on past
- set targets, given a margin

I'm still not quite sure how to get the trend out of the noise, but a key element seems to be to divide into categories (which is what gives an idea where the money is going in rough terms), and apply smoothing by filtering each category using a method that distributes reserves while maintaining thresholds. E.g. it is possible to take one paycheck and split it into one that is paid at the original time, and one that is paid 1/2 of the period later. Or in the limit, spread the amount over the entire "future" payment.

Same with expenses, but in the other direction: it is always allowed to pay a debt early. This is a reflection of the main property of money: you can always not spend money that you have available, but you can't spend money that you do not have available. That is the primary "shape" of the space we're dealing with. The management of money is about managing those hard deadlines. Now in practice, deadlines are not completely hard, but debt does tend to spiral. So it is best to leave a safety margin.

Ideally, any partition of expenses and income can be smoothed by performing time-distribution, in such a way that the resulting "curve" is as smooth as possible, and can be used as a trend to make decisions. So what this does is to give you smoothness, but it consumes "buffer space", i.e. there needs to be an amount of cash that can handle these fluctuations without running out.

Essentially, we want to extract the following information:
- Trend: plot smoothed income vs. expenses. This helps with planning future income, trading study vs. work.
- Buffer: what is the size of the variability, determining the need for reserves. Note that smoothing might influence this.
- Feedback: if trends for categories can be clearly identified, optimization effort to reduce expenses can be directed.

Compared to previous insights related to "density" ideas: it is now clear that apart from the need for smoothing to be directional, it seems fairly straightforward to implement. E.g. a piecewise linear curve per year, quarter, or month is probably already enough. Multiple scales could be used. Other basis functions could be used. What is clear is that the main problem is introducing "goldilocks" categories. Too many or too few and no information can be extracted.

Entry: protocol analyzer
Date: Sun Feb 16 19:10:37 EST 2020

Starting from 8-bit samples, how to find the logic levels? Assume the levels will not change, so start with constructing a histogram. This will be strongly bimodal. How to find those modes? The simplest approach would be to find the local maximum starting from above and below. Then pick the mid-point, and use a Schmitt trigger to convert it to digital.

Now, what about 3-level? E.g. I have an RS485 line that has an idle state at say 2/3 between high and low.

Alternative:
- logarithmic histogram
- average (center of gravity)
- split based on center of gravity
- linear center of gravity for each
- midpoint

Different approach: given a midpoint, compute the variance of the 2 sections. Minimize variance. This can be done exhaustively. This likely also works for 3-modal.

Another: hill-climbing from all initial conditions.

Another, 3-modal: find the first 2 modes using left/right local maxima, then bisect-hillclimb until a distinct 3rd pops up. The hill-climbing approaches require smoothing.

Another: fit a single sine wave to a smoothed histogram.

https://en.wikipedia.org/wiki/Cluster_analysis
https://en.wikipedia.org/wiki/K-means_clustering
https://en.wikipedia.org/wiki/Voronoi_diagram

Forgot: for a differential line, it is actually completely trivial. By analogy, setting the clusters manually also solves the problem.

Entry: The D-transform
Date: Sat May 9 15:05:26 EDT 2020

A while ago I ran into something I believe was called the D-transform. It is a reformulation of the Z-transform based on the transformation d=z-1. Recently this idea has come up in private conversation related to the field of audio processing and synthesis and analog modeling. The basic setting is to oversample a signal to remove digital sampling artefacts.
That part is quite straightforward, but doing so exposes numerical instability when signal operations are expressed in terms of differences between subsequent sample values, as is usually the case. Instead of keeping track of delays, it is possible to reformulate filter topologies by keeping track of differences, all the while assuming that signals do not change much from one sample to the next. In essence it is a way to use the available precision where it is best used.

Some references:
1. Finite Difference Equations - H. Levy and F. Lessman isbn://9780486672601
   At least I think that's quite similar to the '61 first edition I found in a used book store.
2. TODO: Some previous log entries.

Entry: SVF vs. biquad
Date: Mon May 18 08:15:54 EDT 2020

It has been a while since I implemented filters. I think this time I want to do it with a proper test setup. Basically I want to create some C code, and have it compute some analysis.

Entry: Gyrator, Symplectic
Date: Sun May 24 23:47:33 EDT 2020

https://en.wikipedia.org/wiki/Gyrator

Entry: Computing logarithm
Date: Sun Jun 7 00:40:22 EDT 2020

Implemented this in fixed point:
https://en.wikipedia.org/wiki/Logarithm#Feynman's_algorithm

But a polynomial is going to be much faster:
https://math.stackexchange.com/questions/61279/calculate-logarithms-by-hand

Entry: Low frequency non-linear model updates
Date: Sat Jun 13 21:11:23 EDT 2020

Running into this interesting problem: frequency calibration of a synth. First I attempted a fairly standard integrating regulator, which really didn't work out, mostly because of the delay on the frequency measurement. Then I realized the parameter I want to model (temperature) is much slower moving, so the idea was to split it up into two parts:
- a feedforward control based on a model
- a low-frequency update of the model (e.g. 1/sec) after initial explicit calibration.

The model will likely be a polynomial compensating for the nonlinearity in the system. This however is temperature-dependent.

Entry: The 'd' transform
Date: Thu Jun 18 22:40:28 EDT 2020

Ok, I remember now. I've been looking at the d=z-1 transformation to implement analog-style filters more directly at high sample rates. The problem is that there is a direct path in the d operator, so it can't be realized with feedback. So that is a dead end. But it is possible to find different topologies that have good numeric stability by not subtracting numbers after multiplication.

I made a derivation in this spirit starting from the SVF, but split up into three update stages: a loss stage and something resembling a ping-pong lossless symplectic integrator step. E.g. a product of 3 triangular matrices:

  p' = p  - b p
  p+ = p' + a q
  q+ = q  - a p+

which can be implemented sequentially using the code below. GCC compiles the following C representation (p, q, a, b being 32-bit fixed point) to the proper high-word MAC opcodes on ARM Cortex M.

#define I64(x) ((int64_t)(int32_t)(x))
p -= (I64(b) * I64(p)) >> 32;   /* loss stage             */
p += (I64(a) * I64(q)) >> 32;   /* symplectic ping-pong:  */
q -= (I64(a) * I64(p)) >> 32;   /* q sees the updated p   */

Now to make an arbitrary 2nd order section, add input, bypass and output coefficients. Which probably can be done sparsely as well.

Entry: Higher order sigma-delta
Date: Sat Jul 4 15:07:55 EDT 2020

The framework to analyse this is filtered error feedback: the output is regulated to the input, the error is computed and fed through a loop filter that computes the next output. The quantizer can be approximated by a linear segment (just a passthrough) and a noise source.
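A minimal first-order sketch of that error-feedback picture (my own illustration, multi-bit quantizer as in the PWM case): the loop "filter" is just the stored quantization error, added back before the next quantization, so the error spectrum gets a first-order (1 - z^-1) high-pass shaping.

/* First-order error feedback: add back the previous quantization error,
   quantize to the coarse (PWM) resolution, keep the new error.
   With constant input the coarse output duty-cycles so that its average
   equals the fine input. */
#include <stdint.h>
#include <stdio.h>

struct ef1 { uint32_t err; };

static uint32_t ef1_update(struct ef1 *s, uint32_t input, uint32_t out_shift) {
    uint32_t u = input + s->err;       /* input plus accumulated error */
    uint32_t y = u >> out_shift;       /* coarse output                */
    s->err = u - (y << out_shift);     /* new quantization error       */
    return y;
}

int main(void) {
    struct ef1 s = {0};
    for (int i = 0; i < 8; i++)                 /* input 96/256 of a step */
        printf("%u ", ef1_update(&s, 96, 8));   /* -> 0 0 1 0 0 1 0 1     */
    printf("\n");
    return 0;
}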
From what I read, instability issues only pop up in the 1-bit case, where the quantization noise is relatively large. I am in a situation where I have about 8-12 bits of output PWM precision. It is just a filtering problem, as the error is shaped by the inverse of the loop filter. This mentions that dithering is probably necessary to decorrelate: https://en.wikipedia.org/wiki/Noise_shaping

Entry: Third order
Date: Sun Jul 5 23:12:36 EDT 2020

Why does this work, i.e. with each integrator being fed an error signal, and not just chaining 3 integrators and feeding it in the front?

struct pdm3 {
    uint32_t s1;
    uint32_t s2;
    uint32_t s3;
};
static inline uint32_t pdm3_update(struct pdm3 *p, uint32_t input, uint32_t out_shift) {
    uint32_t out_q = p->s3 >> out_shift;   /* quantize: keep the high bits        */
    uint32_t out_a = out_q << out_shift;   /* quantized output at full precision  */
    p->s1 += input - out_a;                /* each integrator accumulates its own */
    p->s2 += p->s1 - out_a;                /* error w.r.t. the fed-back output    */
    p->s3 += p->s2 - out_a;
    return out_q;
}

The explanation is that an order n modulator is made by taking an order 1 modulator, and replacing the quantizer with an order n-1 modulator. But why is this so different?

Entry: smoothing transaction graph
Date: Thu Jul 23 21:05:36 EDT 2020

I've been looking for a while for a way to smooth a discontinuous transaction graph, e.g. a bank account. Maybe treating it as heat and then running the heat equation?

EDIT: So that is just Gaussian blur...
https://en.wikipedia.org/wiki/Weierstrass_transform
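A minimal sketch of the heat-equation idea (my own illustration): explicit diffusion steps on a 1-D series. Each step applies a small local [k, 1-2k, k] kernel; repeating it approaches a Gaussian blur, which is what the Weierstrass transform page formalizes.

/* Explicit heat-equation step: x[i] += k * (x[i-1] - 2*x[i] + x[i+1]).
   k <= 0.5 for stability; n steps approximate Gaussian smoothing with
   variance growing linearly in n.  Endpoints are held fixed here, so a
   little "mass" leaks out at the edges; good enough for a sketch. */
#include <stdio.h>

#define N 8

static void heat_step(double *x, double k) {
    double prev = x[0];                       /* old left neighbour, for in-place update */
    for (int i = 1; i < N - 1; i++) {
        double cur = x[i];
        x[i] += k * (prev - 2.0 * cur + x[i + 1]);
        prev = cur;
    }
}

int main(void) {
    double x[N] = {0, 0, 0, 1, 0, 0, 0, 0};   /* one lump: a single "transaction" */
    for (int step = 0; step < 20; step++)
        heat_step(x, 0.25);
    for (int i = 0; i < N; i++)
        printf("%.3f ", x[i]);
    printf("\n");
    return 0;
}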