Sun Feb 22 15:08:07 CET 2009

Dual Numbers and Automatic Differentiation

Extending the real numbers with e satisfying e^2 = 0 produces the
dual numbers, equiped with extended elementary operations.  Besides
being nilpotent, e behaves as a real number.

A dual mumber has two components a and b, and is denoted as

   a + b e

Given an analytic function f(x) over the real numbers, the extended
function over the dual numbers has the following remarkable

   f(x + e) = f(x) + f'(x) e

This can be derived starting from the power series expansion of f(x)
at x = 0, and lifting all operations to those of the dual numbers.
The entity e effectively selects the first derivative from the power

Practical significance: given an algorithm for computing f(x), an
algorithm for computing both f(x) and f'(x) can be automatically
derived by extending all elementary operations to their equivalent on
the dual numbers.

The "magic" is in that in general, single-expression function
derivatives _look_ much more complicated than the original function
expression.  This complication is artificial: it can be alleviated by
introducing proper decomposition.  This is exactly what AD does:
complexity doesn't really rise, but the amount of data that is tracked
in the computation is increased: there are more local dependencies in
the data flow, but global size doesn't explode.

Instead of using dual numbers, the same mechanism can equivalently
be explained as explicity tracking both function value and
deriviative, and updating them using the chain rule.

[1] http://en.wikipedia.org/wiki/Automatic_differentiation