The main goal of this lecture is to introduce the concept of automatic differentiation and provide a solid understanding of how it works. Here we will build a basic Python version of forward-mode automatic differentiation, which will be used to calculate the Taylor series expansion of the molecular quantum-mechanical kinetic energy operator. This result will be used in the following lectures to calculate molecular vibrational energy levels and spectra, which will be discussed in detail later (see Vibrational solutions for molecules).

First, let us briefly examine two alternative methods for computing derivatives, namely finite differences and symbolic differentiation, before delving into the details of automatic differentiation.

Finite differences

The traditional way of computing the derivative of a function is to use the finite-difference approach, for example, in its simplest form of a two-point stencil for the first derivative

$$ f'(x) = \lim_{h\rightarrow 0} \frac{f(x+h)-f(x-h)}{2h}, $$

and a three-point stencil for the second derivative

$$ f''(x) = \lim_{h\rightarrow 0} \frac{f(x+h)-2f(x)+f(x-h)}{h^2}. $$

Since we cannot divide by $h=0$, we choose $h$ to be sufficiently small, which introduces an error in the derivative's value called the truncation error. On the other hand, selecting $h$ to be too small also leads to a loss of accuracy, because numbers are stored in computer memory with a finite binary representation and are therefore rounded off at some point. The resulting error is called the round-off error. Hence, the finite-difference approach suffers from round-off errors when $h$ is too small and truncation errors when $h$ is too large.

While the overall error for a first derivative may be modest, when calculating higher-order derivatives or feeding first derivatives into further operations (as is common in machine learning), the error propagates and grows with the number of operations. It is in principle possible to reformulate the finite-difference rule with stencils that use more points to reduce the truncation error (see, e.g., Wiki: Finite Difference Coefficient), as sketched below, but the computational cost rises with each reduction in the error.
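As an illustration of such a reformulation, the following sketch implements the standard fourth-order accurate five-point central-difference stencil for the first derivative (the helper name 'df1_4th' and the test values are chosen here purely for illustration and are not part of the original material):

import numpy as np

def df1_4th(f, x, h):
    """First derivative from a five-point central stencil (fourth-order accurate)."""
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)

# The truncation error scales as h**4 instead of h**2, at the cost of two
# extra function evaluations per derivative.
print(abs(df1_4th(np.sin, 1.0, 1e-2) - np.cos(1.0)))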

Let's perform a numerical test of the accuracy of finite differences by building a Taylor series expansion of a simple univariate function. Using the two-point formula above, it is straightforward to implement the first derivative.

def df1(f, x, h):
    """First derivative"""
    return 0.5 * (f(x+h) - f(x-h)) / h
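As a quick check of the truncation/round-off trade-off discussed above, one can evaluate 'df1' for 'sin(x)' at a fixed point and watch the error first decrease and then grow again as $h$ shrinks (the test point and step sizes below are chosen purely for illustration):

import numpy as np

x0 = 1.0
exact = np.cos(x0)                 # exact first derivative of sin(x) at x0
for h in [1e-1, 1e-3, 1e-5, 1e-7, 1e-9, 1e-11]:
    err = abs(df1(np.sin, x0, h) - exact)
    print(f"h = {h:.0e}   error = {err:.2e}")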

Higher order derivatives can be computed by repeatedly applying the finite difference rule. For example, this is what it would look like for the second, third, and fourth derivatives:

def df2(f, x, h):
    """Second derivative"""
    return 0.5 * (df1(f, x+h, h) - df1(f, x-h, h)) / h

def df3(f, x, h):
    """Third derivative"""
    return 0.5 * (df2(f, x+h, h) - df2(f, x-h, h)) / h

def df4(f, x, h):
    """Fourth derivative"""
    return 0.5 * (df3(f, x+h, h) - df3(f, x-h, h)) / h

The above implementations can be extended to any derivative order using a recursive function call.

def df(f, x, h, o):
    """o-th order derivative"""
    if o==0:
        return f(x)
    else:
        return 0.5 * (df(f, x+h, h, o-1) - df(f, x-h, h, o-1)) / h
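As a small check of how the error grows with the derivative order, one can compare 'df' for 'sin(x)' against the known analytic derivatives, which cycle through cos, -sin, -cos, and sin (the point and step below are again illustrative only):

import numpy as np

x0 = 0.5
h = 0.01
exact = {1: np.cos(x0), 2: -np.sin(x0), 3: -np.cos(x0), 4: np.sin(x0)}
for o, d in exact.items():
    err = abs(df(np.sin, x0, h, o) - d)
    print(f"order {o}:  error = {err:.2e}")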

Now we can implement the calculation of Taylor series expansion coefficients for an arbitrary univariate function 'f(x)', up to a given maximal order 'maxo'.

def taylor_coefs(f, x, h, maxo):
    """Taylor series coefficients"""
    c = [f(x)]
    fac = 1.0
    for o in range(1, maxo+1):
        fac *= o
        c.append( df(f, x, h, o) / fac )
    return c

Finally, let's test the accuracy of finite-difference derivatives by computing a Taylor series expansion of 'sin(x)' about 'x=x0', truncated at different orders.
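A minimal sketch of such a test could look as follows; the expansion point, step size, maximal order, and evaluation grid below are placeholder values chosen only for illustration:

import numpy as np

x0 = 0.5      # expansion point
h = 0.01      # finite-difference step
maxo = 6      # truncation order of the Taylor series

coefs = taylor_coefs(np.sin, x0, h, maxo)

# evaluate the truncated Taylor series on a grid around x0 and compare with sin(x)
x = np.linspace(x0 - 1.0, x0 + 1.0, 11)
approx = sum(c * (x - x0)**o for o, c in enumerate(coefs))
print(np.max(np.abs(approx - np.sin(x))))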