Earlier, we learnt how to design our own reverse-mode automatic differentiator and train a basic neural network to represent potential energy data for the ground electronic state of the water molecule. In this exercise, our task will be to use one of the available auto-differentiation frameworks, such as JAX, to build and compare neural-network and Taylor-series representations of potential energy data for the water molecule.
The potential energy data is stored in an HDF5 file
for different values of the internal coordinates, which are the interatomic distances $r_1$ (O—H$_1$) and $r_2$ (O—H$_2$) in Angstroms and the valence bond angle $\alpha$ (H$_1$—O—H$_2$) in degrees. The potential energy is in units of cm$^{-1}$. Here is how we can read the data:
import numpy as np
import h5py

with h5py.File("water_PES_data.h5", 'r') as fl:
    r1, r2, alpha = fl['x'][...].T   # internal coordinates: r1, r2 in Angstrom, alpha in degrees
    enr = fl['E'][...]               # potential energy in cm^-1

data = np.vstack((r1, r2, alpha*np.pi/180.0, enr)).T  # convert angle to radians
data.shape
# coordinate values corresponding to the minimal energy
r1min, r2min, amin, emin = min([(r1, r2, alpha, e) for (r1, r2, alpha, e) in data],
                               key=lambda x: x[3])
# keep only data points with energy up to 40000 cm^-1
emax = 40000.
data = [(r1, r2, alpha, e) for (r1, r2, alpha, e) in data if e <= emax]
data = np.array(data)
print("reference (minimum) values of r1, r2, alpha coordinates:", r1min, r2min, amin)
print("number of data points:", data.shape[0])
reference (minimum) values of r1, r2, alpha coordinates: 0.9673469387755103 0.9673469387755103 1.7951958020513104
number of data points: 66967
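It can be useful to inspect the data before fitting, for example by plotting a one-dimensional cut through the minimum along $r_1$. Below is a minimal sketch, assuming matplotlib is available; the selection tolerances for $r_2$ and $\alpha$ are arbitrary choices made purely for illustration.
import numpy as np
import matplotlib.pyplot as plt

# select points close to the reference values of r2 and alpha (tolerances are arbitrary)
cut = data[(np.abs(data[:, 1] - r2min) < 0.05) & (np.abs(data[:, 2] - amin) < 0.05)]

plt.scatter(cut[:, 0], cut[:, 3], s=5)
plt.xlabel("r1 / Angstrom")
plt.ylabel("energy / cm$^{-1}$")
plt.title("1D cut through the potential minimum")
plt.show()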
Before proceeding to the basics of the JAX library and the neural-network representation of the potential, let's try to fit a simple Taylor series expansion to the potential energy data by solving a linear regression problem.
A common approach is to represent the potential surface as a sum of products of simple functions, each depending on a single coordinate, with linear coefficients as the fitting parameters, e.g., for water
$$ V(r_1,r_2,\alpha) = \sum_{p_1,p_2,p_3}C_{p_1,p_2,p_3}f^{p_1}(r_1)g^{p_2}(r_2)h^{p_3}(\alpha), $$
where $p_1,p_2,p_3$ are the expansion powers and $f$, $g$, and $h$ are some univariate functions, for example, $f(r_1)=r_1-r_0$, $g(r_2)=r_2-r_0$, and $h(\alpha)=\alpha-\alpha_0$, where $r_0$ and $\alpha_0$ are some reference values of the coordinates. The unknown coefficients $C_{p_1,p_2,p_3}$ can be found by minimizing a loss function $\mathcal{L}$, for example, the mean squared error
$$ \mathcal{L}=\sum_i^N\left(U_i - V(r_{1,i},r_{2,i},\alpha_i)\right)^2\rightarrow\min, \\ \frac{\partial\mathcal{L}}{\partial C_{q_1,q_2,q_3}} = -2\sum_i^N\left(U_i - V(r_{1,i},r_{2,i},\alpha_i)\right)\left.\frac{\partial V}{\partial C_{q_1,q_2,q_3}}\right|_{r_{1,i},r_{2,i},\alpha_i} \\ = -2\sum_i^N\left(U_i - \sum_{p_1,p_2,p_3}C_{p_1,p_2,p_3}f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right)f^{q_1}(r_{1,i})g^{q_2}(r_{2,i})h^{q_3}(\alpha_i) = 0, $$
where $\left\{r_{1,i},r_{2,i},\alpha_i,U_i\right\}_{i=1}^N$ denote potential energy data points. The above equation can be rewritten in matrix form
$$ M\cdot C = V \\ M_{kl} = \sum_{i}^N \left[f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right]\left[f^{q_1}(r_{1,i})g^{q_2}(r_{2,i})h^{q_3}(\alpha_i)\right] \\ V_{k} = \sum_{i}^N \left[f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right]U_i $$
where $k$ and $l$ are power multi-indices $k=(p_1,p_2,p_3)$ and $l=(q_1,q_2,q_3)$.
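To make the multi-index bookkeeping concrete, here is a minimal sketch of how the truncated set of multi-indices can be enumerated with 'itertools.product', using a toy truncation at second total order purely for illustration (the variable names are arbitrary); the same pattern, with a higher truncation, is used in the full code below.
import itertools

# toy illustration: all multi-indices (p1, p2, p3) with total power <= 2
toy_powers = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
toy_pow_ind = [p for p in itertools.product(*toy_powers) if sum(p) <= 2]
print(len(toy_pow_ind))   # 10 multi-indices
print(toy_pow_ind[:4])    # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0)]
Each multi-index corresponds to one basis function and hence to one row (or column) of $M$ and one element of $V$.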
The code that builds and solves this system of linear equations and determines the unknown coefficients can look like this:
import itertools

# list of expansion powers for each of the three coordinates
powers = [[0,1,2,3,4,5,6,7,8], [0,1,2,3,4,5,6,7,8], [0,1,2,3,4,5,6,7,8]]
maxpow = 8
pow_ind = np.array([elem for elem in itertools.product(*powers)
                    if sum(elem) <= maxpow])

# data points
r1, r2, alpha = data[:,:3].T
U = data[:,3:]

# products of simple univariate functions
xk = lambda r1, r2, alpha: np.array([
    (r1-r1min)**p1
    * (r2-r2min)**p2
    * (alpha-amin)**p3
    for (p1, p2, p3) in pow_ind
])
# build and solve the normal equations M * C = V
M = np.dot(xk(r1, r2, alpha), xk(r1, r2, alpha).T)
V = np.dot(xk(r1, r2, alpha), U)
coefs = np.linalg.solve(M, V)

# model function: V(r1, r2, alpha) = sum_k C_k * basis_k(r1, r2, alpha)
model = lambda r1, r2, alpha, coefs: np.dot(xk(r1, r2, alpha).T, coefs)

# root-mean-square error of the fit (in cm^-1)
rms = np.sqrt(np.mean(np.square(model(r1, r2, alpha, coefs) - U)))
print(rms)
23.14195581125594
Here we used the 'itertools.product' function to generate all possible combinations of expansion powers for the different coordinates, with the total expansion power truncated at eighth order.
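As a bridge to the JAX part of the exercise, the same Taylor model can be re-expressed with jax.numpy and differentiated automatically with respect to the internal coordinates, for example to check that the gradient of the fitted surface is small at the reference geometry. This is a minimal sketch, assuming the variables pow_ind, r1min, r2min, amin, and coefs from the code above are still in scope; the function name model_jax is an arbitrary choice.
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # use double precision to match the NumPy fit

# Taylor-series model written with jax.numpy so that it can be auto-differentiated
def model_jax(coords, coefs):
    r1, r2, alpha = coords
    basis = jnp.array([(r1 - r1min)**p1 * (r2 - r2min)**p2 * (alpha - amin)**p3
                       for (p1, p2, p3) in pow_ind])
    return jnp.dot(basis, coefs[:, 0])

# gradient of the potential with respect to (r1, r2, alpha)
grad_V = jax.grad(model_jax, argnums=0)

coords_ref = jnp.array([r1min, r2min, amin])
print(grad_V(coords_ref, jnp.asarray(coefs)))  # expected to be small near the minimum
The same jax.grad machinery, applied to a loss function of the model parameters rather than the coordinates, is what we will use to train the neural-network representation.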