Earlier, we learnt how to design our own reverse-mode automatic differentiator and train a basic neural network to represent potential energy data for the ground electronic state of the water molecule. In this exercise, our task will be to use one of the available auto-differentiation frameworks, such as JAX, to build and compare neural-network and Taylor-series representations of potential energy data for the water molecule.
The potential energy data is stored in an HDF5 file
for different values of the internal coordinates, which are the interatomic distances $r_1$ (O—H$_1$) and $r_2$ (O—H$_2$) in Angstroms and the valence bond angle $\alpha$ (H$_1$—O—H$_2$) in degrees. The potential energy is in units of cm$^{-1}$. Here is how we can read the data:
import numpy as np
import h5py

with h5py.File("water_PES_data.h5", 'r') as fl:
    r1, r2, alpha = fl['x'][...].T   # internal coordinates: r1, r2 in Angstrom, alpha in degrees
    enr = fl['E'][...]               # potential energy in cm^-1

data = np.vstack((r1, r2, alpha*np.pi/180.0, enr)).T  # convert angle to radians
data.shape
# coordinate values corresponding to the minimal energy
r1min, r2min, amin, emin = min([(r1, r2, alpha, e) for (r1, r2, alpha, e) in data],
                               key=lambda x: x[3])
# keep only data points with energy up to 40000 cm^-1
emax = 40000.
data = [(r1, r2, alpha, e) for (r1, r2, alpha, e) in data if e <= emax]
data = np.array(data)
print("reference (minimum) values of r1, r2, alpha coordinates:", r1min, r2min, amin)
print("number of data points:", data.shape[0])
reference (minimum) values of r1, r2, alpha coordinates: 0.9673469387755103 0.9673469387755103 1.7951958020513104
number of data points: 66967
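It can be useful to inspect the data before fitting, for example by plotting a one-dimensional cut through the minimum along $r_1$. Below is a minimal sketch, assuming matplotlib is available; the selection tolerances for $r_2$ and $\alpha$ are arbitrary choices made purely for illustration.
import numpy as np
import matplotlib.pyplot as plt

# select points close to the reference values of r2 and alpha (tolerances are arbitrary)
cut = data[(np.abs(data[:, 1] - r2min) < 0.05) & (np.abs(data[:, 2] - amin) < 0.05)]

plt.scatter(cut[:, 0], cut[:, 3], s=5)
plt.xlabel("r1 / Angstrom")
plt.ylabel("energy / cm$^{-1}$")
plt.title("1D cut through the potential minimum")
plt.show()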
Before proceeding to the basics of the JAX library and the neural-network representation of the potential, let's try to fit a simple Taylor series expansion to the potential energy data by solving a linear regression problem.
A common approach is to represent the potential surface as a sum of products of simple functions, each depending on a single coordinate, with linear coefficients as the fitting parameters, e.g., for water
$$ V(r_1,r_2,\alpha) = \sum_{p_1,p_2,p_3}C_{p_1,p_2,p_3}f^{p_1}(r_1)g^{p_2}(r_2)h^{p_3}(\alpha), $$
where $p_1,p_2,p_3$ are the expansion powers and $f$, $g$, and $h$ are some univariate functions, for example, $f(r_1)=r_1-r_0$, $g(r_2)=r_2-r_0$, and $h(\alpha)=\alpha-\alpha_0$, where $r_0$ and $\alpha_0$ are some reference values of the coordinates. The unknown coefficients $C_{p_1,p_2,p_3}$ can be found by minimizing a loss function $\mathcal{L}$, for example, the mean squared error
$$ \mathcal{L}=\sum_i^N\left(U_i - V(r_{1,i},r_{2,i},\alpha_i)\right)^2\rightarrow\min, \\ \frac{\partial\mathcal{L}}{\partial C_{q_1,q_2,q_3}} = -2\sum_i^N\left(U_i - V(r_{1,i},r_{2,i},\alpha_i)\right)\left.\frac{\partial V}{\partial C_{q_1,q_2,q_3}}\right|_{r_{1,i},r_{2,i},\alpha_i} \\ = -2\sum_i^N\left(U_i - \sum_{p_1,p_2,p_3}C_{p_1,p_2,p_3}f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right)f^{q_1}(r_{1,i})g^{q_2}(r_{2,i})h^{q_3}(\alpha_i) = 0, $$
where $\left\{r_{1,i},r_{2,i},\alpha_i,U_i\right\}_{i=1}^N$ denote potential energy data points. The above equation can be rewritten in matrix form
$$ M\cdot C = V \\ M_{kl} = \sum_{i}^N \left[f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right]\left[f^{q_1}(r_{1,i})g^{q_2}(r_{2,i})h^{q_3}(\alpha_i)\right] \\ V_{k} = \sum_{i}^N \left[f^{p_1}(r_{1,i})g^{p_2}(r_{2,i})h^{p_3}(\alpha_i)\right]U_i $$
where $k$ and $l$ are power multi-indices $k=(p_1,p_2,p_3)$ and $l=(q_1,q_2,q_3)$.
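To make the multi-index bookkeeping concrete, here is a minimal sketch of how the truncated set of multi-indices can be enumerated with 'itertools.product', using a toy truncation at second total order purely for illustration (the variable names are arbitrary); the same pattern, with a higher truncation, is used in the full code below.
import itertools

# toy illustration: all multi-indices (p1, p2, p3) with total power <= 2
toy_powers = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
toy_pow_ind = [p for p in itertools.product(*toy_powers) if sum(p) <= 2]
print(len(toy_pow_ind))   # 10 multi-indices
print(toy_pow_ind[:4])    # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0)]
Each multi-index corresponds to one basis function and hence to one row (or column) of $M$ and one element of $V$.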
The code that builds and solves this system of linear equations and determines the unknown coefficients can look like this:
import itertools

# list of expansion powers for each of the three coordinates
powers = [[0,1,2,3,4,5,6,7,8], [0,1,2,3,4,5,6,7,8], [0,1,2,3,4,5,6,7,8]]
maxpow = 8
pow_ind = np.array([elem for elem in itertools.product(*powers)
                    if sum(elem) <= maxpow])

# data points
r1, r2, alpha = data[:,:3].T
U = data[:,3:]

# products of simple univariate functions
xk = lambda r1, r2, alpha: np.array([
    (r1-r1min)**p1
    * (r2-r2min)**p2
    * (alpha-amin)**p3
    for (p1, p2, p3) in pow_ind
])
# build and solve the normal equations M * C = V
M = np.dot(xk(r1, r2, alpha), xk(r1, r2, alpha).T)
V = np.dot(xk(r1, r2, alpha), U)
coefs = np.linalg.solve(M, V)

# model function: V(r1, r2, alpha) = sum_k C_k * basis_k(r1, r2, alpha)
model = lambda r1, r2, alpha, coefs: np.dot(xk(r1, r2, alpha).T, coefs)

# root-mean-square error of the fit (in cm^-1)
rms = np.sqrt(np.mean(np.square(model(r1, r2, alpha, coefs) - U)))
print(rms)
23.14195581125594
Here we used the 'itertools.product' function to generate all possible combinations of expansion powers for the different coordinates, with the total expansion power truncated at eighth order.
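As a bridge to the JAX part of the exercise, the same Taylor model can be re-expressed with jax.numpy and differentiated automatically with respect to the internal coordinates, for example to check that the gradient of the fitted surface is small at the reference geometry. This is a minimal sketch, assuming the variables pow_ind, r1min, r2min, amin, and coefs from the code above are still in scope; the function name model_jax is an arbitrary choice.
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # use double precision to match the NumPy fit

# Taylor-series model written with jax.numpy so that it can be auto-differentiated
def model_jax(coords, coefs):
    r1, r2, alpha = coords
    basis = jnp.array([(r1 - r1min)**p1 * (r2 - r2min)**p2 * (alpha - amin)**p3
                       for (p1, p2, p3) in pow_ind])
    return jnp.dot(basis, coefs[:, 0])

# gradient of the potential with respect to (r1, r2, alpha)
grad_V = jax.grad(model_jax, argnums=0)

coords_ref = jnp.array([r1min, r2min, amin])
print(grad_V(coords_ref, jnp.asarray(coefs)))  # expected to be small near the minimum
The same jax.grad machinery, applied to a loss function of the model parameters rather than the coordinates, is what we will use to train the neural-network representation.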