We can apply the ideas from "Learning an Accurate Physics Simulator via Adversarial Reinforcement Learning" to cell simulation with electrical cell models, such as the Thevenin equivalent circuit model:

  1. Make all model parameters (e.g., resistances and capacitances in an equivalent circuit model (ECM)) learned functions of the state (state-of-charge (SoC), temperature, etc.). These dependencies need not exist in physical reality. For example, the capacitance parameters in an ECM reflect the cell's inertia and are completely independent of SoC. Still, the learned dependencies provide an opportunity to compensate for the crudeness of the basic model: with constant parameters, the capacitances in an ECM fail to capture the cell's inertia adequately across the full SoC range.
  2. Learn the parameterX(state) functions (e.g., capacitance1(SoC, T)) using an adversarial network that discriminates cell state "trajectories" simulated with the model from real trajectories.
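
To make step 1 concrete, here is a minimal sketch of a first-order Thevenin model whose parameters are callables of the state rather than constants. The class name `TheveninModel`, the sign convention (discharge current is positive), the linear open-circuit-voltage curve, and all numeric values are assumptions made for illustration only; none of them come from the article or from a real cell.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

# (soc, temp_c) -> parameter value; any learned function can be plugged in here.
ParamFn = Callable[[float, float], float]


@dataclass
class TheveninModel:
    """First-order Thevenin ECM with state-dependent parameters (illustrative)."""
    r0: ParamFn                      # series resistance, ohm
    r1: ParamFn                      # RC-branch resistance, ohm
    c1: ParamFn                      # RC-branch capacitance, farad
    capacity_ah: float               # cell capacity, ampere-hours
    ocv: Callable[[float], float]    # open-circuit voltage as a function of SoC

    def step(self, soc: float, v1: float, current_a: float,
             temp_c: float, dt_s: float) -> Tuple[float, float, float]:
        """Advance one time step; discharge current is positive (assumed convention)."""
        r0 = self.r0(soc, temp_c)
        r1 = self.r1(soc, temp_c)
        c1 = self.c1(soc, temp_c)
        # Exact discretisation of the RC branch for a current held constant over dt_s.
        decay = math.exp(-dt_s / (r1 * c1))
        v1_next = v1 * decay + r1 * (1.0 - decay) * current_a
        # Coulomb counting for the state of charge.
        soc_next = soc - current_a * dt_s / (3600.0 * self.capacity_ah)
        v_terminal = self.ocv(soc_next) - v1_next - r0 * current_a
        return soc_next, v1_next, v_terminal


# Hypothetical parameter functions and values, for illustration only.
model = TheveninModel(
    r0=lambda soc, t: 0.01,
    r1=lambda soc, t: 0.02,
    c1=lambda soc, t: 2000.0 + 500.0 * soc,   # capacitance made a function of SoC
    capacity_ah=2.0,
    ocv=lambda soc: 3.0 + 1.2 * soc,          # toy linear OCV curve
)
soc, v1, v = model.step(soc=0.5, v1=0.0, current_a=1.0, temp_c=25.0, dt_s=1.0)
```

Calling `step` repeatedly over a current and temperature profile yields a simulated trajectory of the kind that the discriminator in step 2 would compare against real ones.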

State-of-charge estimators (for example, based on a Kalman filter or on an adaptive extended H-infinity filter) can use the resulting model instead of the original Thevenin equivalent circuit model.

In the article linked above (concerned with improving the simulation of a robot's environment), the learned parameter(state) functions in the model are themselves networks (at least, the illustrations depict them as feed-forward networks). In the battery setting, I guess they need not even be networks: they can be just linear combinations, e.g., capacitance1 = C1_0 + a*SoC + b*T, where C1_0, a, and b are learned, or second-degree multivariate polynomials, such as capacitance1 = C1_0 + a*SoC + b*T + c*SoC^2 + d*T^2 + e*SoC*T. Thus, we can easily port the resulting equivalent circuit model with such components to C code without any ML libraries.
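
As a rough sketch of how small such a parameter function is, the second-degree polynomial above can be written with plain arithmetic only. The class name `QuadraticParam` and the coefficient values are hypothetical; the body uses nothing beyond additions and multiplications, so it should translate almost line for line into a C function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuadraticParam:
    """parameter(SoC, T) = base + a*SoC + b*T + c*SoC^2 + d*T^2 + e*SoC*T."""
    base: float        # baseline value, C1_0 in the text
    a: float = 0.0     # linear SoC coefficient
    b: float = 0.0     # linear temperature coefficient
    c: float = 0.0     # quadratic SoC coefficient
    d: float = 0.0     # quadratic temperature coefficient
    e: float = 0.0     # SoC-temperature cross term

    def __call__(self, soc: float, temp_c: float) -> float:
        return (self.base
                + self.a * soc
                + self.b * temp_c
                + self.c * soc * soc
                + self.d * temp_c * temp_c
                + self.e * soc * temp_c)


# Hypothetical coefficients; in the proposed approach they would be learned.
capacitance1_linear = QuadraticParam(base=2000.0, a=100.0, b=2.0)
capacitance1_quadratic = QuadraticParam(base=2000.0, a=100.0, b=2.0,
                                        c=10.0, d=0.01, e=1.0)
```

Setting `c`, `d`, and `e` to zero recovers the linear form, so the same structure covers both variants mentioned above.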

The most difficult part of this approach is accumulating sufficiently diverse real cell state trajectories

To make this "model tuning" minimally robust, the adversarial network should use real cell state trajectories collected at a wide range of temperatures from cells in different states of health.

Warning: the outputs of parameter functions must not be interpreted

These parameter values must not be used to estimate most cell parameters at once or to estimate the risk of cell failure, and ideally they should not even leave the cell simulation model (except for debugging), so that nobody is tempted to interpret them. The original article illustrates how these parameter functions can deliberately "game" the model to make its output more realistic:

"In our earlier mattress example, the learnable hybrid simulator is able to mimic the contact forces from the mattress. Because the learned contact parameters are state-dependent, the simulator can modulate contact forces based on the distance and velocity of the robot’s feet relative to the mattress, mimicking the effect of the stiffness and damping of a deformable surface."

The resulting equivalent circuit model can be used in estimators of cell parameters (such as a stochastic estimator of cell parameters based on an electrical cell model, or an estimator of cell parameters based on particle swarm optimisation) only in the following way: in the learned parameter functions, we fix the baseline parameter values (C1_0 in the example above) and optimise only the coefficients of the state-dependent components, a and b. The resulting cell simulator then has the baseline cell parameter values "resurfaced" as model parameters again, as in the vanilla Thevenin equivalent circuit model. Estimators of cell parameters can use the resulting model instead of the "original" equivalent circuit model.
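
A toy sketch of this arrangement, under loud assumptions: `make_param_fn` is a hypothetical helper, the coefficients `a` and `b` stand in for values frozen after adversarial training, and the "observations" are synthetic numbers generated inside the example. A real estimator would compare simulated and measured terminal voltages rather than capacitances, and would use a proper search (stochastic, particle swarm, etc.) instead of a four-point grid.

```python
from typing import Callable, List, Tuple


def make_param_fn(a: float, b: float) -> Callable[[float, float, float], float]:
    """Return parameter(baseline, soc, temp_c) with frozen learned coefficients a, b."""
    def param(baseline: float, soc: float, temp_c: float) -> float:
        return baseline + a * soc + b * temp_c
    return param


# Hypothetical frozen coefficients for capacitance1; the baseline C1_0 stays free.
capacitance1 = make_param_fn(a=500.0, b=-3.0)

# Synthetic (SoC, temperature) states and targets, generated with baseline 2000.0.
states: List[Tuple[float, float]] = [(0.2, 10.0), (0.5, 25.0), (0.9, 40.0)]
observed = [capacitance1(2000.0, s, t) for s, t in states]


def loss(c1_0: float) -> float:
    """Squared error of the model output for a candidate baseline C1_0."""
    return sum((capacitance1(c1_0, s, t) - y) ** 2
               for (s, t), y in zip(states, observed))


# The estimator searches over the baseline only; a and b are never touched.
candidates = [1800.0, 1900.0, 2000.0, 2100.0]
best_c1_0 = min(candidates, key=loss)
```

The point of the sketch is the signature: the baseline is an explicit argument that an estimator can vary, while the state-dependent corrections are baked into the function.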