Context: I am not a physicist, and I only know school-level physics. I am writing this blog to provide enough fundamental knowledge to understand the Hopfield Network and the Boltzmann Machine. Both are models of Artificial Intelligence, but they are built upon concepts from Statistical Mechanics (Stat Mech). Since I only know basic physics, I will only share my intuitive understanding of Stat Mech and link to the good resources I went through to understand each concept; all the details are delegated to those links. So this is more of a resource-sharing blog than a rigorous one.

This blog is a short story (a small part of the bigger story of stat mech and physics) that brings together, in a coherent manner and right from the basics, only the concepts (characters) relevant to our study, without going into the details of each one. It is uncharted territory for me: thermodynamics is a difficult subject, and it took me some time to write every technical term in it. This blog is not stand-alone; it is written to explain some specific concepts that are used in AI models: the lattice model, the energy function, the use of Monte Carlo, equilibrium, temperature and the Boltzmann distribution. Where I describe or define a term or concept, the description may not be the formal definition, but it is true enough for our purpose.

graph TD
  Statistical-Mechanics --> Thermodynamics
  Statistical-Mechanics --> Boltzmann-Distribution
  Boltzmann-Distribution --> Ising-Model
  Ising-Model --> Spin-glass-model
  Spin-glass-model --> Hopfield-Network
  Spin-glass-model --> Boltzmann-Machine

Physics

Physics is the study of matter. Matter is anything that has mass and takes up volume. Everything from the everyday objects we deal with, like a chair, down to atoms is matter. Everything is made up of matter, and physics helps us understand this “everything” (ref, ref).

One of the branches of physics is “classical mechanics”. It helps us understand the motion of everyday objects that are neither too large nor too small and that move at speeds much smaller than the speed of light. It helps us predict, for example, how far an object will travel, in which direction, and how fast it will be moving after a certain amount of time (ref).

Intuitive Intro to Thermodynamics

Thermodynamics is another branch of physics. It studies the behavior associated with the internal motion of macroscopic systems such as solids, liquids and gases. A macroscopic system is a system whose number of particles/atoms is on the order of the Avogadro number ($N_A \simeq 6.02 \times 10^{23}$) or larger. Thermodynamics is concerned with the internal motion of the system rather than the motion of the system as a whole.

To study motion in physics, the typical approach is to first define an initial state and then write the exact equations of motion that explain the behavior of the system, for example to determine exactly the path of a ball in projectile motion. For a macroscopic system, it is in principle possible to describe the internal motion exactly by writing the equations of motion from classical mechanics and quantum mechanics. But first finding the initial state, that is, the initial position and velocity of zillions of particles, and then applying these equations is extremely difficult and also not needed. It turns out that the behavior of such a system depends on its macroscopic properties, like temperature and pressure, which depend only on the “average” motion of all the particles of the system. Macroscopic properties are those that only a collection of particles has; they are not associated with a single atom.

The behavior of a system in terms of these macro properties is described by the “laws of thermodynamics”. To put it simply, laws in physics are statements about the behavior of a system that are observed empirically. For example, consider a simple system: hot coffee in a cup at room temperature. Temperature is basically a measure of the average kinetic energy of the particles in the cup. The second law of thermodynamics tells us the direction in which the system evolves; to put it simply, it evolves in the direction of increasing disorder. Here, the temperature of the coffee decreases until the coffee and the surroundings have the same temperature. In the process, the gas particles in the surroundings gain kinetic energy. The first law says that energy is conserved, that is, the energy lost by the coffee is the same as the energy gained by the surroundings. In summary, thermodynamics explains the behavior of the system by telling us whether the internal motion of the coffee increases or decreases, and by how much.
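
The coffee example can be turned into a toy calculation. The sketch below is only an illustration, not a real thermal model: the heat capacities `c_coffee` and `c_room`, the exchange `rate`, and the rule that the heat moved per step is proportional to the temperature difference are all assumptions made up for this example. It shows the two ideas from the paragraph: heat flows from the hot coffee to the cooler surroundings until both reach the same temperature, and whatever energy the coffee loses, the surroundings gain.

```python
def cool_coffee(t_coffee=80.0, t_room=20.0, c_coffee=1.0, c_room=50.0,
                rate=0.05, steps=400):
    """Exchange heat between coffee and room; return the temperature history.

    All parameter values are hypothetical and in arbitrary units.
    """
    history = [(t_coffee, t_room)]
    for _ in range(steps):
        # Assumed rule: heat moved per step is proportional to the
        # temperature difference and flows from hot to cold.
        heat = rate * (t_coffee - t_room)
        t_coffee -= heat / c_coffee  # the coffee loses this energy...
        t_room += heat / c_room      # ...and the surroundings gain it
        history.append((t_coffee, t_room))
    return history


def total_energy(t_coffee, t_room, c_coffee=1.0, c_room=50.0):
    """Thermal energy of coffee plus room in this toy model."""
    return c_coffee * t_coffee + c_room * t_room


history = cool_coffee()
print("start:", history[0], "energy:", total_energy(*history[0]))
print("end:  ", history[-1], "energy:", total_energy(*history[-1]))
```

The two printed energies should agree up to rounding error, while the two temperatures end up essentially equal. Because the room is given a much larger heat capacity than the coffee, the common final temperature sits close to the initial room temperature.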

“Thermodynamics” deals with only a few general macro properties, without dealing with the microscopic nature of the system. These properties and laws are so fundamental that they apply to a very diverse set of systems. This universality is the strength of “Thermodynamics”. But many other properties can only be calculated by a statistical treatment of the microscopic nature of a specific system, and that is where the field of stat mech comes in. Stat mech also explains the results of thermodynamics from the microscopic point of view.

Statistical Mechanics

Stat mech provides statistical tools that help us understand the behavior associated with the internal motion of a macroscopic system starting from its microscopic nature. With stat mech, one can derive all the thermodynamic results (ref) and also extra results that are specific to the microscopic nature of the system, like magnetism in ferromagnetic materials (ref, ref). Stat mech thus provides a deeper understanding of thermodynamics. It is based on the concepts of microstate, macrostate, ensemble, probability distribution etc.

Let us first define microstate and macrostate. A microstate is a specific configuration of a macroscopic system that contains the exact position and momentum of all its particles. A macrostate is the state of a macroscopic system as determined by its macro properties like temperature, pressure etc. Many microstates correspond to a single macrostate. For example, at a particular temperature, a system of gas molecules can be in many different microstates, all having the same temperature (ref).
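
To make the "many microstates per macrostate" idea concrete, here is a small counting sketch. It does not use a gas; it assumes a toy system of `n_spins` particles that can each only be up (+1) or down (-1), with the macrostate taken to be the number of up spins. Both the system and the function name `count_microstates` are hypothetical choices for illustration. The code lists every microstate and counts how many fall into each macrostate.

```python
from collections import Counter
from itertools import product
from math import comb


def count_microstates(n_spins):
    """Map each macrostate (number of up spins) to its number of microstates."""
    counts = Counter()
    # Every tuple of +1/-1 values is one microstate of the toy system.
    for microstate in product((+1, -1), repeat=n_spins):
        n_up = sum(1 for spin in microstate if spin == +1)
        counts[n_up] += 1
    return counts


counts = count_microstates(4)
for n_up in sorted(counts):
    # comb(4, n_up) is the closed-form count and should match the enumeration.
    print(f"{n_up} up spins: {counts[n_up]} microstates (formula: {comb(4, n_up)})")
```

Even with only four spins, the macrostate with two up spins is shared by more microstates than the macrostate with all spins up, which has exactly one. For a macroscopic number of particles this imbalance becomes enormous.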

As I mentioned in the thermodynamics section, it is difficult to determine the microstate exactly. Stat mech deals with this uncertainty by considering a probability distribution over microstates. Such a probability distribution is called a “statistical ensemble” or simply an “ensemble”. Think of an ensemble as the different states an experimenter would find the system in when repeating the experiment many times in the lab (ref). Three types of ensembles are considered in stat mech, depending on which variables are held fixed: the microcanonical ensemble, the canonical ensemble and the grand canonical ensemble. They become equivalent in the thermodynamic limit. The canonical ensemble is used in the Ising model because the calculations are easier (ref_in_Gould?). The canonical ensemble describes a system that is in thermal equilibrium with a heat bath. Think of a heat bath as large surroundings at constant temperature (ref_in_todo). In thermal equilibrium, the system can exchange energy (as heat; see the first law) with the heat bath, and therefore different microstates can have different energies (ref, ref, ref_susskind_lec8). The system's energy fluctuates with time. There is some randomness in the system's energy, and the state depends on the energy, hence the system's state is random too. We can therefore only determine the system's state up to some probability distribution (ref).

The probability distribution in the canonical ensemble is given by the Boltzmann distribution. The Boltzmann distribution is based on the principle of maximum entropy. According to it, at a given temperature, states with lower energy have higher probability and vice versa. But as the temperature grows, all states become closer and closer to equally probable.
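
In symbols, the Boltzmann distribution assigns a state $i$ with energy $E_i$ the probability $p_i = e^{-E_i/(k_B T)} / Z$, where $Z = \sum_j e^{-E_j/(k_B T)}$ makes the probabilities sum to one. The sketch below evaluates this formula for a made-up system with three energy levels. The energy values, the function name `boltzmann_probabilities`, and the choice of units with $k_B = 1$ are assumptions for illustration only.

```python
import math


def boltzmann_probabilities(energies, temperature, k_b=1.0):
    """Return p_i = exp(-E_i / (k_b * T)) / Z for each energy in `energies`."""
    beta = 1.0 / (k_b * temperature)
    # Subtracting the lowest energy avoids overflow in exp(); the shift
    # cancels between numerator and Z, so the probabilities are unchanged.
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


energies = [0.0, 1.0, 2.0]  # hypothetical energy levels, arbitrary units
for temperature in (0.1, 1.0, 100.0):
    probs = boltzmann_probabilities(energies, temperature)
    print(f"T = {temperature}: {[round(p, 3) for p in probs]}")
```

At the lowest temperature almost all the probability sits on the lowest-energy state, at the intermediate temperature the probabilities decrease with energy, and at the highest temperature the three states are nearly equally likely, which matches the two statements in the paragraph above.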