Some curves in math are just… forgettable.

And then there is this one, the perfect, symmetric bell shape that appears everywhere: the Gaussian distribution.

Most of us first see it in a stats class. A neat diagram, a quick remark that “things cluster around the average,” and then, with little justification, the formula:

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{ -\frac{(x - \mu)^2}{2\sigma^2} } $$

It is taught like a finished product. We rarely think about where it came from. It’s just there, like gravity or the alphabet.
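
To make the symbols concrete, here is the density evaluated directly in code (a minimal sketch: `mu` is the mean, the center of the bell; `sigma` is the standard deviation, its width):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal distribution N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# The curve peaks at the mean and falls off symmetrically around it:
print(gaussian_pdf(0.0))                      # ~0.398942 (the peak)
print(gaussian_pdf(1.0), gaussian_pdf(-1.0))  # equal: ~0.241971 each
```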

How can someone look at a bunch of points and come up with a formula that contains π, e, and a few constants all multiplied together?

No one conjured it in a single leap. The formula was built step by step, through logic. And interestingly, the story doesn’t start with Gauss. It begins almost a century earlier, with a mathematician trying to solve problems about gambling.

The Prequel: De Moivre and the First Glimpse of the Curve

Long before Gauss, in the 1730s, the French mathematician Abraham de Moivre was studying the binomial distribution, working on problems like flipping a fair coin 100 times. The binomial formula tells you the exact probability of getting, say, exactly 60 heads.
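
In modern notation, that probability is a huge binomial coefficient multiplied by a vanishingly small power of one half:

$$ P(\text{60 heads in 100 flips}) = \binom{100}{60} \left( \frac{1}{2} \right)^{100} \approx 0.0108 $$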

The problem was that these calculations became unmanageable as the number of flips grew, because the factorials inside the binomial coefficients are astronomically large. De Moivre noticed a pattern: when he plotted the probabilities for many trials, the discrete bars of the binomial distribution began to trace a smooth, symmetrical, bell-like shape.

He searched for a continuous function that could approximate this shape. Using a clever technique now known as Stirling’s approximation to tame the enormous factorials, de Moivre derived a formula that matches the Gaussian curve. For him, it was simply a tool for approximating the binomial distribution.
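
Stated in modern notation (this is the fair-coin case of what is now called the de Moivre–Laplace theorem), his approximation for the probability of $k$ heads in $n$ flips is:

$$ \binom{n}{k} \left( \frac{1}{2} \right)^{n} \approx \frac{1}{\sqrt{\pi n / 2}} \, e^{ -\frac{2(k - n/2)^2}{n} } $$

which is exactly the Gaussian density with mean $\mu = n/2$ and standard deviation $\sigma = \sqrt{n}/2$. A quick numerical check (a sketch in modern code, not de Moivre’s own computation):

```python
from math import comb, exp, pi, sqrt

n, k = 100, 60

# Exact binomial probability of k heads in n fair-coin flips
exact = comb(n, k) * 0.5 ** n

# de Moivre's Gaussian approximation: mean n/2, variance n/4
approx = exp(-2 * (k - n / 2) ** 2 / n) / sqrt(pi * n / 2)

print(f"exact  = {exact:.6f}")   # ~0.010844
print(f"approx = {approx:.6f}")  # ~0.010798
```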

He had found the shape, but its deeper, universal significance was yet to be revealed.

*Figure: the discrete binomial probabilities and the smooth Gaussian curve that approximates them.*

The Main Event: Gauss and the Errors in the Stars

Fast forward to the early 1800s. Gauss was facing a very practical problem: predicting the orbits of celestial bodies from observational data. Every observation carried a small measurement error, so repeated measurements of the same quantity never quite agreed.

So, which measurement is the “true” one?

Gauss started with a few common-sense assumptions about how errors should work: