What is Gradient Descent?

Gradient descent is a general-purpose algorithm for minimizing differentiable functions, not just the linear regression cost function. It's one of the most important building blocks in machine learning, used to train everything from simple linear regression models to deep neural networks.

The Goal

Minimize the cost function $J(w, b)$ by finding the optimal values of parameters $w$ and $b$.

More generally, gradient descent can minimize functions with many parameters: $J(w_1, w_2, \dots, w_n, b)$.
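As a concrete illustration of a cost function $J(w, b)$, here is a minimal sketch of the linear-regression mean squared error over a tiny made-up dataset (the data values and the $\frac{1}{2m}$ scaling are assumptions for the example, not taken from this document):

```python
# A tiny synthetic dataset generated by y = 2x, so J is minimized at w=2, b=0.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

def J(w, b):
    """Mean squared error cost for the line w*x + b over the dataset."""
    m = len(x)
    return sum((w * xi + b - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

print(J(2.0, 0.0))  # 0.0 at the true parameters
print(J(0.0, 0.0))  # positive: a flat line at zero misses every point
```

Gradient descent's job is to find the $(w, b)$ pair at which this surface bottoms out.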

How Gradient Descent Works

Starting Point

Begin with initial guesses for $w$ and $b$ (a common choice is $w = 0$, $b = 0$); gradient descent then iteratively adjusts them to reduce $J(w, b)$.

The Process

  1. Look around — Evaluate which direction decreases $J$ the most.
  2. Take a small step — Move parameters slightly in that direction.
  3. Repeat — Keep adjusting until $J$ reaches a minimum.
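The three steps above can be sketched as a loop. This is a minimal illustration, not the document's own implementation: the quadratic cost $J(w, b) = (w - 3)^2 + (b + 1)^2$ and the learning rate `alpha` are assumptions chosen so the true minimum ($w = 3$, $b = -1$) is known, and the "look around" step is the gradient computed from $J$'s partial derivatives:

```python
def gradient_descent(alpha=0.1, steps=100):
    """Minimize J(w, b) = (w - 3)**2 + (b + 1)**2 by gradient descent."""
    w, b = 0.0, 0.0              # starting point: an arbitrary initial guess
    for _ in range(steps):
        dJ_dw = 2 * (w - 3)      # "look around": dJ/dw points uphill in w
        dJ_db = 2 * (b + 1)      # dJ/db points uphill in b
        w -= alpha * dJ_dw       # "take a small step" opposite the gradient
        b -= alpha * dJ_db
    return w, b                  # "repeat" until J is (near) its minimum

w, b = gradient_descent()
print(w, b)  # approaches (3, -1)
```

The minus sign in the update is what makes the step go *downhill*: the gradient points in the direction of steepest increase, so moving against it decreases $J$.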

The Hill Analogy

Imagine standing on a hilly landscape where:

  - Your position corresponds to the current values of $w$ and $b$.
  - The height of the ground corresponds to the cost $J(w, b)$.

At each step:

  1. Spin around and find the direction of steepest descent.
  2. Take a small step downhill in that direction.
  3. Repeat from your new position until you settle at the bottom of a valley (a minimum of $J$).