What is Gradient Descent?
Gradient descent is a general algorithm for minimizing functions; it is not limited to the linear regression cost function. It's one of the most important building blocks in machine learning, used to train models ranging from simple linear regression to deep neural networks.
The Goal
Minimize the cost function $J(w, b)$ by finding the optimal values of parameters $w$ and $b$.
More generally, gradient descent can minimize functions with many parameters: $J(w_1, w_2, \dots, w_n, b)$.
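For reference, the standard gradient descent update rule in this general case looks as follows; the learning rate $\alpha$ is conventional notation that this section has not yet introduced:

$$
w_j \leftarrow w_j - \alpha \frac{\partial}{\partial w_j} J(w_1, \dots, w_n, b), \qquad
b \leftarrow b - \alpha \frac{\partial}{\partial b} J(w_1, \dots, w_n, b)
$$

All parameters are updated simultaneously on each step, using the partial derivatives evaluated at the current parameter values.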
How Gradient Descent Works
Starting Point
- Initialize parameters (commonly $w = 0$ and $b = 0$).
- For linear regression, the starting values do not matter much, because the squared-error cost function is convex (bowl-shaped) and has a single global minimum.
The Process
- Look around — Evaluate which direction decreases $J$ the most.
- Take a small step — Move parameters slightly in that direction.
- Repeat — Keep adjusting until $J$ settles at or near a minimum (a code sketch of this loop follows the list).
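The loop above can be made concrete with a short sketch. This is a minimal illustration, not a definitive implementation: it assumes the squared-error cost $J(w, b) = \frac{1}{2m}\sum_{i}(w x^{(i)} + b - y^{(i)})^2$ for linear regression, a learning rate `alpha`, a fixed iteration count as the stopping rule, and toy data; none of these specifics come from the section above.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for one-feature linear regression (sketch)."""
    w, b = 0.0, 0.0                     # start at w = 0, b = 0
    for _ in range(num_iters):
        preds = w * x + b               # current predictions
        # Partial derivatives of the squared-error cost J(w, b)
        dj_dw = np.mean((preds - y) * x)
        dj_db = np.mean(preds - y)
        # Take a small step in the direction that decreases J
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b

# Hypothetical example data that roughly follows y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 7.2, 9.0])
w, b = gradient_descent(x, y)
print(w, b)  # drifts toward roughly w ≈ 2, b ≈ 1
```

Each iteration performs the "look around" step (computing the partial derivatives, which point in the direction of steepest increase of $J$) and the "small step" (moving $w$ and $b$ a little in the opposite direction).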
The Hill Analogy
Imagine standing on a hilly landscape where:
- Your position = current values of $w$ and $b$
- Height = value of cost function $J$
- Goal = reach the lowest valley
At each step:
- Spin 360° and look around