The cost function measures how far the model's predictions are from the actual values, giving us a single number that tells us how well the model is doing so we can improve it.
w (the slope) and b (the y-intercept) are the parameters of the model f(x) = wx + b: the variables you adjust during training to improve performance.
| w | b | Result |
|---|---|---|
| 0 | 1.5 | Horizontal line at y = 1.5 |
| 0.5 | 0 | Line through origin with slope 0.5 |
| 0.5 | 1 | Line with slope 0.5, crossing y-axis at 1 |
Goal: Find values of w and b so the line fits the training data well (passes close to the data points).
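To make the roles of w and b concrete, here is a minimal sketch in Python (the function name `predict` and the sample inputs are ours, not from the text) that evaluates the model f(x) = wx + b for each (w, b) pair in the table above:

```python
def predict(x, w, b):
    """Linear model: returns the prediction w*x + b for input x."""
    return w * x + b

# The (w, b) pairs from the table, evaluated at a few inputs
for w, b in [(0, 1.5), (0.5, 0), (0.5, 1)]:
    ys = [predict(x, w, b) for x in (0, 1, 2)]
    print(f"w={w}, b={b}: f(0), f(1), f(2) = {ys}")
```

The first pair prints `[1.5, 1.5, 1.5]`, a constant output regardless of x, matching the horizontal line described in the table.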
For each training example i, compute the error: the difference between the prediction $\hat{y}^{(i)} = f(x^{(i)})$ and the actual value $y^{(i)}$:
$$ \text{error}^{(i)} = \hat{y}^{(i)} - y^{(i)} $$
Squaring makes every error non-negative and penalizes larger errors more heavily (an error of 3 contributes 9, while an error of 1 contributes only 1):
$$ \bigl(\hat{y}^{(i)} - y^{(i)}\bigr)^2 $$
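As a quick numeric illustration (the prediction and target values below are made up for demonstration), here are the errors and their squares for m = 3 training examples:

```python
# Hypothetical predictions and actual values for m = 3 examples
y_hat = [1.0, 2.5, 3.0]
y = [1.5, 2.0, 3.5]

errors = [yh - yi for yh, yi in zip(y_hat, y)]  # per-example error
squared = [e ** 2 for e in errors]              # squared error
print(errors)   # [-0.5, 0.5, -0.5]  (signs would partly cancel if summed raw)
print(squared)  # [0.25, 0.25, 0.25] (all non-negative)
```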
Sum the squared errors over all m training examples:
$$ \sum_{i=1}^{m} \bigl(\hat{y}^{(i)} - y^{(i)}\bigr)^2 $$
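A minimal sketch that puts the three steps together, assuming `y_hat` and `y` are equal-length lists of floats (the function name `sum_squared_errors` is ours):

```python
def sum_squared_errors(y_hat, y):
    """Sum of squared prediction errors over all m training examples."""
    return sum((yh - yi) ** 2 for yh, yi in zip(y_hat, y))

# Using the same hypothetical values as above:
print(sum_squared_errors([1.0, 2.5, 3.0], [1.5, 2.0, 3.5]))  # 0.25 * 3 = 0.75
```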