Loss function (cost function)

The loss function quantifies our unhappiness with predictions on the training set

convention: the lower the loss, the better the performance

classification: SVM loss and Softmax loss, see Linear Classification
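
As a rough illustration of the two classification losses named above, here is a minimal sketch for a single training example. The function names (svm_loss, softmax_loss), the margin of 1.0, and the example scores are assumptions made for illustration; they are not taken from these notes.

```python
import math

def svm_loss(scores, correct, margin=1.0):
    """Multiclass SVM (hinge) loss for one example.

    scores  : list of class scores
    correct : index of the true class
    margin  : assumed margin (1.0 is a common choice)
    """
    return sum(
        max(0.0, s - scores[correct] + margin)
        for j, s in enumerate(scores)
        if j != correct  # the true class contributes no loss
    )

def softmax_loss(scores, correct):
    """Softmax (cross-entropy) loss for one example, using natural log."""
    m = max(scores)  # subtract the max score for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[correct]

# Hypothetical scores for three classes; class 0 is the true class.
scores = [3.2, 5.1, -1.7]
print(svm_loss(scores, 0))
print(softmax_loss(scores, 0))
```

Both losses shrink as the score of the true class grows relative to the others, which matches the convention that lower loss means better performance.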

regression:

Regularization

TA Explanation: We add biases to account for the priors of the input data to that layer. We don’t want to penalize a bias for being large; if an input always has a lot more red, we want the bias term to absorb that. This is why the regularization penalty is applied to the weights but not to the biases. (Edited from False to True based on https://piazza.com/class/j0vi72697xc49k?cid=1473 )
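
The point about not penalizing biases can be sketched as follows. This assumes an L2 penalty, which the notes do not specify; the names l2_penalty, total_loss, and reg_strength are hypothetical and chosen only for illustration.

```python
def l2_penalty(weights, reg_strength):
    """Assumed L2 penalty: reg_strength times the sum of squared weights."""
    return reg_strength * sum(w * w for row in weights for w in row)

def total_loss(data_loss, weights, biases, reg_strength):
    """Data loss plus a regularization term on the weights only.

    biases is accepted to make the omission explicit:
    it is intentionally left out of the penalty.
    """
    return data_loss + l2_penalty(weights, reg_strength)

# Hypothetical values: large biases do not change the total loss.
W = [[1.0, 2.0], [0.0, -1.0]]
small_b = [0.1, -0.2]
large_b = [100.0, -50.0]
print(total_loss(1.0, W, small_b, 0.5))
print(total_loss(1.0, W, large_b, 0.5))
```

With these inputs both calls return the same value, because only the entries of W enter the penalty.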

advantage: