7 Regularization via Objective Function

 

This chapter covers

  • Norm-based penalties and their use in regularizing a model estimation process
  • L2-norm-based penalty in ridge regression and its implementation
  • L1-norm-based penalty in LASSO regression and its implementation
  • Derivation of closed-form solutions for both ridge and LASSO regression
  • Empirical analysis of the effect of different regularization schemes on training feed-forward neural networks

The objective function is the final stop of the forward pass when training a neural network. Also referred to as the cost function, it measures the quality of the predictions under the network's current parameter configuration and guides the weight updates in the next iteration via gradient descent. By aggregating the prediction error over all observations, the objective function produces a single metric for downstream optimization: its value is small when the predictions are close to the target values (on average) and large otherwise.
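To make this concrete, the following is a minimal sketch, assuming NumPy and mean squared error as one common choice of objective (the function name mse_objective is illustrative, not taken from this chapter's code):

import numpy as np

def mse_objective(y_true, y_pred):
    # Average the squared errors over all observations to obtain
    # a single scalar summarizing the prediction quality.
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([1.0, 2.0, 3.0])
print(mse_objective(y_true, np.array([1.1, 1.9, 3.2])))  # close predictions: small value
print(mse_objective(y_true, np.array([3.0, 0.0, 6.0])))  # far-off predictions: large value

Averaging (rather than summing) keeps the metric comparable across datasets of different sizes, which is why the mean appears in most formulations of this objective.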

7.1 Introducing the regularization term

7.1.1 The unregularized linear regression

7.1.2 Norm-based penalty

7.2 L2 regularization in ridge regression

7.2.1 Using the analytic solution

7.2.2 Using the gradient descent algorithm

7.2.3 Handling the bias term

7.2.4 L2 regularization in action

7.3 Sparse estimation via LASSO

7.3.1 Geometric interpretation of ridge regression

7.3.2 Introducing the L0 norm

7.3.3 Introducing the L1 norm

7.3.4 Understanding LASSO

7.3.5 The soft-thresholding rule

7.3.6 LASSO in action

7.4 Summary
