Regularization
Regularization is a process to avoid overfitting in Machine Learning models.
Loss Function
The training loss function can be adjusted by adding a penalty term, using one of the following regression analysis methods:
L1 / Lasso Regression
Also known as the Least Absolute Shrinkage and Selection Operator (LASSO).
The penalty term is equal to the absolute sum of the coefficients, multiplied by a tuning coefficient that controls the overall penalty strength. This drives some coefficients exactly to zero, so Lasso also performs feature selection.
If the number of predictors is greater than the number of observations, Lasso will select at most as many non-zero predictors as there are observations, even if all predictors are relevant.
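As an illustrative sketch (not a reference implementation), the L1-penalized loss can be minimized with proximal gradient descent (ISTA), whose soft-thresholding step is what sets small coefficients exactly to zero. The function names and data below are hypothetical; numpy is assumed to be available.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 penalty: shrink toward zero,
    # setting entries with magnitude below t exactly to zero.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, alpha, n_iter=1000):
    # Minimize (1/2n)||y - Xw||^2 + alpha * ||w||_1 by proximal gradient.
    n, p = X.shape
    w = np.zeros(p)
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * alpha)
    return w

# Synthetic data: only 3 of 10 features actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]
y = X @ true_w + 0.1 * rng.normal(size=100)

w = lasso_ista(X, y, alpha=0.1)
print(np.round(w, 2))  # coefficients of irrelevant features are driven to zero
```

The printed vector should show the irrelevant coefficients at exactly zero, which is the feature-selection behavior described above.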
L2 / Ridge Regression
Also known as Tikhonov regularization.
The penalty term is equal to the sum of the squared coefficients, multiplied by a tuning coefficient that controls the overall penalty strength.
Ridge regression decreases the complexity of a model but does not reduce the number of variables: coefficients shrink toward zero but never reach it, so it is not suited to feature reduction.
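Unlike Lasso, ridge regression has a closed-form solution, which the sketch below uses to show shrinkage without sparsity. The helper name and data are hypothetical; numpy is assumed.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    # Closed-form ridge solution: w = (X^T X + alpha * I)^{-1} X^T y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=100)

w_small = ridge_fit(X, y, alpha=0.1)
w_large = ridge_fit(X, y, alpha=100.0)

# A larger penalty shrinks the coefficient vector toward zero...
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
# ...but no coefficient becomes exactly zero, so no features are removed.
print(np.count_nonzero(w_large))
```

Comparing the two fits makes the contrast with Lasso concrete: the norm of the coefficients drops as the penalty grows, but all ten coefficients stay non-zero.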
Model Training Time
Model training can be stopped early, at the optimal point of the bias-variance tradeoff between underfitting and overfitting (this is known as early stopping):
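A minimal early-stopping sketch, assuming gradient descent on a held-out validation split: training halts once the validation loss stops improving for a set number of steps (the patience), and the best weights seen so far are kept. All names and hyperparameters here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data split into training and validation sets.
X = rng.normal(size=(200, 30))
w_true = np.zeros(30)
w_true[:5] = rng.normal(size=5)
y = X @ w_true + 0.5 * rng.normal(size=200)
X_tr, y_tr = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

w = np.zeros(30)
lr = 0.01
patience = 20          # how many non-improving steps to tolerate
best_val = np.inf
best_w = w.copy()
bad_steps = 0

for epoch in range(5000):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= lr * grad
    val_loss = np.mean((X_val @ w - y_val) ** 2)
    if val_loss < best_val - 1e-6:
        best_val, best_w, bad_steps = val_loss, w.copy(), 0
    else:
        bad_steps += 1
        if bad_steps >= patience:
            break  # validation loss has stopped improving: stop training

print(epoch, best_val)
```

Keeping `best_w` rather than the final `w` is the key design choice: the weights returned are those from the point where validation performance peaked, not from the step where training was interrupted.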