
Why is the regularization parameter not applied to the intercept parameter?

From what I have read about the cost functions for linear and logistic regression, the regularization parameter (λ) is applied to every term except the intercept. For example, here are the regularized cost functions for linear and logistic regression respectively (notice that j starts from 1):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
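The "j starts from 1" convention can be sketched directly in code. Here is a minimal NumPy illustration of the regularized linear-regression cost (the function and variable names are mine, and the first column of `X` is assumed to be the all-ones bias column):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost: the penalty sum starts at
    j = 1, so theta[0] (the intercept) is excluded from the penalty."""
    m = len(y)
    residuals = X @ theta - y               # h_theta(x^(i)) - y^(i) for every example
    penalty = lam * np.sum(theta[1:] ** 2)  # theta[1:] skips the intercept
    return (residuals @ residuals + penalty) / (2 * m)

# Toy data: the first column of X is the bias column of ones.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])
theta = np.array([0.0, 1.0])  # fits the toy data exactly
```

With `lam = 0` the cost is just the mean squared residual; a nonzero `lam` adds a penalty on every parameter except `theta[0]`.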

N.M
    The question has been answered [here](https://stats.stackexchange.com/questions/153605/no-regularisation-term-for-bias-unit-in-neural-network) – nimar Jun 07 '20 at 05:36

2 Answers


Regularization tries to reduce the chance of overfitting by reducing the model's sensitivity to small changes in the input data. This is much less of an issue for the intercept term than for the coefficients, so the intercept is often left out of the penalty.

Brian Spiering
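One way to see this in code is a closed-form ridge solve where the penalty matrix has a zero in its first diagonal entry, so the intercept is never shrunk. This is a sketch with names of my own choosing, not any particular library's API; even under an enormous λ the slope is crushed toward zero while the intercept is free to move to the mean of y:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Normal equations with an L2 penalty; L[0, 0] is zeroed so
    theta[0] (the intercept) is excluded from the shrinkage."""
    n = X.shape[1]
    L = lam * np.eye(n)
    L[0, 0] = 0.0  # no penalty on the intercept
    return np.linalg.solve(X.T @ X + L, X.T @ y)

# First column of X is the bias column of ones; y = 2x + 3.
x = np.arange(5.0)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 * x + 3.0

theta = ridge_fit(X, y, 1e9)
# The slope theta[1] is shrunk toward 0, but the intercept theta[0]
# settles near mean(y) = 7 rather than being pushed toward 0.
```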

The main aim of regularization is to shrink the parameter values, which in turn reduces the effect of the feature associated with each parameter on the model's predictions. Since the intercept θ₀ has no feature associated with it, we don't penalize it.

mewbie
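The same logic shows up in the gradient-descent update rule: the shrinkage term (λ/m)·θⱼ is added for j ≥ 1 only, while θ₀ gets the plain unregularized update. A minimal sketch (my own function and variable names, with the first column of `X` assumed to be the constant-ones bias column):

```python
import numpy as np

def gradient_step(theta, X, y, lam, alpha):
    """One regularized gradient-descent update for linear regression.
    The shrinkage term (lam / m) * theta_j is applied for j >= 1 only;
    theta[0] receives the plain, unregularized update."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += (lam / m) * theta[1:]  # intercept excluded from the penalty gradient
    return theta - alpha * grad

# Toy data: bias column of ones plus one real feature.
X = np.column_stack([np.ones(3), np.array([0.0, 1.0, 2.0])])
y = np.array([1.0, 3.0, 5.0])
```

Comparing an update with λ = 0 against one with a large λ, the intercept update is identical in both, while the slope update differs, exactly because only θ₁ is penalized.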