
Why is the regularization parameter not applied to the intercept parameter?

From what I have read about the cost functions for linear and logistic regression, the regularization parameter (λ) is applied to every term except the intercept. For example, here are the regularized cost functions for linear and logistic regression respectively (notice that j starts from 1):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
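The "j starts from 1" convention can be sketched directly in code. Here is a minimal NumPy illustration of the regularized linear-regression cost (the function and variable names are mine, and the first column of `X` is assumed to be the all-ones bias column):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost: the penalty sum starts at
    j = 1, so theta[0] (the intercept) is excluded from the penalty."""
    m = len(y)
    residuals = X @ theta - y               # h_theta(x^(i)) - y^(i) for every example
    penalty = lam * np.sum(theta[1:] ** 2)  # theta[1:] skips the intercept
    return (residuals @ residuals + penalty) / (2 * m)

# Toy data: the first column of X is the bias column of ones.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])
theta = np.array([0.0, 1.0])  # fits the toy data exactly
```

With `lam = 0` the cost is just the mean squared residual; a nonzero `lam` adds a penalty on every parameter except `theta[0]`.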

N.M
    The question has been answered [here](https://stats.stackexchange.com/questions/153605/no-regularisation-term-for-bias-unit-in-neural-network) – nimar Jun 07 '20 at 05:36

2 Answers


Regularization tries to reduce the chance of overfitting by reducing the model's sensitivity to small changes in the input data. This is much less of an issue for the intercept term than for the coefficients, so the intercept is often left out of the penalty.

Brian Spiering
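One way to see this in code is a closed-form ridge solve where the penalty matrix has a zero in its first diagonal entry, so the intercept is never shrunk. This is a sketch with names of my own choosing, not any particular library's API; even under an enormous λ the slope is crushed toward zero while the intercept is free to move to the mean of y:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Normal equations with an L2 penalty; L[0, 0] is zeroed so
    theta[0] (the intercept) is excluded from the shrinkage."""
    n = X.shape[1]
    L = lam * np.eye(n)
    L[0, 0] = 0.0  # no penalty on the intercept
    return np.linalg.solve(X.T @ X + L, X.T @ y)

# First column of X is the bias column of ones; y = 2x + 3.
x = np.arange(5.0)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 * x + 3.0

theta = ridge_fit(X, y, 1e9)
# The slope theta[1] is shrunk toward 0, but the intercept theta[0]
# settles near mean(y) = 7 rather than being pushed toward 0.
```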

The main aim of regularization is to shrink the parameter values, which in turn reduces the effect of the feature associated with each parameter on the model's predictions. Since the intercept θ₀ has no feature associated with it, we don't penalize it.

mewbie
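The same logic shows up in the gradient-descent update rule: the shrinkage term (λ/m)·θⱼ is added for j ≥ 1 only, while θ₀ gets the plain unregularized update. A minimal sketch (my own function and variable names, with the first column of `X` assumed to be the constant-ones bias column):

```python
import numpy as np

def gradient_step(theta, X, y, lam, alpha):
    """One regularized gradient-descent update for linear regression.
    The shrinkage term (lam / m) * theta_j is applied for j >= 1 only;
    theta[0] receives the plain, unregularized update."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += (lam / m) * theta[1:]  # intercept excluded from the penalty gradient
    return theta - alpha * grad

# Toy data: bias column of ones plus one real feature.
X = np.column_stack([np.ones(3), np.array([0.0, 1.0, 2.0])])
y = np.array([1.0, 3.0, 5.0])
```

Comparing an update with λ = 0 against one with a large λ, the intercept update is identical in both, while the slope update differs, exactly because only θ₁ is penalized.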