I am currently going through the Normal Equation in machine learning:
$$ \hat\theta = (X^T \cdot X)^{-1} \cdot X^T \cdot y $$
But when I look at how this equation is actually used, I find that an additional column of 1s is always added at the start of the matrix X before it is transposed.
I don't understand why. What's the logic behind this?
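To make sure we are talking about the same thing, this is the shape I mean (my own notation, for $m$ instances and $n$ features):

$$
X_b =
\begin{bmatrix}
1 & x_1^{(1)} & \cdots & x_n^{(1)} \\
1 & x_1^{(2)} & \cdots & x_n^{(2)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_1^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
$$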
The places where I have seen this:
1) Coursera - Theory
2) Implementation (from Géron's book):
Now let’s compute $\hat\theta$ using the Normal Equation. We will use the `inv()` function from NumPy’s Linear Algebra module (`np.linalg`) to compute the inverse of a matrix, and the `dot()` method for matrix multiplication:
```python
X_b = np.c_[np.ones((100, 1)), X]  # add x0 = 1 to each instance
```

Géron, Aurélien. *Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems* (p. 111). O'Reilly Media. Kindle Edition.
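For context, here is a minimal runnable version of that step. Only the `X_b` line is quoted from the book; the synthetic `X` and `y` are my own stand-ins so the snippet is self-contained:

```python
import numpy as np

# Stand-in data: 100 instances of a single feature with a roughly linear target
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

X_b = np.c_[np.ones((100, 1)), X]  # add x0 = 1 to each instance

# Normal Equation: theta = (X^T X)^{-1} X^T y
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print(theta_best)  # note: two parameters come out, not one, once the 1s column is added
```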