I am currently going through the Normal Equation in machine learning:
$$ \hat\theta = (X^T \cdot X)^{-1} \cdot X^T \cdot y $$
But when I look at how this equation is actually used, I find that an additional column of 1s is always added at the start of the matrix X before it is transposed.
I don't understand why. What's the logic behind this?
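To make sure we are talking about the same thing, this is the shape I mean (my own notation, for $m$ instances and $n$ features):

$$
X_b =
\begin{bmatrix}
1 & x_1^{(1)} & \cdots & x_n^{(1)} \\
1 & x_1^{(2)} & \cdots & x_n^{(2)} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_1^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
$$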
The places where I have seen this:
1) Coursera - Theory
2) Implementation (from Géron's book):
Now let’s compute $\hat\theta$ using the Normal Equation. We will use the `inv()` function from NumPy’s Linear Algebra module (`np.linalg`) to compute the inverse of a matrix, and the `dot()` method for matrix multiplication:
```python
X_b = np.c_[np.ones((100, 1)), X]  # add x0 = 1 to each instance
```

Géron, Aurélien. *Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems* (p. 111). O'Reilly Media. Kindle Edition.
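For context, here is a minimal runnable version of that step. Only the `X_b` line is quoted from the book; the synthetic `X` and `y` are my own stand-ins so the snippet is self-contained:

```python
import numpy as np

# Stand-in data: 100 instances of a single feature with a roughly linear target
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

X_b = np.c_[np.ones((100, 1)), X]  # add x0 = 1 to each instance

# Normal Equation: theta = (X^T X)^{-1} X^T y
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print(theta_best)  # note: two parameters come out, not one, once the 1s column is added
```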