I am having difficulty understanding the expected squared error formula on this website:
$y=f(x)+e$ (the true regression model)
$\hat{y}=\hat{f}(x)$ (your estimated regression line)
$error(x)=\bigg(\mathbb{E}[\hat{f}(x)]-f(x)\bigg)^2+\mathbb{E}\bigg[\big(\hat{f}(x)-\mathbb{E}[\hat{f}(x)]\big)^2\bigg]+\operatorname{Var}(e)=\text{bias}^2+\text{variance}+\text{irreducible error}$
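For context, here is how I reconstruct where this decomposition comes from: add and subtract $\mathbb{E}[\hat{f}(x)]$ inside the squared error, and use that $e$ has mean zero and is independent of $\hat{f}$, so the cross terms vanish:

$$\begin{aligned}
\mathbb{E}\big[(y-\hat{f}(x))^2\big]
&=\mathbb{E}\big[(f(x)+e-\hat{f}(x))^2\big]\\
&=\mathbb{E}\big[(f(x)-\hat{f}(x))^2\big]+\operatorname{Var}(e)\\
&=\big(\mathbb{E}[\hat{f}(x)]-f(x)\big)^2+\mathbb{E}\Big[\big(\hat{f}(x)-\mathbb{E}[\hat{f}(x)]\big)^2\Big]+\operatorname{Var}(e).
\end{aligned}$$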
My question is: what is the best way to understand bias and variance from this equation?
My understanding of the variance term is this:
Assume you collect a sample of 1000 records and fit a regression line to it, obtaining the model $\hat{f}(x)=1+2x$.
So at a given $x$ value, for example $x=1$, you get $\hat{f}(1)=1+2\cdot 1=3$.
Then you re-sample another 1000 records and fit a new regression line:
$\hat{f}(x)=2+3x$
so when $x=1$, $\hat{f}(1)=5$.
You repeat this process $N$ times (assume $N$ is a large number, close to infinity).
Then $\mathbb{E}[\hat{f}(1)]$ is approximately $(3+5+\dots)/N$, the average of the $N$ predictions.
Assume your $\mathbb{E}[\hat{f}(1)]$ is $5.5$.
Once you have estimated $\mathbb{E}[\hat{f}(x)]$ at $x=1$, you should be able to compute $\mathbb{E}\bigg[\big(\hat{f}(x)-\mathbb{E}[\hat{f}(x)]\big)^2\bigg]$ as
$((3-5.5)^2+(5-5.5)^2+\dots)/N$
My interpretation above focuses on the variance of $\hat{f}(x)$ at each given $x$-point (re-sampling to get a new $\hat{f}(x)$ each time).
But my instinct tells me I should calculate everything from only one data set (1000 records) and compute $\mathbb{E}[\hat{f}(x)]$ across different $x$ points.
Please let me know which interpretation is correct; if both are wrong, please explain, using this formula, how I should interpret its terms.