Questions tagged [variance]
122 questions
31
votes
5 answers
Why is underfitting called high bias and overfitting called high variance?
I have been using terms like underfitting/overfitting and bias-variance tradeoff for quite a while in data science discussions, and I understand that underfitting is associated with high bias and overfitting is associated with high variance. But…
Vaibhav Thakur
- 2,333
- 3
- 11
- 9
23
votes
3 answers
What is the meaning of the term variance in a machine learning model?
I am familiar with the terms high bias and high variance and their effect on the model.
Basically, your model has high variance when it is too complex and sensitive even to outliers.
But recently I was asked the meaning of the term variance in machine…
Sociopath
- 1,223
- 2
- 11
- 27
9
votes
2 answers
RL advantage function: why A = Q - V instead of A = V - Q?
In RL Course by David Silver - Lecture 7: Policy Gradient Methods, David explains what an advantage function is and how it is the difference between Q(s,a) and V(s).
Preliminary, from this post:
First recall that a policy $\pi$ is a mapping…
Kari
- 2,686
- 1
- 17
- 47
9
votes
3 answers
How to estimate the variance of regressors in scikit-learn?
Every classifier in scikit-learn has a method predict_proba(x) that predicts class probabilities for x. How can I do the same thing for regressors?
The only regressor for which I know how to estimate the variance of the predictions is Gaussian process…
Vladislav Gladkikh
- 1,086
- 9
- 18
8
votes
1 answer
Question on bias-variance tradeoff and means of optimization
I was wondering how one can best optimize the model they are trying to build when confronted with the issues caused by high bias or high variance. Now, of course, you can play with the regularization parameter to get to a…
Zer0k
- 155
- 5
8
votes
3 answers
Overfitting Naive Bayes
My question is what are potential reasons for Naive Bayes to perform well on a train set but poorly on a test set?
I am working with a variation of the 20news dataset. The dataset has documents, which are represented as "bag of words" with no…
Atte Juvonen
- 323
- 2
- 5
- 8
6
votes
3 answers
What are bias and variance in machine learning?
I am studying machine learning, and I have encountered the concept of bias and variance. I am a university student and in the slides of my professor, the bias is defined as:
$\text{bias} = E[\text{error}_s(h)] - \text{error}_d(h)$
where $h$ is the hypothesis and…
J.D.
- 841
- 4
- 15
- 29
5
votes
2 answers
Elimination of features based on high covariance without affecting performance?
I came across a question whose answer left me with serious doubts.
Suppose we have a dataset $A=\{x_1,x_2,y\}$ in which $x_1$ and $x_2$ are our features and $y$ is the label.
Also, suppose that the covariance matrix of these three random variables is…
user307393
- 51
- 1
5
votes
2 answers
Bagging vs Boosting, Bias vs Variance, Depth of trees
I understand the main principle of bagging and boosting for classification and regression trees. My doubts are about the optimization of the hyperparameters, especially the depth of the trees.
First question: why are we supposed to use weak learners…
K.Hua
- 153
- 6
5
votes
1 answer
Variance in statistics vs machine learning
In basic statistics, variance is a measure of the variability of the data about its mean. In machine learning, variance is a measure of learning the training data too well/capturing the noise in the data/oversensitivity to the small local fluctuations…
MAA
- 151
- 1
5
votes
2 answers
Evaluation of regression models with different evaluations (MSE, variance, VAF etc.)
When comparing several regression models in terms of quality, it seems like most have agreed on the MSE.
There are also papers comparing "variance" and "variance accounted for (VAF)".
However, there seems to be a controversial opinion about the…
MerklT
- 183
- 3
5
votes
2 answers
How to decide what threshold to use for removing low-variance features?
In particular, I have 100000 features and the variances look like:
Could I, e.g., take the average and use it to split the features roughly in half?
Or some other method of grouping?
mavavilj
- 416
- 1
- 3
- 12
4
votes
2 answers
Trade off between Bias and Variance
What are the best ideas or approaches for trading off bias and variance in machine learning models?
deepguy
- 1,441
- 7
- 18
- 38
4
votes
2 answers
How can I calculate mean and variance incrementally?
Say I have a set S of values, and want to store in a database some summary information about that set, so that later when I acquire a new value v I can make a reasonable estimate of what the summary information would be about the set S ∪ {v} ---…
dubiousjim
- 181
- 6
4
votes
2 answers
How do you set sigma for the Gaussian similarity kernel?
Let's say we have $n$ two-dimensional vectors: $$\mathbf{x}_1,\dots,\mathbf{x}_i,\dots,\mathbf{x}_n=(x_{1_1},x_{1_2})^T,\dots,(x_{i_1},x_{i_2})^T,\dots,(x_{n_1},x_{n_2})^T$$ How do you set $\sigma$ for the Gaussian similarity…
Diego Sacconi
- 45
- 1
- 6
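A note on the "How can I calculate mean and variance incrementally?" question listed above: one common way to do this is Welford's online algorithm, which keeps only three numbers as the stored summary (a count, the running mean, and the running sum of squared deviations). The sketch below is an illustration under that assumption, not necessarily what the answers to that question recommend; the class name `RunningStats` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical summary record: enough to update mean and variance
    one value at a time without revisiting the original set S."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, v: float) -> None:
        """Fold a new value v into the summary (Welford's update)."""
        self.count += 1
        delta = v - self.mean           # deviation from the old mean
        self.mean += delta / self.count
        self.m2 += delta * (v - self.mean)  # uses the new mean

    def variance(self, sample: bool = True) -> float:
        """Sample variance (n - 1) by default, population variance (n) otherwise.
        Returns NaN when there are too few values to define it."""
        denom = self.count - 1 if sample else self.count
        if denom <= 0:
            return float("nan")
        return self.m2 / denom


if __name__ == "__main__":
    stats = RunningStats()
    for v in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(v)
    print(stats.mean, stats.variance(sample=False), stats.variance())
```

Storing `count`, `mean`, and `m2` (rather than a sum and a sum of squares) tends to be more numerically stable when the values are large relative to their spread, which is the usual reason this formulation is preferred.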