I have a binary classification problem (target 0 or 1) with both continuous and categorical features. I understand that the Chi-square test can only be used to evaluate categorical features. What about ANOVA (f_classif)? Is it the same, i.e. can it only evaluate categorical features? Thank you in advance.
Why do you have to select features at all? – Dave Dec 25 '22 at 03:54
2 Answers
The Chi-square test is a statistical test that compares the observed frequencies of a categorical variable with the frequencies expected under the assumption of independence. For feature selection, it tests whether a categorical feature is independent of the target, so it can be used to rank the categorical features of a classification model.
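As a rough illustration of the Chi-square scoring described above, here is a minimal sketch using scikit-learn's `chi2`. The `color` feature and the target values are made-up toy data, not taken from the question, and the one-hot encoding step is an assumption about how the categorical feature would be prepared.

```python
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one categorical feature and a binary target.
color = np.array([["red"], ["red"], ["blue"], ["blue"],
                  ["green"], ["green"], ["red"], ["blue"]])
y = np.array([1, 1, 0, 0, 1, 0, 1, 0])

# chi2 expects non-negative values such as counts or one-hot indicators.
# Columns come out in sorted category order: blue, green, red.
X = OneHotEncoder().fit_transform(color).toarray()

# One score and one p-value per one-hot column.
scores, p_values = chi2(X, y)
print(scores, p_values)
```

Here "green" appears equally often in both classes, so it gets a score of zero, while "blue" and "red" each appear in only one class and score higher.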
ANOVA (Analysis of Variance) is a statistical test that compares the means of two or more groups. Its F-statistic (also called the F-value or F-score) measures how large the differences between the group means are relative to the variation within the groups. In feature selection with f_classif, the classes of the target define the groups, and the test is applied to each feature separately: a large F-value means that the feature's mean differs significantly between the classes. The ANOVA F-test is therefore suited to scoring continuous features for a classification model; it is not restricted to categorical features.
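To make this concrete, here is a small sketch with scikit-learn's `f_classif` on two continuous features. The data are synthetic and the names `informative` and `noise` are purely illustrative: one feature's mean is made to shift with the class and the other is unrelated to it.

```python
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(0)

# Hypothetical binary target: 50 samples of class 0, then 50 of class 1.
y = np.repeat([0, 1], 50)

# Continuous feature whose mean depends on the class (0 vs 2).
informative = rng.normal(loc=y * 2.0, scale=1.0)
# Continuous feature with no relation to the class.
noise = rng.normal(size=100)

X = np.column_stack([informative, noise])

# f_classif runs a one-way ANOVA per column, grouping rows by y.
F, p = f_classif(X, y)
print(F, p)
```

The feature whose mean differs between the classes should receive a much larger F-value (and smaller p-value) than the noise feature.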
ANOVA is a collection of methods for analyzing the difference between two or more group means. Categorical features (typically called factors) create the groups. ANOVA's focus is on statistical significance, not on finding the best features.
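A hand-rolled sketch of the one-way ANOVA F-statistic may help show how a categorical variable creates the groups. In the classification setting from the question, the binary target plays the role of the factor. The helper `one_way_f` and the numbers below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F-statistic for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical continuous feature and binary target (the factor).
feature = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
target = [0, 0, 0, 1, 1, 1]

# The target's levels split the feature values into groups.
groups = [[x for x, t in zip(feature, target) if t == c]
          for c in sorted(set(target))]

f_stat = one_way_f(groups)
print(f_stat)  # 24.0
```

The group means are 2 and 6 around a grand mean of 4, giving a between-group sum of squares of 24 and a within-group sum of squares of 4, hence F = (24 / 1) / (4 / 4) = 24.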