Highest Voted 'anova' Questions - Data Science Stack Exchange

2

votes

1 answer

Score of ANOVA in selected features

I selected features using ANOVA (because I have Numerical data as input and Categorical data as target): anova = SelectKBest(score_func=f_classif, k='all') anova.fit(X_train, y_train.values.argmax(1)) # y_train.values.argmax(1) because I already…

feature-selection score anova

asked Jun 19 '21 at 12:55

Mimi

45
7

2

votes

1 answer

What conclusion can I draw when a variable is influenced by others but there isn't any correlation?

I am doing an exploratory analysis. If the target is a continuous variable and the attributes are all categorical (discrete values), in order to know whether each attribute has any influence on the target I am running an ANOVA test…

statistics correlation anova exploratory-factor-analysis

asked Jul 30 '20 at 15:40

Tlaloc-ES

337
1
6

1

vote

0 answers

Levene test for equal variance

I would like to run a one-way ANOVA test on my data. I saw that one of the assumptions of one-way ANOVA is homogeneity of variances. I have run the test on different data sets. I find sometimes my p-values are larger than…

variance anova pvalue

asked Jan 13 '21 at 08:53

Reut

349
2
13

1

vote

1 answer

When should mutual information be used for feature selection over other feature selection methods like correlation, ANOVA, etc.?

I have a data set with categorical and continuous/ordinal explanatory variables and a continuous target variable. I tried to filter features using one-way ANOVA for categorical variables and using Spearman's correlation coefficient for…

machine-learning feature-selection mutual-information anova spearmans-rank-correlation

asked Jun 17 '20 at 00:01

Ankita Talwar

307
1
10

1

vote

1 answer

Question on ANOVA and Correlation/Association

I've been working on examining statistical relationships between variables: Pearson's and Spearman's for continuous variables; Kendall's tau and Cramér's V for ordinal/nominal variables. I know there are many more ways. Recently I read about ANOVA and…

statistics correlation statsmodels anova

asked May 04 '20 at 21:29

rocksNwaves

309
1
10

1

vote

2 answers

Are chi-square and ANOVA (f_classif) appropriate for selecting the best features?

I have a binary classification problem (target 0 or 1) with both continuous and categorical features. I understood that with chi-square I can evaluate only categorical features. What about ANOVA (f_classif)? It's the…

machine-learning python data-science-model chi-square-test anova

asked Dec 24 '22 at 11:46

SimoneA

41
3

0

votes

1 answer

Pass a variable-length argument to mstats.kruskalwallis

I am trying to run the Kruskal-Wallis test on multiple columns of my data. For that I wrote a function: var=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'] def kruskawallis_test(column): …

machine-learning statistics non-parametric anova

asked May 20 '20 at 06:06

Ayush Ranjan

401
1
4
13
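The variable-length call asked about above can be sketched with Python's `*` unpacking. The block below is only an illustration under stated assumptions: `kruskal_h`, `columns`, and `selected` are hypothetical names, the data is made up, and the H statistic is computed by hand (without tie correction) so the sketch runs with the standard library alone. `mstats.kruskalwallis` takes its samples as separate positional arguments, so the same unpacking pattern should carry over to it.

```python
from itertools import chain


def kruskal_h(*samples):
    """Kruskal-Wallis H statistic (no tie correction) for any number of samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    pooled = sorted(chain.from_iterable(samples))
    n_total = len(pooled)

    # Map each distinct value to its 1-based rank; tied values share the mean rank.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # Sum of (rank sum squared / sample size) over all samples.
    total = 0.0
    for sample in samples:
        rank_sum = sum(rank_of[x] for x in sample)
        total += rank_sum ** 2 / len(sample)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical data: column name -> observations.
columns = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
selected = ["a", "b", "c"]

# Build the list of samples at run time, then unpack it into positional arguments.
h_stat = kruskal_h(*(columns[name] for name in selected))
print(round(h_stat, 6))
```

Because the samples are collected into an iterable first, the number of columns passed to the test can change without touching the call itself.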

0

votes

1 answer

What does it mean to have 1 degree of freedom in ANOVA test?

So I used Python to run a multi-factorial ANOVA on a data set. I first used ols.fit() and then the anova_lm function. I realized that for the variables I am analyzing, the degrees of freedom are 1. Does that mean only 1 value out of my data is…

python anova

asked May 07 '20 at 06:35

kalone_mevin

3
2
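As a rough illustration of the degrees-of-freedom question above: in a one-way layout, a categorical factor with k levels uses k - 1 degrees of freedom, and the residual uses N - k, so a two-level factor shows 1 degree of freedom however many observations it has. The function name and data below are hypothetical, and the sketch does not reproduce the asker's model.

```python
def one_way_anova_df(labels):
    """Return (factor df, residual df) for a one-way ANOVA on group labels."""
    n_levels = len(set(labels))
    df_factor = n_levels - 1              # k - 1
    df_residual = len(labels) - n_levels  # N - k
    return df_factor, df_residual


# Hypothetical two-level factor with 20 observations.
treatment = ["control", "drug"] * 10
print(one_way_anova_df(treatment))  # (1, 18)

# Hypothetical three-level factor with 15 observations.
dose = ["low", "mid", "high"] * 5
print(one_way_anova_df(dose))  # (2, 12)
```

The factor's degrees of freedom count the independent comparisons between group means, not the number of data points that were used.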

0

votes

0 answers

How to check for correlations using ANOVA between the selected feature columns and the target feature column

Suppose I have a set of feature columns that I select to check for correlations with the target feature 'is_Star', which has binary values, while the feature columns consist of numeric values. Why do I end up getting an array of NaN values? Also, I have…

python classification pandas anova

asked Jul 06 '23 at 22:11

meKafka

1
1

0

votes

0 answers

Huynh-Feldt vs Growth Curve Analysis

Are there cases where the Huynh-Feldt and Greenhouse-Geisser corrected F tests are less conservative than the Conservative Growth Curve approach for ANOVA analysis? The textbook Design of Experiments in R has a problem in which we find that…

anova

asked May 13 '23 at 23:35

Lyle

1

0

votes

0 answers

Why do you need more subjects than levels in RM ANOVA?

Honestly, I'm hoping I even asked this question appropriately. It's one of those things I never quite grasped, but just accepted to be true, that when running a within-subjects ANOVA, you have to have a lot of subjects or otherwise you can't run the…

r statistics data-analysis mathematics anova

asked Feb 14 '23 at 17:47

mbeasle2

1

0

votes

0 answers

Statistical significance on aggregate data to show that the groups are different?

I am working with performance data for three groups in each region. The denominator for the groups is the number of people who are identified as low performers. For region A, the low performer share is 40% for group 1, 30% for group 2, and 30% for group 3.…

statistics inference anova

asked Feb 12 '23 at 17:59

user728148

21
1
3

0

votes

0 answers

Different runs of One Way ANOVA give different results in Python (on the same data, using stats.f_oneway)

I wrote the following code: The data I am running this code on is randomly generated, but I use fixed seeding. Each time I run the code, this function gives different results for the F-statistic and p-value. What is the reason for this? Is this normal?…

python statistics variance anova pvalue

asked Nov 21 '22 at 00:28

Archil Zhvania

101
1
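The asker's code is not shown in the excerpt, so the cause cannot be read off from it. As a minimal sketch of what a reproducible setup could look like, the block below routes every random draw through one locally seeded generator and computes the one-way ANOVA F-statistic by hand with the standard library. `f_statistic`, `simulate`, and the group means are hypothetical, and this is not a replacement for `stats.f_oneway`. If any draw bypasses the seeded generator, results would be expected to vary between runs; that is an assumption about a common cause, not a diagnosis.

```python
import random
from statistics import fmean


def f_statistic(*groups):
    """One-way ANOVA F-statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def simulate(seed):
    """Generate three hypothetical groups from one seeded generator and test them."""
    rng = random.Random(seed)  # local generator; every draw goes through it
    groups = [[rng.gauss(mu, 1.0) for _ in range(30)] for mu in (0.0, 0.5, 1.0)]
    return f_statistic(*groups)


first = simulate(42)
second = simulate(42)
print(first == second)  # same seed, same data, same F-statistic
```

Keeping the generator local to the function avoids relying on global random state that other code may have advanced or reseeded.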

0

votes

2 answers

What model should I use to predict monthly sales by products?

I am trying to predict monthly sales by product based on a plethora of variables. There are 4 predictors. One is categorical (month) and the other three are numerical. One of the variables is just part sales. The data I am trying to predict is…

time-series predictive-modeling statistics statsmodels anova

asked Nov 09 '22 at 16:14

Lauren

1

0

votes

0 answers

Create a box plot with an ANOVA model and different p-values

I have created a box plot, which looks like this. Code: df = pd.read_csv("file.csv") import plotly.express as px fig = px.box(df, x="Treatment", y="Fresh weight (g)", color='Treatment', facet_col='Crop', title='Fresh Weight', height=750) fig.show() I…

statistics data-analysis anova

asked Sep 14 '22 at 07:19

Jhon Patric

213
1
8

Questions tagged [anova]