Questions tagged [goodness-of-fit]
7 questions
3
votes
1 answer
SAS Studio seems to imply that apparently non-normal data is normal
I have some data I'm trying to analyze in SAS Studio (university edition). I am using the Distribution Analysis feature to try to test some data for normality.
It gives me the following histogram:
Skewness is approximately 2.934 and Kurtosis is…
EJoshuaS - Stand with Ukraine
2
votes
0 answers
Multi-dimensional Euclidean R^2 - reasonable?
I have a high-dimensional space, say $\mathbb{R}^{1000}$, and I have samples $y_1, \ldots , y_n \in \mathbb{R}^{1000}$ and $\hat{y}_1, \ldots , \hat{y}_n \in \mathbb{R}^{1000}$. Would $$ R^2 = 1 - \frac{\sum_i || y_i - \hat{y}_i||^2}{\sum_i || y_i -…
AspiringToAspire
2
votes
1 answer
Why Should There Be Multiple Columns in Train Labels for One Model?
I am going through a notebook on the well-known Kaggle competition for Favorita sales forecasting.
One puzzle is that, after the data is split into training and test sets, y_train seems to have two columns containing unit_sales and transactions, both of which are being…
Della
1
vote
1 answer
Does statsmodels compute R2 and other metrics on a validation/test set?
Does statsmodels compute R2 and other metrics on a validation set?
I am using OLS from statsmodels.api.
When printing the summary, an R2 and an adjusted R2 are presented.
I did not trust those 0.88 values and computed my own adjusted R2 with scikit-learn…
Alexander Vocaet
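As a rough illustration of the distinction this question raises: the R2 reported in an OLS summary is computed on the data the model was fitted to, so a held-out score has to be computed separately. The sketch below uses plain Python rather than statsmodels or scikit-learn, and everything in it is an assumption made for illustration: the synthetic data, the 80/20 split, and the helper names `fit_simple_ols`, `r_squared`, and `adjusted_r_squared`.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R2 for n observations and p predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Synthetic data (hypothetical): y = 2 + 3x + Gaussian noise.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 3.0 * xi + rng.gauss(0, 2.0) for xi in x]

# 80/20 split; the model only ever sees the training part.
x_train, x_test = x[:160], x[160:]
y_train, y_test = y[:160], y[160:]

intercept, slope = fit_simple_ols(x_train, y_train)

# In-sample R2: the kind of number a fit summary reports.
r2_train = r_squared(y_train, [intercept + slope * xi for xi in x_train])

# Out-of-sample R2: same formula, applied to predictions on held-out data.
r2_test = r_squared(y_test, [intercept + slope * xi for xi in x_test])

print("train R2:", r2_train)
print("train adjusted R2:", adjusted_r_squared(r2_train, len(x_train), 1))
print("test R2:", r2_test)
```

The two scores generally differ, and the held-out R2 can even be negative for a poorly generalizing model, so a summary value and a separately computed test-set value need not match.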
0
votes
0 answers
Scipy kstest problem
I am fitting mixture models to data and assessing how well mixtures with more or fewer components fit the data. To do this, I am going to plot the CDF of the empirical data and the CDF of my mixture model with k components. As an example, here is a…
0
votes
0 answers
Interpreting chi-square statistic values and also scipy.stats.chisquare giving unreasonable value
I have some daily-returns data from the S&P 500. I'm not sure if I can show my graph, as I will be using it in my undergraduate paper, but it looks essentially the same as the histogram here:
Just like in the figure above, I am attempting to fit…
0
votes
1 answer
Goodness on test or train set?
I previously split my data set into train (80%) and test (20%) sets and trained a logistic regression model on the train set. Now I want to check the goodness of fit using the chi-square likelihood omnibus test. Which data set should I apply it to: test or…
Bambeil