Questions tagged [statsmodels]

54 questions
7
votes
2 answers

Why do scikit-learn and statsmodels provide different coefficients of determination?

First of all, I know there is a similar question; however, I didn't find it very helpful. My issue concerns simple linear regression and the resulting R-squared. I found that results can be quite different if I use statsmodels and…
3
votes
0 answers

How to retrieve results summary from statsmodels GLM with regularization?

I'm trying to fit a GLM to predict continuous variables between 0 and 1 with statsmodels. Because I have more features than data, I need to regularize. statsmodels has very few examples, so I'm not sure if I'm doing this correctly. import…
3
votes
1 answer

Advice on imputing temperature data with StatsModels MICE

This may be a dumb question but I can't figure out how to actually get the values imputed using StatsModels MICE back into my data. I have a dataframe (dfLocal) with hourly temperature records for five neighboring stations (LOC1:LOC5) over many…
3
votes
5 answers

Predicting deterioration of equipment on a production line

Background: There is a production line where a machine interacts with some tools. The process goes as follows. The machine makes a product. The product is moved into the tool [the tool here is essentially a box, the condition of which deteriorates…
2
votes
0 answers

statsmodels.tsa.holtwinters.ExponentialSmoothing: what do “additive”/“multiplicative” trend and seasonality actually mean?

There are additional concepts of additivity and multiplicativity for trend (trend{“add”, “mul”, “additive”, “multiplicative”, None}) and seasonality (seasonal{“add”, “mul”, “additive”, “multiplicative”, None}) in the Statsmodels implementation [1,…
2
votes
0 answers

How to interpret my logistic regression result?

I'm having a hard time interpreting the result of my logistic regression. I have a few questions. Firstly, how can I check whether a feature is more important than the others, i.e. that it has real significance? I have an accuracy of 0.58 which is…
2
votes
0 answers

Classification and clustering of Time series data of temperature

I have recorded time series data of temperature. This is what my data looks like: The change in the data represents a specific event or class, which I would like to detect when new data comes in. Because I cannot label the data manually I cannot do…
2
votes
0 answers

Does statsmodels fully support MultiIndex?

The below code snippet shows how statsmodels seems to flatten MultiIndex tuples by joining them with an underscore "_". import numpy as np import pandas as pd from statsmodels.regression.linear_model import OLS K = 2 N = 10 ERROR_VOL =…
OldSchool
2
votes
2 answers

High error Arima model - Python

I have time series data with daily frequency. I want to forecast the data for the next week or month with an ARIMA model. This is a chart of my time series data: First I use the method seasonal_decompose from statsmodels to check the…
J.C Guzman
2
votes
0 answers

Some of the p-values are NaN - logistic regression

I am trying to do logistic regression, but I have this issue: some of the p-values are NaN. model = sm.Logit(y2, X2.astype(float)) result = model.fit() result.summary() Any ideas what to do?
Sim_Demo
2
votes
0 answers

Using an unmerged branch from StatsModels for Anaconda

I am interested in using a statistical test from StatsModels, one that is not available in the default Anaconda StatsModels folder on my drive. See this blog, which at some point mentions, almost casually: "We can estimate a Two-Step Heckman Model…
2
votes
1 answer

ARIMA predictions look shifted by one unit of time

I am using statsmodels ARIMA (1,2,1) to predict the monthly demand for a product. The predictions look like they are shifted to the right by one month. I wonder if the statsmodels.ARIMA.Residuals.predict returns something different than the monthly…
Gloksinya
2
votes
2 answers

Does t-test require Standard Deviation of sample for calculation

This might be a novice question, but the main difference between a t-test and a z-test, as far as I was able to understand, is that the z-test calculation requires the SD of the population, whereas in a t-test we work with the SD of the sample itself when…
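To make the distinction in this excerpt concrete, here is a minimal sketch of the two one-sample test statistics. It uses only the Python standard library; the function names and the sample values are hypothetical and are not taken from the question. The only difference between the two is which standard deviation goes into the standard error.

```python
import math
import statistics


def one_sample_t_stat(sample, mu0):
    """t statistic: the standard error uses the SD estimated from the sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    return (mean - mu0) / (s / math.sqrt(n))


def one_sample_z_stat(sample, mu0, sigma):
    """z statistic: the standard error uses a known population SD (sigma)."""
    n = len(sample)
    mean = statistics.mean(sample)
    return (mean - mu0) / (sigma / math.sqrt(n))


# Hypothetical data: mean is 5.0; sigma = 2.0 is assumed to be known.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

t_stat = one_sample_t_stat(sample, mu0=4.0)
z_stat = one_sample_z_stat(sample, mu0=4.0, sigma=2.0)

print(f"t = {t_stat:.4f}")  # t = 1.3229
print(f"z = {z_stat:.4f}")  # z = 1.4142
```

The t statistic would then be compared against a t distribution with n - 1 degrees of freedom, and the z statistic against the standard normal.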
2
votes
1 answer

How do I use number of hours as index in timeseries forecasting?

I have a dataset that has the number of hours (a consecutive value) and the total sales in that hour. See below for the head of the dataset:

t     sales
--------------
23    172.3676
24    176.3456
25    166.9039
26    153.9990
27    167.9585

I want to…
1
vote
2 answers

Which features are important in determining fluency of participants measured by weighted composite score?

I'm starting to learn about data science. I have sample data including different features measured over several people, n times for each person. These features represent the fluency of the participants in some languages. They include the number of…
Lili