Questions tagged [linear-regression]

Techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.

"Regression" is a general term for a wide variety of techniques to analyze the relationship between one (or more) dependent variables and independent variables. Typically the dependent variables are modeled with probability distributions whose parameters are assumed to vary (deterministically) with the independent variables.

Ordinary least squares (OLS) regression affords a simple example in which the expectation of one dependent variable is assumed to depend linearly on the independent variables. The unknown coefficients in the assumed linear function are estimated by choosing values for them that minimize the sum of squared differences between the values of the dependent variable and the corresponding fitted values.
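As a rough illustration of the least-squares estimate described above, the following sketch fits an intercept and slope with NumPy. The helper names `fit_ols` and `predict` and the toy data are assumptions made for this example; they are not part of the tag description.

```python
import numpy as np

def fit_ols(X, y):
    """Return coefficients (intercept first) minimizing the sum of squared residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient acts as the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq solves min ||design @ coef - y||^2 in a numerically stable way.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Compute fitted values for new rows of X using the estimated coefficients."""
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef

# Hypothetical toy data generated from y = 1 + 2x with no noise.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
coef = fit_ols(X, y)
```

Because the toy data lie exactly on a line, the estimated coefficients should recover the intercept and slope used to generate them, up to floating-point error.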

760 questions
114
votes
5 answers

Why do cost functions use the square error?

I'm just getting started with some machine learning, and until now I have been dealing with linear regression over one variable. I have learnt that there is a hypothesis, which is: $h_\theta(x)=\theta_0+\theta_1x$ To find out good values for the…
Golo Roden
46
votes
5 answers

How to force weights to be non-negative in Linear regression

I am using standard linear regression from scikit-learn in Python. However, I would like to force the weights to be non-negative for every feature. Is there any way I can accomplish that? I was looking in the documentation but could not find…
user
31
votes
3 answers

How can I check the correlation between features and target variable?

I am trying to build a Regression model and I am looking for a way to check whether there's any correlation between features and target variables? This is my sample dataset Loan_ID Gender Married Dependents Education Self_Employed…
Jeeth
26
votes
5 answers

Python library for segmented regression (a.k.a. piecewise regression)

I am looking for a Python library that can perform segmented regression (a.k.a. piecewise regression). Example:
16
votes
2 answers

When to choose linear regression or Decision Tree or Random Forest regression?

I am working on a project and I am having difficulty in deciding which algorithm to choose for regression. I want to know under what conditions should one choose a linear regression or Decision Tree regression or Random Forest regression? Are there…
16
votes
3 answers

Overfitting in Linear Regression

I'm just getting started with machine learning and I have trouble understanding how overfitting can happen in a linear regression model. Considering we use only 2 feature variables to train a model, how can a flat plane possibly be overfitted to a…
16
votes
4 answers

What does "linear in parameters" mean?

The model of linear regression is linear in parameters. What does this actually mean?
Albert Gao
13
votes
1 answer

How to do stepwise regression using sklearn?

I could not find a way to do stepwise regression in scikit-learn. I have checked all other posts on Stack Exchange on this topic. The answers to all of them suggest using f_regression. But f_regression does not do stepwise regression; it only gives F-score…
13
votes
2 answers

Why use L1 regularization over L2?

Conducting a linear regression model using a loss function, why should I use $L_1$ instead of $L_2$ regularization? Is it better at preventing overfitting? Is it deterministic (so always a unique solution)? Is it better at feature selection (because…
astudentofmaths
12
votes
1 answer

XGBoost Linear Regression output incorrect

I am a newbie to XGBoost so pardon my ignorance. Here is the python code : import pandas as pd import xgboost as xgb df = pd.DataFrame({'x':[1,2,3], 'y':[10,20,30]}) X_train = df.drop('y',axis=1) Y_train = df['y'] T_train_xgb =…
simplfuzz
11
votes
3 answers

Can GPS coordinates (latitude and longitude) be used as features in a linear model?

I have data sets that contain, among many features, GPS coordinates (latitude and longitude). I'd like to use these data sets to explore problems such as: (1) computing ETA to drive between start and end points; and (2) estimating the amount of…
10
votes
3 answers

Is there a library that would perform segmented linear regression in python?

There is a package named segmented in R. Is there a similar package in python?
vikasreddy
10
votes
7 answers

Multi-country model or single model

I am working on a ML model to be deployed in a product operating in many countries. The issue that I am having is the following: should I train one model and serve it for all countries? train a model per country and serve each model in its…
10
votes
2 answers

Is it valid to shuffle time-series data for a prediction task?

I have a time-series dataset that records some participants' daily features from wearable sensors and their daily mood status. The goal is to use one day's daily features and predict the next day's mood status for participants with machine learning…
Han
10
votes
2 answers

What is the Time Complexity of Linear Regression?

I am working with linear regression and I would like to know its time complexity in big-O notation. The cost function of linear regression without an optimisation algorithm (such as gradient descent) needs to be computed over iterations of the…