Questions tagged [probability-calibration]

48 questions
18
votes
4 answers

XGBoost outputs tend towards the extremes

I am currently using XGBoost for risk prediction. It seems to be doing a good job in the binary classification department, but the probability outputs are way off, i.e., changing the value of a feature in an observation by a very small amount can…
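A common remedy for over-confident boosted-tree outputs is to wrap the model in scikit-learn's CalibratedClassifierCV. A minimal sketch, assuming synthetic placeholder data and isotonic regression as one possible calibrator:

```python
# Sketch: calibrating an XGBoost classifier's probabilities with
# cross-validated isotonic regression. Data here is synthetic; the
# sample sizes and class weights are illustrative assumptions.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# cv=5 refits the base model on 5 folds and fits the isotonic calibrator
# on each fold's held-out predictions.
calibrated = CalibratedClassifierCV(XGBClassifier(), method="isotonic", cv=5)
calibrated.fit(X_train, y_train)
proba = calibrated.predict_proba(X_test)[:, 1]  # calibrated P(y=1)
```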
12
votes
1 answer

Are the raw probabilities obtained from XGBoost representative of the true underlying probabilities?

1) Is it feasible to use the raw probabilities obtained from XGBoost, e.g. probabilities obtained within the range of 0.4-0.5, as a true representation of an approximately 40%-50% chance of an event occurring (assuming we have an accurate model)? 2)…
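Question (1) can be tested empirically with a reliability check: bin held-out predictions and compare each bin's mean prediction against the observed positive rate. A sketch with a placeholder model and synthetic data:

```python
# Sketch: reliability check with sklearn's calibration_curve. The data
# and logistic-regression model are placeholders, not the asker's setup.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

frac_pos, mean_pred = calibration_curve(y_te, proba, n_bins=10)
for p, f in zip(mean_pred, frac_pos):
    print(f"predicted ~{p:.2f} -> observed {f:.2f}")
# For a calibrated model, predictions in the 0.4-0.5 bin should show an
# observed positive rate of roughly 40-50%.
```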
7
votes
1 answer

Are calibrated probabilities always more reliable?

EDIT: Based on the answer below, I have updated the question and added more detail. I have applied Dirichlet calibration to my fast-bert sentiment classification model, and I am struggling to really understand why/if it is actually more reliable.…
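One objective way to answer "is it actually more reliable?" is to compare proper scoring rules (Brier score, log loss) before and after calibration on held-out data; a lower score after calibration is evidence in its favor. A sketch with placeholder probability arrays:

```python
# Sketch: comparing uncalibrated vs. calibrated probabilities with two
# proper scoring rules. The arrays below are illustrative placeholders.
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss

y_true = np.array([0, 1, 1, 0, 1])
p_raw = np.array([0.30, 0.95, 0.60, 0.20, 0.85])  # uncalibrated outputs
p_cal = np.array([0.25, 0.80, 0.65, 0.15, 0.75])  # after calibration

for name, p in [("raw", p_raw), ("calibrated", p_cal)]:
    print(name, brier_score_loss(y_true, p), log_loss(y_true, p))
```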
6
votes
2 answers

Probability Calibration: role of hidden layer in Neural Network

I am trying a simple neural network (logistic regression) to play with Keras. As input I have 5,000 features (the output of a simple tf-idf vectorizer), and in the output layer I just use a random uniform initialization and an $\alpha = 0.0001$ for the $L_{2}$…
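For reference, the two architectures being compared can be written in a few lines of Keras. A sketch under stated assumptions: the L2 penalty of 1e-4 is from the question, while the 64-unit hidden layer size is an illustrative guess:

```python
# Sketch: logistic regression (no hidden layer) vs. one hidden layer in
# Keras, both with the L2 penalty alpha = 1e-4 mentioned above.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

l2 = regularizers.l2(1e-4)

logreg = keras.Sequential([
    keras.Input(shape=(5000,)),
    layers.Dense(1, activation="sigmoid", kernel_regularizer=l2,
                 kernel_initializer="random_uniform"),
])

one_hidden = keras.Sequential([
    keras.Input(shape=(5000,)),
    layers.Dense(64, activation="relu", kernel_regularizer=l2),  # size is a guess
    layers.Dense(1, activation="sigmoid", kernel_regularizer=l2),
])

for m in (logreg, one_hidden):
    m.compile(optimizer="adam", loss="binary_crossentropy")
```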
5
votes
1 answer

XGBoost: how to adjust the probabilities of a binary classifier to match training data?

Training and testing data have around 1% positives, but the model predicts only around 0.1% as positives. The model is an xgboost classifier. I’ve tried calibration but it didn’t improve much. I also don’t want to pick thresholds since the final…
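One hedged approach to this (matching the mean predicted probability to the ~1% base rate without thresholding) is to shift every prediction by a constant in log-odds space, choosing the shift so the mean adjusted probability equals the target prevalence. The helper names below are illustrative, not a library API:

```python
# Sketch: find a log-odds offset that makes the mean predicted
# probability equal the target base rate. Placeholder scores are drawn
# from a skewed Beta distribution to mimic a rare-event model.
import numpy as np
from scipy.optimize import brentq

def shift_probs(p, delta):
    logit = np.log(p / (1 - p)) + delta
    return 1 / (1 + np.exp(-logit))

def match_base_rate(p, target):
    # Solve for the offset whose mean adjusted probability hits target.
    delta = brentq(lambda d: shift_probs(p, d).mean() - target, -20, 20)
    return shift_probs(p, delta)

p_raw = np.clip(np.random.default_rng(0).beta(0.5, 200, size=10000), 1e-6, 1 - 1e-6)
p_adj = match_base_rate(p_raw, target=0.01)
print(p_raw.mean(), p_adj.mean())  # mean moves to ~0.01
```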
5
votes
2 answers

Convert predict_proba results when using class_weight in training

As my dataset is unbalanced (class 1: 5%, class 0: 95%), I have used the class_weight="balanced" parameter to train a random forest classification model. In this way I penalize the misclassification of the rare positive cases. rf =…
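Training with class_weight="balanced" effectively presents the model with a 50/50 prior, so one standard correction is to rescale the predicted odds back to the true 5%/95% prior. A sketch (the correction is exact for probabilistic models under the prior-shift assumption and only approximate for random forests):

```python
# Sketch: undo the effect of balanced class weights on predict_proba by
# rescaling the odds from the trained prior to the true prior.
import numpy as np

def correct_prior(p, trained_prior, true_prior):
    odds = p / (1 - p)
    odds *= (true_prior / (1 - true_prior)) / (trained_prior / (1 - trained_prior))
    return odds / (1 + odds)

p_weighted = np.array([0.50, 0.70, 0.90])  # outputs under balanced weights
p_true = correct_prior(p_weighted, trained_prior=0.5, true_prior=0.05)
print(p_true)  # e.g. 0.50 -> 0.05
```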
3
votes
1 answer

Why does my calibration curve for Platt's and isotonic have fewer points than my uncalibrated model?

I train a model using grid search, then I use the best parameters from it to define my chosen model. model = XGBClassifier() pipeline = make_pipeline(model) kfolds = StratifiedKFold(3) clf = GridSearchCV(pipeline, parameters,…
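The likely mechanism, shown in a sketch below: isotonic regression (and, to a lesser extent, Platt scaling) maps many raw scores onto a small set of values, so some of calibration_curve's bins receive no predictions and are dropped from the plot. The toy data here is synthetic:

```python
# Sketch: isotonic regression collapses raw scores into a step function
# with far fewer distinct values, which can leave calibration bins empty.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
raw = rng.uniform(size=500)                    # raw model scores
y = (rng.uniform(size=500) < raw).astype(int)  # labels consistent with them

iso = IsotonicRegression(out_of_bounds="clip").fit(raw, y)
cal = iso.predict(raw)

print("unique raw scores:      ", len(np.unique(raw)))  # ~500
print("unique isotonic outputs:", len(np.unique(cal)))  # far fewer (steps)
# calibration_curve drops bins that receive no predictions, so the
# step-shaped calibrated output yields fewer plotted points.
```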
3
votes
1 answer

How to determine the correct target for classification probability when the observed samples are probabilities of each class?

I have data in which each event's outcome can be described by a probability of a categorical occurrence. For example, if all of the possible class outcomes are A, B, C, or D, suppose that in one event 7/10 people selected category A, 2/10 selected…
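One standard answer is to use the observed class fractions directly as soft labels and minimize cross-entropy against them; Keras's categorical_crossentropy accepts probability vectors as targets. A sketch with toy data (4 classes A-D, targets like [0.7, 0.2, 0.1, 0.0] from the example above):

```python
# Sketch: training a softmax classifier against soft (probabilistic)
# targets instead of one-hot labels. Features are random placeholders.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.default_rng(0).normal(size=(100, 8)).astype("float32")
y_soft = np.tile([0.7, 0.2, 0.1, 0.0], (100, 1)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, y_soft, epochs=5, verbose=0)
```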
3
votes
1 answer

I have 3 graphs of a binary Logistic Regression model; I want to understand what is happening and find a strategy to improve the model

My problem is the following: I have a binary Logistic Regression model with a very imbalanced dataset that outputs its predictions as percentages. As can be seen in the images, as the threshold is increased there is a certain point at which it stops…
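A common way to read such graphs jointly, rather than one threshold plot at a time, is a single threshold sweep over the predicted probabilities. A sketch with synthetic imbalanced data standing in for the asker's model:

```python
# Sketch: sweep decision thresholds and find where F1 peaks; flat
# regions of the curve mark thresholds that stop changing the outcome.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, size=2000)                 # heavy imbalance
proba = y_true * 0.3 + rng.uniform(size=2000) * 0.7       # placeholder scores

precision, recall, thresholds = precision_recall_curve(y_true, proba)
f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
best = np.argmax(f1[:-1])  # f1[:-1] aligns with the thresholds array
print(f"best F1={f1[best]:.3f} at threshold={thresholds[best]:.3f}")
```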
3
votes
0 answers

How to explain a Calibration Plot for many models?

I have a heavily imbalanced dataset with a classification problem. I am trying to plot the calibration curve from the sklearn.calibration package. Specifically, I try the following models: rft = RandomForestClassifier(n_estimators=1000) svc = SVC(probability…
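Plotting all models' calibration curves on one axes against the diagonal makes them directly comparable. A sketch with synthetic data and smaller forests than the question's (n_estimators reduced purely for speed):

```python
# Sketch: overlay calibration curves for several models on one plot.
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=200),
    "SVC": SVC(probability=True),
}
plt.plot([0, 1], [0, 1], "k--", label="perfectly calibrated")
for name, m in models.items():
    p = m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    frac, mean_pred = calibration_curve(y_te, p, n_bins=10)
    plt.plot(mean_pred, frac, marker="o", label=name)
plt.xlabel("mean predicted probability")
plt.ylabel("fraction of positives")
plt.legend()
plt.show()
```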
3
votes
1 answer

Which loss function (if any) optimizes the calibration graph?

The calibration graph plots predicted versus actual probability (see http://scikit-learn.org/stable/modules/generated/sklearn.calibration.calibration_curve.html). Is it possible to optimize the linearity of that curve in terms of a loss function?…
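The short answer usually given: proper scoring rules such as the Brier score and log loss are minimized in expectation by the true probabilities, so training or model selection on them pushes the calibration curve toward the diagonal. A sketch computing the Brier score by hand and via sklearn:

```python
# Sketch: Brier score = mean squared error between predicted
# probabilities and binary outcomes, a proper scoring rule.
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([0, 1, 1, 0])
p_hat = np.array([0.1, 0.8, 0.6, 0.3])
print(np.mean((p_hat - y_true) ** 2))   # Brier score by hand
print(brier_score_loss(y_true, p_hat))  # same value
```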
2
votes
1 answer

Calibrating probability thresholds for multiclass classification

I have built a network for the classification of three classes. The network consists of a CNN followed by two fully connected layers. The CNN consists of convolutional layers, followed by batch normalization, a ReLU activation, max pooling, and drop…
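For multiclass networks, a widely used single-parameter calibration is temperature scaling: divide the logits by a temperature T fitted on a validation set by minimizing the negative log-likelihood. A sketch with placeholder logits and labels (the asker's network is not reproduced here):

```python
# Sketch: temperature scaling for a 3-class softmax network.
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(T, logits, labels):
    p = softmax(logits / T)
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

rng = np.random.default_rng(0)
val_logits = rng.normal(scale=3.0, size=(500, 3))  # placeholder logits
val_labels = rng.integers(0, 3, size=500)          # placeholder labels

res = minimize_scalar(nll, bounds=(0.05, 10.0),
                      args=(val_logits, val_labels), method="bounded")
T = res.x
calibrated = softmax(val_logits / T)  # calibrated class probabilities
print("fitted temperature:", T)
```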
2
votes
1 answer

How can I tell if my model is overfitting from the distribution of predicted probabilities?

All, I am training a light gradient boosting model and have used all of the necessary parameters to help with overfitting. I plot the distribution of predicted probabilities (i.e., the probability of having cancer) from the model (after calibrating using calibrated…
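The usual diagnostic is to compare the predicted-probability distributions on training vs. held-out data: a train histogram piled at 0 and 1 while the test histogram is spread out is a symptom of overfitting. A sketch using sklearn's HistGradientBoostingClassifier as a stand-in for LightGBM:

```python
# Sketch: overlaid histograms of predicted probabilities on train vs.
# test sets. Model and data are stand-ins for the asker's setup.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = HistGradientBoostingClassifier().fit(X_tr, y_tr)

plt.hist(model.predict_proba(X_tr)[:, 1], bins=30, alpha=0.5,
         density=True, label="train")
plt.hist(model.predict_proba(X_te)[:, 1], bins=30, alpha=0.5,
         density=True, label="test")
plt.xlabel("predicted probability")
plt.legend()
plt.show()
```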
2
votes
0 answers

Imbalanced text classification with oversampling: correcting predicted class probabilities by the prior probability

My dataset has 3 classes and 900 training examples; the class distribution is 255, 185, and 460. I found that if I randomly oversample the training data, then I have to correct/calibrate the predicted probabilities on the test data, because after…
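The standard correction rescales each predicted class probability by the ratio of the original prior to the oversampled training prior and then renormalizes; this is the multiclass analogue of the binary odds correction. A sketch using the class counts from the question and balanced oversampling as an assumed training prior:

```python
# Sketch: prior correction of predicted probabilities after oversampling.
import numpy as np

orig_prior = np.array([255, 185, 460]) / 900.0  # priors before oversampling
train_prior = np.array([1/3, 1/3, 1/3])         # assumed balanced after oversampling

def correct(p_pred):
    w = p_pred * (orig_prior / train_prior)
    return w / w.sum(axis=-1, keepdims=True)    # renormalize to sum to 1

print(correct(np.array([0.4, 0.35, 0.25])))
```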
2
votes
0 answers

Platt Scaling vs Isotonic Regression for reliability curve

I am learning classifier probability calibration and have calibrated an elastic net model using both Platt scaling and isotonic regression. As you can see in the attached image, Platt scaling (on the bottom) better approximates the diagonal line…
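Fitting both calibrators on the same base model makes the two reliability curves directly comparable; the rule of thumb in the sklearn docs is sigmoid (Platt) when data is scarce and isotonic only when there are enough samples to avoid overfitting the step function. A sketch with synthetic data and an elastic-net logistic regression as the base model:

```python
# Sketch: Platt (sigmoid) vs. isotonic calibration on one base model.
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Elastic-net penalty via the saga solver; l1_ratio is illustrative.
base = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, max_iter=5000)
for method in ("sigmoid", "isotonic"):
    cal = CalibratedClassifierCV(base, method=method, cv=5).fit(X_tr, y_tr)
    frac, mean_pred = calibration_curve(y_te, cal.predict_proba(X_te)[:, 1],
                                        n_bins=10)
    print(method, list(zip(mean_pred.round(2), frac.round(2))))
```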