Questions tagged [probability-calibration]

48 questions
18
votes
4 answers

XGBoost outputs tend towards the extremes

I am currently using XGBoost for risk prediction. It seems to be doing a good job in the binary classification department, but the probability outputs are way off, i.e., changing the value of a feature in an observation by a very small amount can…
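A common remedy for over-confident boosted-tree outputs is to wrap the model in scikit-learn's CalibratedClassifierCV. A minimal sketch, assuming synthetic placeholder data and isotonic regression as one possible calibrator:

```python
# Sketch: calibrating an XGBoost classifier's probabilities with
# cross-validated isotonic regression. Data here is synthetic; the
# sample sizes and class weights are illustrative assumptions.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# cv=5 refits the base model on 5 folds and fits the isotonic calibrator
# on each fold's held-out predictions.
calibrated = CalibratedClassifierCV(XGBClassifier(), method="isotonic", cv=5)
calibrated.fit(X_train, y_train)
proba = calibrated.predict_proba(X_test)[:, 1]  # calibrated P(y=1)
```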
12
votes
1 answer

Are the raw probabilities obtained from XGBoost representative of the true underlying probabilities?

1) Is it feasible to use the raw probabilities obtained from XGBoost, e.g. probabilities obtained within the range of 0.4-0.5, as a true representation of an approximately 40%-50% chance of an event occurring (assuming we have an accurate model)? 2)…
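Question (1) can be tested empirically with a reliability check: bin held-out predictions and compare each bin's mean prediction against the observed positive rate. A sketch with a placeholder model and synthetic data:

```python
# Sketch: reliability check with sklearn's calibration_curve. The data
# and logistic-regression model are placeholders, not the asker's setup.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

frac_pos, mean_pred = calibration_curve(y_te, proba, n_bins=10)
for p, f in zip(mean_pred, frac_pos):
    print(f"predicted ~{p:.2f} -> observed {f:.2f}")
# For a calibrated model, predictions in the 0.4-0.5 bin should show an
# observed positive rate of roughly 40-50%.
```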
7
votes
1 answer

Are calibrated probabilities always more reliable?

EDIT: Based on the answer below, I have updated the question and added more detail. I have applied Dirichlet calibration to my fast-bert sentiment classification model, and I am struggling to really understand why/if it is actually more reliable.…
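One objective way to answer "is it actually more reliable?" is to compare proper scoring rules (Brier score, log loss) before and after calibration on held-out data; a lower score after calibration is evidence in its favor. A sketch with placeholder probability arrays:

```python
# Sketch: comparing uncalibrated vs. calibrated probabilities with two
# proper scoring rules. The arrays below are illustrative placeholders.
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss

y_true = np.array([0, 1, 1, 0, 1])
p_raw = np.array([0.30, 0.95, 0.60, 0.20, 0.85])  # uncalibrated outputs
p_cal = np.array([0.25, 0.80, 0.65, 0.15, 0.75])  # after calibration

for name, p in [("raw", p_raw), ("calibrated", p_cal)]:
    print(name, brier_score_loss(y_true, p), log_loss(y_true, p))
```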
6
votes
2 answers

Probability Calibration: role of hidden layer in Neural Network

I am trying a simple neural network (logistic regression) to play with Keras. As input I have 5,000 features (the output of a simple tf-idf vectorizer), and in the output layer I just use a random uniform initialization and an $\alpha = 0.0001$ for the $L_{2}$…
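For reference, the two architectures being compared can be written in a few lines of Keras. A sketch under stated assumptions: the L2 penalty of 1e-4 is from the question, while the 64-unit hidden layer size is an illustrative guess:

```python
# Sketch: logistic regression (no hidden layer) vs. one hidden layer in
# Keras, both with the L2 penalty alpha = 1e-4 mentioned above.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

l2 = regularizers.l2(1e-4)

logreg = keras.Sequential([
    keras.Input(shape=(5000,)),
    layers.Dense(1, activation="sigmoid", kernel_regularizer=l2,
                 kernel_initializer="random_uniform"),
])

one_hidden = keras.Sequential([
    keras.Input(shape=(5000,)),
    layers.Dense(64, activation="relu", kernel_regularizer=l2),  # size is a guess
    layers.Dense(1, activation="sigmoid", kernel_regularizer=l2),
])

for m in (logreg, one_hidden):
    m.compile(optimizer="adam", loss="binary_crossentropy")
```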
5
votes
1 answer

XGBoost: how to adjust the probabilities of a binary classifier to match training data?

Training and testing data have around 1% positives, but the model predicts only around 0.1% as positives. The model is an xgboost classifier. I’ve tried calibration but it didn’t improve much. I also don’t want to pick thresholds since the final…
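One hedged approach to this (matching the mean predicted probability to the ~1% base rate without thresholding) is to shift every prediction by a constant in log-odds space, choosing the shift so the mean adjusted probability equals the target prevalence. The helper names below are illustrative, not a library API:

```python
# Sketch: find a log-odds offset that makes the mean predicted
# probability equal the target base rate. Placeholder scores are drawn
# from a skewed Beta distribution to mimic a rare-event model.
import numpy as np
from scipy.optimize import brentq

def shift_probs(p, delta):
    logit = np.log(p / (1 - p)) + delta
    return 1 / (1 + np.exp(-logit))

def match_base_rate(p, target):
    # Solve for the offset whose mean adjusted probability hits target.
    delta = brentq(lambda d: shift_probs(p, d).mean() - target, -20, 20)
    return shift_probs(p, delta)

p_raw = np.clip(np.random.default_rng(0).beta(0.5, 200, size=10000), 1e-6, 1 - 1e-6)
p_adj = match_base_rate(p_raw, target=0.01)
print(p_raw.mean(), p_adj.mean())  # mean moves to ~0.01
```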
5
votes
2 answers

Convert predict_proba results when using class_weight in training

As my dataset is unbalanced (class 1: 5%, class 0: 95%), I have used the class_weight="balanced" parameter to train a random forest classification model. In this way I penalize the misclassification of the rare positive cases. rf =…
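Training with class_weight="balanced" effectively presents the model with a 50/50 prior, so one standard correction is to rescale the predicted odds back to the true 5%/95% prior. A sketch (the correction is exact for probabilistic models under the prior-shift assumption and only approximate for random forests):

```python
# Sketch: undo the effect of balanced class weights on predict_proba by
# rescaling the odds from the trained prior to the true prior.
import numpy as np

def correct_prior(p, trained_prior, true_prior):
    odds = p / (1 - p)
    odds *= (true_prior / (1 - true_prior)) / (trained_prior / (1 - trained_prior))
    return odds / (1 + odds)

p_weighted = np.array([0.50, 0.70, 0.90])  # outputs under balanced weights
p_true = correct_prior(p_weighted, trained_prior=0.5, true_prior=0.05)
print(p_true)  # e.g. 0.50 -> 0.05
```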
3
votes
1 answer

Why does my calibration curve for Platt's and isotonic have fewer points than my uncalibrated model?

I train a model using grid search, then I use the best parameters from it to define my chosen model. model = XGBClassifier() pipeline = make_pipeline(model) kfolds = StratifiedKFold(3) clf = GridSearchCV(pipeline, parameters,…
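The likely mechanism, shown in a sketch below: isotonic regression (and, to a lesser extent, Platt scaling) maps many raw scores onto a small set of values, so some of calibration_curve's bins receive no predictions and are dropped from the plot. The toy data here is synthetic:

```python
# Sketch: isotonic regression collapses raw scores into a step function
# with far fewer distinct values, which can leave calibration bins empty.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
raw = rng.uniform(size=500)                    # raw model scores
y = (rng.uniform(size=500) < raw).astype(int)  # labels consistent with them

iso = IsotonicRegression(out_of_bounds="clip").fit(raw, y)
cal = iso.predict(raw)

print("unique raw scores:      ", len(np.unique(raw)))  # ~500
print("unique isotonic outputs:", len(np.unique(cal)))  # far fewer (steps)
# calibration_curve drops bins that receive no predictions, so the
# step-shaped calibrated output yields fewer plotted points.
```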
3
votes
1 answer

How to determine the correct target for classification probability when the observed samples are probabilities of each class?

I have data in which each event's outcome can be described by a probability of a categorical occurrence. For example, if all of the possible class outcomes are A, B, C, or D, suppose that in one event 7/10 people selected category A, 2/10 selected…
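One standard answer is to use the observed class fractions directly as soft labels and minimize cross-entropy against them; Keras's categorical_crossentropy accepts probability vectors as targets. A sketch with toy data (4 classes A-D, targets like [0.7, 0.2, 0.1, 0.0] from the example above):

```python
# Sketch: training a softmax classifier against soft (probabilistic)
# targets instead of one-hot labels. Features are random placeholders.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.default_rng(0).normal(size=(100, 8)).astype("float32")
y_soft = np.tile([0.7, 0.2, 0.1, 0.0], (100, 1)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, y_soft, epochs=5, verbose=0)
```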
3
votes
1 answer

I have 3 graphs of a binary Logistic Regression model; I want to understand what is happening and find a strategy to improve the model

My problem is the following: I have a binary Logistic Regression model with a very imbalanced dataset that outputs its predictions as percentages. As can be seen in the images, as the threshold is increased there is a certain point at which it stops…
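A common way to read such graphs jointly, rather than one threshold plot at a time, is a single threshold sweep over the predicted probabilities. A sketch with synthetic imbalanced data standing in for the asker's model:

```python
# Sketch: sweep decision thresholds and find where F1 peaks; flat
# regions of the curve mark thresholds that stop changing the outcome.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, size=2000)                 # heavy imbalance
proba = y_true * 0.3 + rng.uniform(size=2000) * 0.7       # placeholder scores

precision, recall, thresholds = precision_recall_curve(y_true, proba)
f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
best = np.argmax(f1[:-1])  # f1[:-1] aligns with the thresholds array
print(f"best F1={f1[best]:.3f} at threshold={thresholds[best]:.3f}")
```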
3
votes
0 answers

How to explain a Calibration Plot for many models?

I have a heavily imbalanced dataset with a classification problem. I am trying to plot the calibration curve from the sklearn.calibration package. Specifically, I try the following models: rft = RandomForestClassifier(n_estimators=1000) svc = SVC(probability…
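Plotting all models' calibration curves on one axes against the diagonal makes them directly comparable. A sketch with synthetic data and smaller forests than the question's (n_estimators reduced purely for speed):

```python
# Sketch: overlay calibration curves for several models on one plot.
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=200),
    "SVC": SVC(probability=True),
}
plt.plot([0, 1], [0, 1], "k--", label="perfectly calibrated")
for name, m in models.items():
    p = m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    frac, mean_pred = calibration_curve(y_te, p, n_bins=10)
    plt.plot(mean_pred, frac, marker="o", label=name)
plt.xlabel("mean predicted probability")
plt.ylabel("fraction of positives")
plt.legend()
plt.show()
```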
3
votes
1 answer

Which loss function (if any) optimizes the calibration graph?

The calibration graph plots predicted versus actual probability (see http://scikit-learn.org/stable/modules/generated/sklearn.calibration.calibration_curve.html). Is it possible to optimize the linearity of that curve in terms of a loss function?…
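The short answer usually given: proper scoring rules such as the Brier score and log loss are minimized in expectation by the true probabilities, so training or model selection on them pushes the calibration curve toward the diagonal. A sketch computing the Brier score by hand and via sklearn:

```python
# Sketch: Brier score = mean squared error between predicted
# probabilities and binary outcomes, a proper scoring rule.
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([0, 1, 1, 0])
p_hat = np.array([0.1, 0.8, 0.6, 0.3])
print(np.mean((p_hat - y_true) ** 2))   # Brier score by hand
print(brier_score_loss(y_true, p_hat))  # same value
```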
2
votes
1 answer

Calibrating probability thresholds for multiclass classification

I have built a network for the classification of three classes. The network consists of a CNN followed by two fully connected layers. The CNN consists of convolutional layers, followed by batch normalization, a ReLU activation, max pooling, and drop…
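For multiclass networks, a widely used single-parameter calibration is temperature scaling: divide the logits by a temperature T fitted on a validation set by minimizing the negative log-likelihood. A sketch with placeholder logits and labels (the asker's network is not reproduced here):

```python
# Sketch: temperature scaling for a 3-class softmax network.
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(T, logits, labels):
    p = softmax(logits / T)
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

rng = np.random.default_rng(0)
val_logits = rng.normal(scale=3.0, size=(500, 3))  # placeholder logits
val_labels = rng.integers(0, 3, size=500)          # placeholder labels

res = minimize_scalar(nll, bounds=(0.05, 10.0),
                      args=(val_logits, val_labels), method="bounded")
T = res.x
calibrated = softmax(val_logits / T)  # calibrated class probabilities
print("fitted temperature:", T)
```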
2
votes
1 answer

How can I tell if my model is overfitting from the distribution of predicted probabilities?

All, I am training a light gradient boosting model and have used all of the necessary parameters to help with overfitting. I plot the distribution of predicted probabilities (i.e., the probability of having cancer) from the model (after calibrating using calibrated…
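The usual diagnostic is to compare the predicted-probability distributions on training vs. held-out data: a train histogram piled at 0 and 1 while the test histogram is spread out is a symptom of overfitting. A sketch using sklearn's HistGradientBoostingClassifier as a stand-in for LightGBM:

```python
# Sketch: overlaid histograms of predicted probabilities on train vs.
# test sets. Model and data are stand-ins for the asker's setup.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = HistGradientBoostingClassifier().fit(X_tr, y_tr)

plt.hist(model.predict_proba(X_tr)[:, 1], bins=30, alpha=0.5,
         density=True, label="train")
plt.hist(model.predict_proba(X_te)[:, 1], bins=30, alpha=0.5,
         density=True, label="test")
plt.xlabel("predicted probability")
plt.legend()
plt.show()
```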
2
votes
0 answers

Imbalanced text classification with oversampling: correcting predicted class probabilities by the prior probability

My dataset has 3 classes and 900 training examples; the class distribution is 255, 185, and 460. I found that if I randomly oversample the training data, then I have to correct/calibrate the predicted probabilities on the test data, because after…
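The standard correction rescales each predicted class probability by the ratio of the original prior to the oversampled training prior and then renormalizes; this is the multiclass analogue of the binary odds correction. A sketch using the class counts from the question and balanced oversampling as an assumed training prior:

```python
# Sketch: prior correction of predicted probabilities after oversampling.
import numpy as np

orig_prior = np.array([255, 185, 460]) / 900.0  # priors before oversampling
train_prior = np.array([1/3, 1/3, 1/3])         # assumed balanced after oversampling

def correct(p_pred):
    w = p_pred * (orig_prior / train_prior)
    return w / w.sum(axis=-1, keepdims=True)    # renormalize to sum to 1

print(correct(np.array([0.4, 0.35, 0.25])))
```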
2
votes
0 answers

Platt Scaling vs Isotonic Regression for reliability curve

I am learning classifier probability calibration and have calibrated an elastic net model using both Platt scaling and isotonic regression. As you can see in the attached image, Platt scaling (on the bottom) better approximates the diagonal line…
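Fitting both calibrators on the same base model makes the two reliability curves directly comparable; the rule of thumb in the sklearn docs is sigmoid (Platt) when data is scarce and isotonic only when there are enough samples to avoid overfitting the step function. A sketch with synthetic data and an elastic-net logistic regression as the base model:

```python
# Sketch: Platt (sigmoid) vs. isotonic calibration on one base model.
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Elastic-net penalty via the saga solver; l1_ratio is illustrative.
base = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, max_iter=5000)
for method in ("sigmoid", "isotonic"):
    cal = CalibratedClassifierCV(base, method=method, cv=5).fit(X_tr, y_tr)
    frac, mean_pred = calibration_curve(y_te, cal.predict_proba(X_te)[:, 1],
                                        n_bins=10)
    print(method, list(zip(mean_pred.round(2), frac.round(2))))
```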