6

When using the Python / sklearn API of XGBoost, are the probabilities obtained via the predict_proba method "real probabilities", or do I have to use binary:logitraw and manually apply the sigmoid function?

I wanted to experiment with different cutoff points. Currently, using binary:logistic via sklearn's XGBClassifier, the probabilities returned by predict_proba cluster around the two classes rather than forming a continuous distribution in which moving the cutoff point would change the final scoring.

Is this the right way to obtain probabilities for experimenting with the cutoff value?
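For reference, here is a minimal sketch of the setup I mean (the toy dataset and variable names are placeholders, not my real data):

```python
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Toy data standing in for the real problem
X, y = make_classification(n_samples=1000, random_state=0)

# binary:logistic is the default objective for XGBClassifier
clf = XGBClassifier(objective="binary:logistic")
clf.fit(X, y)

# Column 1 is the predicted probability of the positive class
proba = clf.predict_proba(X)[:, 1]

# Experimenting with a non-default cutoff instead of the implicit 0.5
cutoff = 0.3
preds = (proba >= cutoff).astype(int)
```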

[image: distribution of the predicted probabilities]

Georg Heiler

2 Answers

3

Curious, Georg, whether you ran across this article in your pursuit of generating probabilities. It is worth noting that binary:logistic and multi:softprob return the predicted probability of each data point belonging to each class.

You can look here to see how predict_proba is implemented: XGBoost Predict_Proba Code
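A quick way to convince yourself that predict_proba already applies the sigmoid (a sketch; it assumes a fitted XGBClassifier clf and feature matrix X as in the question, and uses Booster.predict with output_margin=True to obtain the raw logits):

```python
import numpy as np
import xgboost as xgb

# Raw, untransformed margins (the binary:logitraw-style scores)
booster = clf.get_booster()
margin = booster.predict(xgb.DMatrix(X), output_margin=True)

# Manually applying the sigmoid ...
manual = 1.0 / (1.0 + np.exp(-margin))

# ... matches what predict_proba returns for the positive class
proba = clf.predict_proba(X)[:, 1]
print(np.allclose(manual, proba))  # expected: True
```

So the values from predict_proba are already on the probability scale; there is no need to switch to binary:logitraw and transform them yourself.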

  • Still, the output does not look like probabilities: https://github.com/dmlc/xgboost/issues/1763. In the JVM version it seems to work better. – Georg Heiler Jan 09 '17 at 06:39
0

The LightGBM forum provided the answer ;) https://github.com/Microsoft/LightGBM/issues/272#issuecomment-276168493

  • Apparently, my model fits the data very well, i.e. it is very confident about the class probabilities.
  • I did not expect such clear-cut boundaries and was therefore confused about whether this could be correct; a histogram of the predicted probabilities (see the sketch below) makes the effect visible.
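A sketch of that check (assuming a fitted classifier clf and data X as in the question):

```python
import matplotlib.pyplot as plt

proba = clf.predict_proba(X)[:, 1]

# A very confident model piles most of the mass near 0 and 1,
# which looks like "two classes" rather than a smooth continuum.
plt.hist(proba, bins=50)
plt.xlabel("predicted P(y = 1)")
plt.ylabel("count")
plt.show()
```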
Georg Heiler
  • Can you please summarize the main points from that article? If the link stops working, this answer will become useless. Also, we don't want to be just a link farm pointing to other places. (See also http://datascience.stackexchange.com/help/deleted-answers.) Thanks! – D.W. Jan 30 '17 at 23:09