I know that we iteratively model the residuals in a gradient boosted regression problem. The intuition is very well explained on Kaggle.
Can someone explain what the residuals are that get modeled in a classification scenario?
It's a similar trick to the one used in logistic regression: the trees model an unbounded value (the log-odds) that can be mapped to a probability with the sigmoid function, and this mapping only happens at the very end of the gradient boosted tree model. The residuals each new tree fits are the negative gradients of the log loss with respect to that unbounded score, which work out to (observed label minus predicted probability). The loss function used for deciding the weights of the terminal nodes is likewise adapted from the usual log loss so that it operates on the raw score and does not have to map directly to probabilities. I cannot find the derivation at the moment, but it should be easy to work out.
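Here is a minimal sketch of that idea, assuming standard log-loss pseudo-residuals and plain squared-error regression trees (the Newton-step adjustment of the terminal-node weights mentioned above is omitted for brevity, so this is illustrative rather than a faithful reimplementation of any particular library):

```python
# Toy binary-classification gradient boosting: trees are fit to pseudo-residuals
# of the log loss, and the sigmoid is only applied at prediction time.
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_gbm(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    # Start from the log-odds of the base rate: an unbounded score, not a probability.
    p0 = np.clip(y.mean(), 1e-6, 1 - 1e-6)
    f0 = np.log(p0 / (1 - p0))
    F = np.full(len(y), f0)
    trees = []
    for _ in range(n_trees):
        # Pseudo-residuals: negative gradient of the log loss w.r.t. the raw score F,
        # which reduces to (observed label - current predicted probability).
        residuals = y - sigmoid(F)
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        F += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees


def predict_proba(f0, trees, X, learning_rate=0.1):
    # Sum the unbounded contributions of all trees, then map to a probability once at the end.
    F = f0 + learning_rate * sum(t.predict(X) for t in trees)
    return sigmoid(F)
```

The key point the sketch tries to show is that the "residuals" in classification are differences between labels and probabilities, but the trees themselves predict corrections to the unbounded log-odds score, never probabilities directly.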
EDIT: I found the post where I read this the other day; it's over at Stats StackExchange: https://stats.stackexchange.com/questions/204154/classification-with-gradient-boosting-how-to-keep-the-prediction-in-0-1