
The calibration graph plots predicted versus actual probability (see http://scikit-learn.org/stable/modules/generated/sklearn.calibration.calibration_curve.html). Is it possible to optimize the linearity of that curve directly, in terms of a loss function? Does log-loss actually optimize this curve in terms of KL divergence?
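For context, here is a minimal sketch of what the linked `calibration_curve` function measures. The synthetic data below is an assumption for illustration: labels are drawn so that the predicted probabilities are perfectly calibrated by construction, so the binned curve should hug the diagonal.

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Synthetic, perfectly calibrated predictions:
# each label is drawn with exactly its predicted probability.
rng = np.random.default_rng(0)
p = rng.uniform(0, 1, 10_000)          # predicted probabilities
y = rng.uniform(0, 1, 10_000) < p      # labels drawn with those probabilities

# Bin predictions and compare the mean prediction per bin (prob_pred)
# with the empirical positive rate per bin (prob_true).
prob_true, prob_pred = calibration_curve(y, p, n_bins=10)

# For a calibrated model the two arrays track each other closely,
# i.e. the calibration curve lies near the diagonal.
print(np.max(np.abs(prob_true - prob_pred)))
```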

Ben Reiniger

1 Answer


Log-loss optimizes your predictions in terms of their predicted probabilities, so in essence, yes, it should be optimizing your calibration curve. For example, if you predict a probability of 0% for an example whose true label is the positive class, that is penalised far more heavily than predicting 5% for a positive example, and so on: the penalty grows without bound as a confident prediction gets further from the truth.
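The penalty behaviour described above can be sketched with scikit-learn's `log_loss`. The specific probability values (0.05 and 0.001) are arbitrary choices for illustration; for a single positive example the loss is just the negative log of the predicted probability of the true class.

```python
from sklearn.metrics import log_loss

# One positive example (y = 1); the 1-d predictions are interpreted
# as the probability of the positive class.
y_true = [1]

# Loss is -log(p_predicted_for_true_class), so it blows up
# as the prediction for the true class approaches 0.
loss_at_5_percent = log_loss(y_true, [0.05], labels=[0, 1])    # -ln(0.05) ≈ 3.0
loss_at_0_1_percent = log_loss(y_true, [0.001], labels=[0, 1])  # -ln(0.001) ≈ 6.9

print(loss_at_5_percent, loss_at_0_1_percent)
```

Note that the 5% prediction, while still wrong-looking for a positive example, incurs less than half the penalty of the near-certain 0.1% prediction.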

gkennos