Select threshold (cut-off point) for binary classification by desired FPR percentage value

Question

I want to recreate the catboost.utils.select_threshold(desc) method for a CalibratedClassifierCV model.

In CatBoost I can pass a desired FPR value and get back the boundary (threshold) at which that FPR is reached.

My goal is to apply the same logic after computing fpr, tpr and thresholds with sklearn.metrics.roc_curve.

I have the following code:

import numpy as np
from sklearn import metrics

# predicted probability of the positive class
prob_pred = model.predict_proba(X[features_list])[:, 1]

fpr, tpr, thresholds = metrics.roc_curve(X['target'], prob_pred)

optimal_idx = np.argmax(tpr - fpr) # here I need to use FPR=0.1
boundary = thresholds[optimal_idx]

binary_pred = [1 if i >= boundary else 0 for i in prob_pred]

I guess there should be a simple formula, but I am not sure where to plug in the 0.1 value to adjust the threshold.

score 1 · Accepted Answer · answered Sep 02 '22 at 10:38

After some research and testing, it turns out to be this simple:

from sklearn.metrics import roc_curve

def select_threshold(proba, target, fpr_max=0.1):
    # compute the ROC curve; fpr is non-decreasing as thresholds decrease
    fpr, tpr, thresholds = roc_curve(target, proba)
    # take the last (lowest) threshold whose FPR is still <= fpr_max
    best_threshold = thresholds[fpr <= fpr_max][-1]

    return best_threshold
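
To make the selection rule easier to check by hand, here is a minimal pure-Python sketch of the same idea without sklearn: scan candidate thresholds from high to low and keep the lowest one whose FPR stays within the budget. The function name `threshold_for_fpr` and the toy scores and labels are made up for illustration. It also evaluates every distinct score, whereas roc_curve by default (drop_intermediate=True) may skip some thresholds, so the two can return slightly different cut-offs on real data.

```python
def threshold_for_fpr(scores, labels, fpr_max=0.1):
    """Lowest threshold t such that FPR of (score >= t) is <= fpr_max.

    Hypothetical helper for illustration; labels are 0 (negative) or 1 (positive).
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("need at least one negative sample to compute FPR")

    # Fallback: a threshold above every score predicts no positives, so FPR = 0.
    best = float("inf")
    for t in sorted(set(scores), reverse=True):
        # FPR = share of negatives that would be predicted positive at t
        fpr = sum(s >= t for s in negatives) / len(negatives)
        if fpr > fpr_max:
            break  # FPR can only grow as the threshold drops further
        best = t
    return best


# Toy data (made up): 4 positives, 6 negatives
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

boundary = threshold_for_fpr(scores, labels, fpr_max=0.2)
binary_pred = [1 if s >= boundary else 0 for s in scores]
```

With fpr_max=0.2 only one of the six negatives may be flagged, so the scan stops just before the second negative would be let in. Because it picks the lowest admissible threshold, TPR is as high as it can be for that FPR budget.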
