How to calculate pAUC and its derivatives in Python?
According to this paper, maximizing the partial AUC (pAUC) is useful for unbalanced binary classification problems. Since gradient boosting libraries such as LightGBM, CatBoost, and XGBoost do not implement pAUC, using it as a loss function in these libraries requires implementing it yourself, along with its first and second derivatives. I have implemented code to calculate pAUC, but I cannot tell whether it is correct. I also do not know how to implement its derivatives. Do you have any ideas?
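For reference, the quantity I believe my code computes is the following (reconstructed from the code below, so the notation is mine and may not match the paper's exactly), where sigma(t) = 1/(1+e^{-t}), x^+_i are the positive scores, and x^-_{(j)} is the j-th largest negative score:

\widehat{\mathrm{pAUC}}_{\alpha,\beta}
  = \frac{1}{n_+ n_- (\beta - \alpha)} \sum_{i=1}^{n_+} \Big[
      (j_\alpha - \alpha n_-)\,\sigma\big(x^+_i - x^-_{(j_\alpha)}\big)
    + \sum_{j=j_\alpha+1}^{j_\beta} \sigma\big(x^+_i - x^-_{(j)}\big)
    + (\beta n_- - j_\beta)\,\sigma\big(x^+_i - x^-_{(j_\beta+1)}\big)
  \Big],
  \quad j_\alpha = \lceil \alpha n_- \rceil,\; j_\beta = \lfloor \beta n_- \rfloor.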
Assume the pandas DataFrame df has two columns named 'target' and 'pred_proba':
import math

import numpy as np
import pandas as pd


class p_auc:
    def __init__(self, alpha=0.01, beta=0.1):
        # False-positive-rate range [alpha, beta] over which pAUC is computed
        self.alpha = alpha
        self.beta = beta

    def calculate_p_auc(self, df, target_label='target', proba_label='pred_proba'):
        # Negatives are sorted by predicted score, highest first, so index j
        # holds the (j+1)-th largest negative score
        S_pos = df[df[target_label] == 1]
        S_neg = df[df[target_label] == 0].sort_values(proba_label, ascending=False)
        x_pos = S_pos[proba_label].to_numpy()
        x_neg = S_neg[proba_label].to_numpy()
        n_pos = len(x_pos)
        n_neg = len(x_neg)
        # Ranks of the negatives that bracket the FPR range [alpha, beta]
        j_alpha = math.ceil(self.alpha * n_neg)
        j_beta = math.floor(self.beta * n_neg)
        res = 0.0
        for i in range(n_pos):
            # Fractional contribution of the negative at the alpha boundary
            res += (j_alpha - self.alpha * n_neg) * self.sigmoid(x_pos[i], x_neg[j_alpha - 1])
            # Full contributions of the negatives strictly inside (alpha, beta)
            for j in range(j_alpha, j_beta):
                res += self.sigmoid(x_pos[i], x_neg[j])
            # Fractional contribution of the negative at the beta boundary
            res += (self.beta * n_neg - j_beta) * self.sigmoid(x_pos[i], x_neg[j_beta])
        # Normalize so that pAUC lies in [0, 1]
        res /= n_pos * n_neg * (self.beta - self.alpha)
        return res

    def sigmoid(self, a, b):
        # Smooth surrogate for the indicator [a > b]
        return 1 / (1 + np.exp(-(a - b)))
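For the derivatives, my current idea (untested, so please treat it as a sketch rather than a verified implementation) is this: if the sort order of the negatives is treated as fixed between boosting iterations (it only changes at score ties, where pAUC is not differentiable anyway), the surrogate above is a smooth function of the raw scores, and the gradient and a diagonal Hessian of loss = -pAUC can be accumulated pairwise using sigma'(t) = sigma(t)(1 - sigma(t)) and sigma''(t) = sigma'(t)(1 - 2 sigma(t)). The method name calculate_grad_hess and the sign convention are my own choices:

    # Untested sketch: gradient and diagonal Hessian of loss = -pAUC with
    # respect to the raw scores, with the negatives' sort order held fixed.
    def calculate_grad_hess(self, y_true, y_pred):
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred, dtype=float)
        pos_idx = np.where(y_true == 1)[0]
        neg_idx = np.where(y_true == 0)[0]
        # Negatives sorted by score, highest first, as in calculate_p_auc
        neg_idx = neg_idx[np.argsort(-y_pred[neg_idx])]
        n_pos, n_neg = len(pos_idx), len(neg_idx)
        j_alpha = math.ceil(self.alpha * n_neg)
        j_beta = math.floor(self.beta * n_neg)
        norm = n_pos * n_neg * (self.beta - self.alpha)
        # Per-rank weights on the negatives: fractional at the two
        # boundaries of [alpha, beta], 1 strictly inside, 0 outside
        weights = np.zeros(n_neg)
        if j_alpha >= 1:
            weights[j_alpha - 1] += j_alpha - self.alpha * n_neg
        weights[j_alpha:j_beta] += 1.0
        if j_beta < n_neg:
            weights[j_beta] += self.beta * n_neg - j_beta
        grad = np.zeros_like(y_pred)
        hess = np.zeros_like(y_pred)
        for i in pos_idx:
            for rank, j in enumerate(neg_idx):
                w = weights[rank]
                if w == 0.0:
                    continue
                s = 1.0 / (1.0 + np.exp(-(y_pred[i] - y_pred[j])))
                ds = s * (1.0 - s)          # sigma'(t)
                d2s = ds * (1.0 - 2.0 * s)  # sigma''(t)
                # Each pair contributes w * sigma(x_i - x_j) to pAUC;
                # loss = -pAUC flips all the signs
                grad[i] -= w * ds / norm
                grad[j] += w * ds / norm
                hess[i] -= w * d2s / norm
                hess[j] -= w * d2s / norm
        return grad, hess

One caveat I am aware of: sigma'' changes sign, so some Hessian entries come out negative, while LightGBM and XGBoost assume positive Hessians in their leaf-value updates; clipping hess to a small positive floor seems to be the usual workaround. If I understand LightGBM's sklearn wrapper correctly, it accepts a callable objective with signature (y_true, y_pred) -> (grad, hess), so the hookup would look roughly like:

    import lightgbm as lgb

    pauc = p_auc(alpha=0.01, beta=0.1)

    def pauc_objective(y_true, y_pred):
        # y_pred are raw scores when a custom objective is used
        grad, hess = pauc.calculate_grad_hess(y_true, y_pred)
        return grad, np.maximum(hess, 1e-6)  # crude positivity fix

    model = lgb.LGBMClassifier(objective=pauc_objective)

I have no idea whether the sign convention or the fixed-sort-order assumption is acceptable in practice, so corrections are welcome.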
Tags: python, xgboost, lightgbm, catboost