How to calculate pAUC and its derivatives in Python?
According to this paper, maximizing the partial AUC (pAUC) is useful for unbalanced binary classification problems. Since gradient boosting libraries such as LightGBM, CatBoost, and XGBoost do not implement pAUC, using it as a loss function in these libraries requires implementing it yourself, along with its first and second derivatives. I have implemented code to calculate pAUC, but I cannot tell whether it is correct. I also do not know how to implement its derivatives. Do you have any ideas?
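For reference, the quantity I believe my code computes is the following (reconstructed from the code below, so the notation is mine and may not match the paper's exactly), where sigma(t) = 1/(1+e^{-t}), x^+_i are the positive scores, and x^-_{(j)} is the j-th largest negative score:

\widehat{\mathrm{pAUC}}_{\alpha,\beta}
  = \frac{1}{n_+ n_- (\beta - \alpha)} \sum_{i=1}^{n_+} \Big[
      (j_\alpha - \alpha n_-)\,\sigma\big(x^+_i - x^-_{(j_\alpha)}\big)
    + \sum_{j=j_\alpha+1}^{j_\beta} \sigma\big(x^+_i - x^-_{(j)}\big)
    + (\beta n_- - j_\beta)\,\sigma\big(x^+_i - x^-_{(j_\beta+1)}\big)
  \Big],
  \quad j_\alpha = \lceil \alpha n_- \rceil,\; j_\beta = \lfloor \beta n_- \rfloor.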
Assume the pandas DataFrame df has two columns named 'target' and 'pred_proba':
import math

import numpy as np
import pandas as pd


class p_auc:
    def __init__(self, alpha=0.01, beta=0.1):
        # False-positive-rate range [alpha, beta] over which pAUC is computed
        self.alpha = alpha
        self.beta = beta

    def calculate_p_auc(self, df, target_label='target', proba_label='pred_proba'):
        # Negatives are sorted by predicted score, highest first, so index j
        # holds the (j+1)-th largest negative score
        S_pos = df[df[target_label] == 1]
        S_neg = df[df[target_label] == 0].sort_values(proba_label, ascending=False)
        x_pos = S_pos[proba_label].to_numpy()
        x_neg = S_neg[proba_label].to_numpy()
        n_pos = len(x_pos)
        n_neg = len(x_neg)
        # Ranks of the negatives that bracket the FPR range [alpha, beta]
        j_alpha = math.ceil(self.alpha * n_neg)
        j_beta = math.floor(self.beta * n_neg)
        res = 0.0
        for i in range(n_pos):
            # Fractional contribution of the negative at the alpha boundary
            res += (j_alpha - self.alpha * n_neg) * self.sigmoid(x_pos[i], x_neg[j_alpha - 1])
            # Full contributions of the negatives strictly inside (alpha, beta)
            for j in range(j_alpha, j_beta):
                res += self.sigmoid(x_pos[i], x_neg[j])
            # Fractional contribution of the negative at the beta boundary
            res += (self.beta * n_neg - j_beta) * self.sigmoid(x_pos[i], x_neg[j_beta])
        # Normalize so that pAUC lies in [0, 1]
        res /= n_pos * n_neg * (self.beta - self.alpha)
        return res

    def sigmoid(self, a, b):
        # Smooth surrogate for the indicator [a > b]
        return 1 / (1 + np.exp(-(a - b)))
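For the derivatives, my current idea (untested, so please treat it as a sketch rather than a verified implementation) is this: if the sort order of the negatives is treated as fixed between boosting iterations (it only changes at score ties, where pAUC is not differentiable anyway), the surrogate above is a smooth function of the raw scores, and the gradient and a diagonal Hessian of loss = -pAUC can be accumulated pairwise using sigma'(t) = sigma(t)(1 - sigma(t)) and sigma''(t) = sigma'(t)(1 - 2 sigma(t)). The method name calculate_grad_hess and the sign convention are my own choices:

    # Untested sketch: gradient and diagonal Hessian of loss = -pAUC with
    # respect to the raw scores, with the negatives' sort order held fixed.
    def calculate_grad_hess(self, y_true, y_pred):
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred, dtype=float)
        pos_idx = np.where(y_true == 1)[0]
        neg_idx = np.where(y_true == 0)[0]
        # Negatives sorted by score, highest first, as in calculate_p_auc
        neg_idx = neg_idx[np.argsort(-y_pred[neg_idx])]
        n_pos, n_neg = len(pos_idx), len(neg_idx)
        j_alpha = math.ceil(self.alpha * n_neg)
        j_beta = math.floor(self.beta * n_neg)
        norm = n_pos * n_neg * (self.beta - self.alpha)
        # Per-rank weights on the negatives: fractional at the two
        # boundaries of [alpha, beta], 1 strictly inside, 0 outside
        weights = np.zeros(n_neg)
        if j_alpha >= 1:
            weights[j_alpha - 1] += j_alpha - self.alpha * n_neg
        weights[j_alpha:j_beta] += 1.0
        if j_beta < n_neg:
            weights[j_beta] += self.beta * n_neg - j_beta
        grad = np.zeros_like(y_pred)
        hess = np.zeros_like(y_pred)
        for i in pos_idx:
            for rank, j in enumerate(neg_idx):
                w = weights[rank]
                if w == 0.0:
                    continue
                s = 1.0 / (1.0 + np.exp(-(y_pred[i] - y_pred[j])))
                ds = s * (1.0 - s)          # sigma'(t)
                d2s = ds * (1.0 - 2.0 * s)  # sigma''(t)
                # Each pair contributes w * sigma(x_i - x_j) to pAUC;
                # loss = -pAUC flips all the signs
                grad[i] -= w * ds / norm
                grad[j] += w * ds / norm
                hess[i] -= w * d2s / norm
                hess[j] -= w * d2s / norm
        return grad, hess

One caveat I am aware of: sigma'' changes sign, so some Hessian entries come out negative, while LightGBM and XGBoost assume positive Hessians in their leaf-value updates; clipping hess to a small positive floor seems to be the usual workaround. If I understand LightGBM's sklearn wrapper correctly, it accepts a callable objective with signature (y_true, y_pred) -> (grad, hess), so the hookup would look roughly like:

    import lightgbm as lgb

    pauc = p_auc(alpha=0.01, beta=0.1)

    def pauc_objective(y_true, y_pred):
        # y_pred are raw scores when a custom objective is used
        grad, hess = pauc.calculate_grad_hess(y_true, y_pred)
        return grad, np.maximum(hess, 1e-6)  # crude positivity fix

    model = lgb.LGBMClassifier(objective=pauc_objective)

I have no idea whether the sign convention or the fixed-sort-order assumption is acceptable in practice, so corrections are welcome.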
Tags: python, xgboost, lightgbm, catboost