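Custom XGB obj function

Problem description

I'd like to preface this by saying that I'm new to xgboost, pandas and numpy.

I'm currently working on implementing a custom objective function for XGBoost based on the Kelly criterion. The approach is taken from another post on datascience.stackexchange: https://datascience.stackexchange.com/questions/16186/kelly-criterion-in-xgboost-loss-function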

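From reading the XGBoost documentation (https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html), I need to return the gradient and the Hessian. The gradient of the function is:

$$\frac{-(b+1)\,p + b\,x + 1}{(x-1)(b\,x+1)}$$

The Hessian of the function is:

$$-\frac{b^{2}\,p}{(b\,x+1)^{2}} - \frac{1-p}{(1-x)^{2}}$$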

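where:

b = betting odds

p = probability of winning

x = the algorithm's prediction

For this, I will treat p as a binary variable, 1 or 0, indicating whether the bet won.

So p = the true outcome, 1 or 0.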
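As a sanity check on the two derivatives, here is a small sketch that compares them with central finite differences. It assumes the underlying function is the Kelly log-growth p·log(1 + b·x) + (1 − p)·log(1 − x); that form is an assumption, as are the name `kelly_growth` and the sample points, which are purely illustrative.

```python
import numpy as np

def kelly_growth(x, p, b):
    """Assumed Kelly log-growth for prediction x, outcome p and odds b."""
    return p * np.log(1.0 + b * x) + (1.0 - p) * np.log(1.0 - x)

def gradient(x, p, b):
    """First derivative of kelly_growth with respect to x."""
    return (-(b + 1.0) * p + b * x + 1.0) / ((x - 1.0) * (b * x + 1.0))

def hessian(x, p, b):
    """Second derivative of kelly_growth with respect to x."""
    return -(b**2 * p) / (b * x + 1.0) ** 2 - (1.0 - p) / (1.0 - x) ** 2

# Hypothetical evaluation points, chosen only for the check.
x = np.array([0.1, 0.3, 0.5])
p = np.array([1.0, 0.0, 1.0])
b = np.array([0.15, 0.31, 0.40])

# Central finite differences with a small step h.
h = 1e-5
num_grad = (kelly_growth(x + h, p, b) - kelly_growth(x - h, p, b)) / (2 * h)
num_hess = (gradient(x + h, p, b) - gradient(x - h, p, b)) / (2 * h)

grad_ok = np.allclose(gradient(x, p, b), num_grad, atol=1e-6)
hess_ok = np.allclose(hessian(x, p, b), num_hess, atol=1e-6)
print(grad_ok, hess_ok)
```

If both flags come out true, the closed-form expressions agree with the assumed objective at those points.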

Following the documentation, I wrote the code below. I've also included a small sample dataset:

import numpy as np
import xgboost as xgb

kell_train_data = np.array([0.08396877,0.07131547,0.17921676,0.22317006,0.06278754,0.29874458,0.08079682,0.13074108,0.06416036,0.12209199,0.10400956,0.28764891,0.2913481,0.09450234,0.07858831,0.09246751,0.17008012,0.29026032,0.2741014,0.05574227])

odds_train = np.array([0.149254,0.108696,0.312500,0.217391,0.061350,0.208333,0.178571,0.065359,0.037453,0.107527,0.256410,0.400000,0.370370,0.085470,0.058140,0.204082,0.476190,0.294118,0.121951,0.033003])

y_train = np.array([0,1,0]

kell_train_data = kell_train_data.reshape(kell_train_data.shape[0],-1)


def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

def hessian(y_pred,odds = odds_train):
    "compute hessian of betting function"
    
    return (-(((odds**2)*y_true )/(odds*y_pred+1)**2)-((1-y_true)/((1-y_pred)**2)))

def kellyobjfunc(y_pred,odds = odds_train) :
    "kelly objective function for xgboost"
    grad = gradient(y_pred,odds)
    hess = hessian(y_pred,odds)
    return grad,hess

kell_mod = xgb.XGBClassifier(objective = kellyobjfunc,maximize = True)

kell_mod.fit(kell_train_data,y_train)

However, when I run the code above, I get the following error:

Traceback (most recent call last):

  File "<ipython-input-623-18279e95b288>",line 1,in <module>
    kell_mod.fit( kell_target,y_train)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\core.py",line 422,in inner_f
    return f(**kwargs)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py",line 919,in fit
    callbacks=callbacks)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py",line 214,in train
    early_stopping_rounds=early_stopping_rounds)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py",line 101,in _train_internal
    bst.update(dtrain,i,obj)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\core.py",line 1285,in update
    grad,hess = fobj(pred,dtrain)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py",line 49,in inner
    return func(labels,preds)

  File "<ipython-input-621-35f90873cb76>",line 14,in kellyobjfunc
    grad = gradient(y_pred,odds)

  File "<ipython-input-621-35f90873cb76>",line 5,in gradient
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

TypeError: 'numpy.ndarray' object is not callable

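I'm not sure what is causing this. Any insight or help would be greatly appreciated.

Solution

So I found the error.

In the gradient function, the parentheses caused the error: there is no operator between (y_pred-1) and (odds*y_pred+1), so Python treats the second pair of parentheses as a call on the array (y_pred-1), which raises the TypeError.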

def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

It should actually be:

def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1) * y_true +odds * y_pred+1)/((y_pred-1)*(odds*y_pred+1))))
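The TypeError can be reproduced in isolation, which makes the cause easier to see. A minimal sketch with two made-up arrays (the values are illustrative, not taken from the dataset):

```python
import numpy as np

y_pred = np.array([0.2, 0.4])
odds = np.array([0.5, 0.25])

# Adjacent parentheses are call syntax: this tries to call the array (y_pred - 1).
try:
    (y_pred - 1)(odds * y_pred + 1)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# An explicit * gives the intended element-wise product.
product = (y_pred - 1) * (odds * y_pred + 1)
print(product)
```

The caught message is the same one that appears at the bottom of the traceback.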

Also, the xgb model should be:

kell_mod = xgb.XGBClassifier(obj = kellyobjfunc,maximize = True)

The code now executes successfully.
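Two details may be worth checking before relying on the fitted model. The traceback shows the scikit-learn wrapper calling a callable objective as func(labels, preds), and XGBoost minimizes the objective it is given, so a growth function that should be maximized needs its gradient and Hessian negated. A minimal sketch under those assumptions follows; `kelly_objective`, `eps` and the clipping step are hypothetical additions for illustration, not part of the original code.

```python
import numpy as np

def kelly_objective(y_true, y_pred, odds, eps=1e-6):
    """Gradient and Hessian of the negative Kelly log-growth.

    Argument order (y_true, y_pred) follows the func(labels, preds)
    call seen in the traceback. Clipping is an assumption made here
    so that 1 - x never reaches zero.
    """
    x = np.clip(y_pred, 0.0, 1.0 - eps)

    # Derivatives of p*log(1 + b*x) + (1 - p)*log(1 - x) with respect to x.
    grad_growth = odds * y_true / (1.0 + odds * x) - (1.0 - y_true) / (1.0 - x)
    hess_growth = (-(odds**2) * y_true / (1.0 + odds * x) ** 2
                   - (1.0 - y_true) / (1.0 - x) ** 2)

    # Negate both so that minimizing the loss maximizes the growth.
    return -grad_growth, -hess_growth

# Illustrative call with made-up values.
grad, hess = kelly_objective(
    y_true=np.array([1.0, 0.0]),
    y_pred=np.array([0.2, 0.2]),
    odds=np.array([0.5, 0.5]),
)
print(grad, hess)
```

Binding the odds with functools.partial(kelly_objective, odds=odds_train) gives a two-argument callable of the shape the wrapper calls. Whether the resulting model produces sensible stakes on real data is not something this sketch checks.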

numpy objective-function python xgboost
