Custom XGBoost objective function

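Problem description

I should say up front that I am new to xgboost, pandas and numpy.

I am currently working on implementing a custom objective function for XGBoost based on the Kelly criterion. The approach is taken from another post on datascience.stackexchange: https://datascience.stackexchange.com/questions/16186/kelly-criterion-in-xgboost-loss-function

From reading the XGBoost documentation, I understand that I need to return the gradient and the Hessian. (https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html) The gradient of the function is: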

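$$\text{gradient} = \frac{-(b+1)\,p + b\,x + 1}{(x-1)(b\,x+1)}$$

The Hessian of the function is:

$$\text{hessian} = -\frac{b^{2}\,p}{(b\,x+1)^{2}} - \frac{1-p}{(1-x)^{2}}$$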

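where:

b = the betting odds

p = the probability of winning

x = the algorithm's prediction

For this purpose, I will treat p as a binary variable, either 1 or 0, indicating whether the bet succeeded.

So p = the true outcome, 1 or 0.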
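The two derivative formulas can be sanity-checked numerically. The sketch below assumes the function being differentiated is the Kelly log-growth p*log(1 + b*x) + (1 - p)*log(1 - x); that function is not written out in this post, so `kelly_growth` and `central_diff` are illustrative names only. It compares the closed-form gradient and Hessian with central finite differences, using plain floats and only the standard library.

```python
import math


def kelly_growth(x, p, b):
    """Assumed objective: log growth for stake x, outcome p, odds b."""
    return p * math.log(1 + b * x) + (1 - p) * math.log(1 - x)


def gradient(x, p, b):
    """First derivative with respect to x, in the form used in the post."""
    return (-(b + 1) * p + b * x + 1) / ((x - 1) * (b * x + 1))


def hessian(x, p, b):
    """Second derivative with respect to x, in the form used in the post."""
    return -(b**2 * p) / (b * x + 1) ** 2 - (1 - p) / (1 - x) ** 2


def central_diff(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x, b = 0.2, 0.3  # arbitrary test point with -1/b < x < 1
for p in (0.0, 1.0, 0.6):
    num_grad = central_diff(lambda t: kelly_growth(t, p, b), x)
    num_hess = central_diff(lambda t: gradient(t, p, b), x)
    grad_ok = abs(num_grad - gradient(x, p, b)) < 1e-6
    hess_ok = abs(num_hess - hessian(x, p, b)) < 1e-6
    print(p, grad_ok, hess_ok)
```

Under that assumption the Hessian is negative wherever the logarithms are defined, which is what one would expect for a function that is meant to be maximized.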

Using the documentation, I wrote the following code; I have also included a small sample dataset:

import numpy as np
import xgboost as xgb

kell_train_data = np.array([0.08396877,0.07131547,0.17921676,0.22317006,0.06278754,0.29874458,0.08079682,0.13074108,0.06416036,0.12209199,0.10400956,0.28764891,0.2913481,0.09450234,0.07858831,0.09246751,0.17008012,0.29026032,0.2741014,0.05574227])

odds_train = np.array([0.149254,0.108696,0.312500,0.217391,0.061350,0.208333,0.178571,0.065359,0.037453,0.107527,0.256410,0.400000,0.370370,0.085470,0.058140,0.204082,0.476190,0.294118,0.121951,0.033003])

y_train = np.array([0,1,0]

kell_train_data = kell_train_data.reshape(kell_train_data.shape[0],-1)


def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

def hessian(y_pred,odds = odds_train):
    "compute hessian of betting function"
    
    return (-(((odds**2)*y_true )/(odds*y_pred+1)**2)-((1-y_true)/((1-y_pred)**2)))

def kellyobjfunc(y_pred,odds = odds_train) :
    "kelly objective function for xgboost"
    grad = gradient(y_pred,odds)
    hess = hessian(y_pred,odds)
    return grad,hess

kell_mod = xgb.XGBClassifier(objective = kellyobjfunc,maximize = True)

kell_mod.fit(kell_train_data,y_train)

However, when I run the code above, I get the following error:

Traceback (most recent call last):

  File "<ipython-input-623-18279e95b288>",line 1,in <module>
    kell_mod.fit( kell_target,y_train)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\core.py",line 422,in inner_f
    return f(**kwargs)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py",line 919,in fit
    callbacks=callbacks)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py",line 214,in train
    early_stopping_rounds=early_stopping_rounds)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py",line 101,in _train_internal
    bst.update(dtrain,i,obj)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\core.py",line 1285,in update
    grad,hess = fobj(pred,dtrain)

  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py",line 49,in inner
    return func(labels,preds)

  File "<ipython-input-621-35f90873cb76>",line 14,in kellyobjfunc
    grad = gradient(y_pred,odds)

  File "<ipython-input-621-35f90873cb76>",line 5,in gradient
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

TypeError: 'numpy.ndarray' object is not callable

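I am not sure what is causing this. Any insight or help would be greatly appreciated.

Solution

So I found the error.

In the gradient function, the error came from the parentheses: the two factors in the denominator were written back to back with no * between them, so Python tried to call the first array as a function.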

def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

It should actually be:

def gradient(y_pred,y_true,odds = odds_train):
    "Compute gradient of betting function"
    
    
    return (((-(odds+1) * y_true +odds * y_pred+1)/((y_pred-1)*(odds*y_pred+1))))
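The difference between the two versions can be reproduced in isolation. This is a minimal sketch that uses plain floats so it needs no third-party packages; the values 0.2 and 0.3 are arbitrary. With NumPy arrays the same expression fails in the same way, and the message names 'numpy.ndarray' instead, as in the traceback above.

```python
y_pred, odds = 0.2, 0.3  # arbitrary illustrative values

# Missing operator: Python parses "(a)(b)" as "call a with argument b".
try:
    denominator = (y_pred - 1)(odds * y_pred + 1)
except TypeError as exc:
    message = str(exc)
    print(message)

# An explicit * gives the intended product of the two factors.
denominator = (y_pred - 1) * (odds * y_pred + 1)
print(denominator)
```

Python has no implicit multiplication, so every product in a formula copied from mathematical notation needs its own * operator.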

In addition, the xgb model should be:

kell_mod = xgb.XGBClassifier(obj = kellyobjfunc,maximize = True)

The code now executes successfully.
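It may be worth checking that the custom function is actually being called. The XGBoost documentation lists objective as the scikit-learn wrapper's keyword for a callable, while obj is the keyword of xgb.train. The traceback in the question also shows the wrapper calling func(labels, preds), so the labels come first. The sketch below is one way to write such a callable under those assumptions; `make_kelly_objective` is a hypothetical helper, not part of XGBoost. Because XGBoost minimizes the objective it is given, the sketch negates the derivatives of the growth function. It uses only arithmetic operators, so it accepts floats or NumPy arrays alike.

```python
def make_kelly_objective(odds):
    """Hypothetical helper: bind the odds and return an objective callable."""

    def kelly_objective(y_true, y_pred):
        # Derivatives of the growth function with respect to the prediction,
        # using the formulas from the post (y_true plays the role of p).
        grad = (-(odds + 1) * y_true + odds * y_pred + 1) / (
            (y_pred - 1) * (odds * y_pred + 1)
        )
        hess = -(odds**2 * y_true) / (odds * y_pred + 1) ** 2 - (1 - y_true) / (
            1 - y_pred
        ) ** 2
        # Negate both so that minimizing the loss maximizes the growth.
        return -grad, -hess

    return kelly_objective


objective = make_kelly_objective(odds=0.3)  # scalar odds, for illustration
grad, hess = objective(1.0, 0.2)  # a winning bet with prediction 0.2
print(grad, hess)
```

With arrays, the callable could be passed as xgb.XGBClassifier(objective=make_kelly_objective(odds_train)); that call is not exercised here. The sketch also assumes every prediction stays strictly between -1/odds and 1, which XGBoost does not guarantee.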
