XGBClassifier and GridSearchCV / cross_val_score: problem with 'neg_log_loss' scoring

Problem description

I am running GridSearchCV on an XGBClassifier with early stopping, and I want to use 'neg_log_loss' as the scoring function. If I run the following code

from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from xgboost import XGBClassifier

xgb_clsf = XGBClassifier()

X_train, X_val, y_train, y_val = train_test_split(dataset_prepared_stand, dataset_labels, random_state=42)

param_grid = {
    'n_estimators': [2000],
    'learning_rate': [0.05, 0.5, 1.],
    'max_depth': [5, 10, 20],
}

fit_params = {
    "early_stopping_rounds": 250,
    "eval_metric": "logloss",
    "eval_set": [[X_val, y_val]],
    "verbose": 0,
}

scores = ['neg_log_loss', 'roc_auc', 'accuracy', 'f1']

grid_search_xgb_clsf = GridSearchCV(
    xgb_clsf,
    param_grid,
    cv=KFold(n_splits=3, random_state=42, shuffle=True),
    scoring=scores,
    refit='neg_log_loss',
    return_train_score=True,
    verbose=100,
)

grid_search_xgb_clsf.fit(X_train, y_train, **fit_params)

I get the following errors

 RuntimeWarning: divide by zero encountered in log
   loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)
 RuntimeWarning: invalid value encountered in multiply
   loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)
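Both warnings come from taking log(0) inside the log-loss computation. A minimal NumPy sketch (the array values here are illustrative, not from the question's data) reproduces them, and shows that clipping the probabilities away from 0, as log-loss implementations typically do with an eps, keeps the loss finite:

```python
import numpy as np

# One-hot labels, and a prediction that assigns probability 0 to the true class:
transformed_labels = np.array([[1.0, 0.0]])
y_pred = np.array([[0.0, 1.0]])

# log(0) = -inf triggers "divide by zero encountered in log",
# and the loss for this sample becomes inf:
loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)
print(loss)  # [inf]

# "invalid value encountered in multiply" appears when a 0 label
# multiplies a -inf log term: 0 * -inf = nan.
loss_nan = -(transformed_labels * np.log(np.array([[0.0, 0.0]]))).sum(axis=1)
print(loss_nan)  # [nan]

# Clipping the probabilities into (eps, 1 - eps) avoids both:
eps = 1e-15
y_clipped = np.clip(y_pred, eps, 1 - eps)
loss_clipped = -(transformed_labels * np.log(y_clipped)).sum(axis=1)
print(loss_clipped)  # finite, roughly -log(1e-15) ~ 34.5
```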

To avoid this error, I tried replacing the scores variable with my own definition of negative log loss built from the log_loss metric, and that works (no error is shown):

from sklearn.metrics import log_loss, make_scorer, roc_auc_score, accuracy_score, f1_score

def _score_func(estimator, X, y):
    # scorers follow scikit-learn's "greater is better" convention,
    # hence the negation
    score = log_loss(y, estimator.predict_proba(X))
    return -score

scores = {
    'neg_log_loss': _score_func,
    'roc_auc': make_scorer(roc_auc_score),
    'accuracy': make_scorer(accuracy_score),
    'f1': make_scorer(f1_score),
}

Is this a bug, or am I doing something wrong? I would like to keep the same behaviour, to avoid having to define a scoring function for XGBClassifier that differs from the one used for other models. The same thing happens if I simply run cross_validate with 'neg_log_loss' as the scoring.
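For reference, the custom callable above drops into cross_validate (or GridSearchCV's scoring dict) the same way the string names do. A minimal self-contained sketch, using a LogisticRegression on make_classification data as an illustrative stand-in for the question's XGBClassifier and dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_validate

def _score_func(estimator, X, y):
    # negated because scikit-learn scorers treat higher as better
    return -log_loss(y, estimator.predict_proba(X))

X, y = make_classification(n_samples=300, random_state=42)

results = cross_validate(
    LogisticRegression(max_iter=1000),
    X, y,
    cv=3,
    scoring={"neg_log_loss": _score_func},
)
print(results["test_neg_log_loss"])  # one (negative) score per fold
```

The dict form of scoring accepts any callable with the (estimator, X, y) signature, so the same function can be reused across models.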
