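Using 'predict_contrib' in LightGBM to get SHAP values

Problem description

The LightGBM documentation says that setting predict_contrib=True makes the model predict SHAP values.

How can we extract SHAP values (other than by using the shap package)?

I have tried: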

model = LGBM(objective="binary",is_unbalance=True,predict_contrib=True)
model.fit(X_train,y_train)
pred_shap = model.predict(X_train)  # Does not return SHAP values

This does not seem to work.

Solution

SHAP values the LightGBM way, with pred_contrib=True:

from lightgbm.sklearn import LGBMClassifier
from sklearn.datasets import load_iris

X,y = load_iris(return_X_y=True)
lgbm = LGBMClassifier()
lgbm.fit(X,y)
lgbm_shap = lgbm.predict(X,pred_contrib=True)
# Shape of returned LGBM shap values: 4 features x 3 classes + 3 expected values over the training dataset
print(lgbm_shap.shape)
# 0th row of LGBM shap values: the 4 feature contributions for the 0th class
print(lgbm_shap[0,:4])

Output:

(150,15)
[-0.0176954   0.50644615  5.56584344  3.43032313]
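The flat 15-column result can be split into per-class SHAP values and expected values. The sketch below assumes the columns are grouped by class, with each class block holding the feature contributions followed by one expected value; that is consistent with the first four columns above being the class-0 values. The helper name split_pred_contrib and the stand-in array demo are hypothetical and only serve the illustration.

```python
import numpy as np

def split_pred_contrib(contrib, n_features):
    """Split a flat pred_contrib matrix into per-class SHAP values and expected values.

    Assumes each class occupies a block of n_features + 1 columns,
    with the expected value stored in the last column of the block.
    """
    n_samples, width = contrib.shape
    block = n_features + 1
    if width % block != 0:
        raise ValueError("column count is not a multiple of n_features + 1")
    n_classes = width // block
    cube = contrib.reshape(n_samples, n_classes, block)
    shap_per_class = cube[:, :, :-1]   # (n_samples, n_classes, n_features)
    expected = cube[:, :, -1]          # (n_samples, n_classes)
    return shap_per_class, expected

# Stand-in for lgbm_shap: 2 rows, (4 features + 1) * 3 classes = 15 columns
demo = np.arange(2 * 15, dtype=float).reshape(2, 15)
shap_per_class, expected = split_pred_contrib(demo, n_features=4)
print(shap_per_class.shape)   # (2, 3, 4)
print(expected.shape)         # (2, 3)
print(shap_per_class[0, 0])   # [0. 1. 2. 3.]
```

With the iris model above, split_pred_contrib(lgbm_shap, n_features=4) would be the corresponding call.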

SHAP values from shap:

import shap
explainer = shap.TreeExplainer(lgbm)
shap_values = explainer.shap_values(X)
# num of predicted classes
print(len(shap_values))
# shap values for 0th class for 0th row
print(shap_values[0][0])

Output:

3
array([-0.0176954,0.50644615,5.56584344,3.43032313])

They look the same to me.
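Rather than comparing one row by eye, the two results can be checked numerically. This is only a sketch under assumptions: it treats the LightGBM output as class-grouped blocks of n_features + 1 columns, and it accepts shap output either as a list of per-class arrays (as in the snippet above) or as a single array of shape (n_samples, n_features, n_classes), which some shap releases return instead. The helper names and the synthetic data are hypothetical.

```python
import numpy as np

def shap_to_class_major(shap_values):
    """Return shap output as an array of shape (n_classes, n_samples, n_features)."""
    if isinstance(shap_values, list):
        # List layout: one (n_samples, n_features) array per class
        return np.stack(shap_values, axis=0)
    arr = np.asarray(shap_values)
    if arr.ndim != 3:
        raise ValueError("expected a list of 2-D arrays or one 3-D array")
    # Array layout assumed to be (n_samples, n_features, n_classes)
    return np.moveaxis(arr, -1, 0)

def matches_lightgbm(lgbm_contrib, shap_values, n_features):
    """Compare flat pred_contrib output with shap output, ignoring expected values."""
    n_samples = lgbm_contrib.shape[0]
    cube = lgbm_contrib.reshape(n_samples, -1, n_features + 1)[:, :, :-1]
    lgbm_major = np.moveaxis(cube, 1, 0)  # (n_classes, n_samples, n_features)
    return bool(np.allclose(lgbm_major, shap_to_class_major(shap_values)))

# Synthetic stand-ins: 3 classes, 5 samples, 4 features
rng = np.random.default_rng(0)
per_class = rng.normal(size=(3, 5, 4))
bias = rng.normal(size=3)
flat = np.hstack([
    np.hstack([per_class[c], np.full((5, 1), bias[c])]) for c in range(3)
])  # shape (5, 15), laid out like the pred_contrib result

as_list = [per_class[c] for c in range(3)]
as_array = np.moveaxis(per_class, 0, -1)   # (5, 4, 3)
print(matches_lightgbm(flat, as_list, n_features=4))
print(matches_lightgbm(flat, as_array, n_features=4))
```

With the objects from the snippets above, the call would be matches_lightgbm(lgbm_shap, shap_values, n_features=4).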


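The confusion comes from control parameters that are duplicated, under inconsistent names, across two different lightgbm APIs.

The two main APIs each use their own spelling: the Python API's predict() takes pred_contrib (as in the example above), while the C version spells the parameter predict_contrib.

The documentation favors the C version (the Python API spelling is not even recognized as an alias...).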