Interpreting a custom BERT model with LIME

Problem description

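I am trying to explain the basis on which my model makes its decisions. I have a text classification model that is a BERT hybrid, with an LSTM used in the last layer. The model works well and I get a good F1 score. Now I want to feed it a test article and inspect the model's interpretability. I am trying to use LIME for this, but I get an error. Here is my code.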

import lime
from lime.lime_text import LimeTextExplainer

str_to_predict = article_body_df.text.values[0]
class_names = ['High probability fake ', 'High probability real', 'fake', 'real']
explainer = LimeTextExplainer(class_names=class_names)

# predict is a class that takes a URL, the BERT model, and the hybrid model as input; its methods return the predicted labels
prediction = predict(url, model=model, model_roberta=model_robert)
lab_prob = prediction.pred_label(url, model, model_robert)
# lab_prob = 'real'

# Now I am trying to interpret
exp = explainer.explain_instance(str_to_predict, prediction.pred_label(url, model_roberta=model_robert))  # , num_features=20)
exp.show_in_notebook(text=str_to_predict)

This is the error I get:

TypeError                                 Traceback (most recent call last)
<ipython-input-192-31aece08ff2f> in <module>()
----> 1 exp = explainer.explain_instance(str_to_predict,num_features=20)
      2 exp.show_in_notebook(text=str_to_predict)

1 frames
/usr/local/lib/python3.6/dist-packages/lime/lime_text.py in explain_instance(self,text_instance,classifier_fn,labels,top_labels,num_features,num_samples,distance_metric,model_regressor)
    413         data,yss,distances = self.__data_labels_distances(
    414             indexed_string,
--> 415             distance_metric=distance_metric)
    416         if self.class_names is None:
    417             self.class_names = [str(x) for x in range(yss[0].shape[0])]

/usr/local/lib/python3.6/dist-packages/lime/lime_text.py in __data_labels_distances(self,indexed_string,distance_metric)
    480             data[i,inactive] = 0
    481             inverse_data.append(indexed_string.inverse_removing(inactive))
--> 482         labels = classifier_fn(inverse_data)
    483         distances = distance_fn(sp.sparse.csr_matrix(data))
    484         return data,distances

TypeError: 'str' object is not callable
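
The traceback shows LIME calling classifier_fn on a list of perturbed texts (line 482). In the code above, the second argument to explain_instance is the result of calling prediction.pred_label(...), which appears to be a label string such as 'real', not a function. The following is only a minimal sketch of that situation; call_classifier is a hypothetical stand-in for LIME's internal call, not LIME code.

```python
def call_classifier(classifier_fn, texts):
    # Hypothetical stand-in for the call LIME makes internally:
    #     labels = classifier_fn(inverse_data)
    return classifier_fn(texts)


# Assumption: pred_label(...) returns a plain label string such as 'real'.
label = "real"

try:
    # Passing the *result* of a prediction instead of a function
    call_classifier(label, ["some perturbed text"])
except TypeError as err:
    message = str(err)
    print(message)


# Passing a function object works, because it can be called on the texts.
def dummy_fn(texts):
    return ["real" for _ in texts]


result = call_classifier(dummy_fn, ["a", "b"])
```

This suggests that explain_instance needs the function itself (without parentheses and arguments), so that LIME can call it on its own inputs.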

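I have read several articles and looked at some examples. Most of them use sklearn, where a prediction function is built in (model.pred). That does not work in my case: I cannot simply feed the input to the model (output = model(**input) does not work), because it is a custom model that needs many other inputs and operations.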

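I also saw some examples using BERT in which the input is vectorized before being passed to the explainer. I tried that as well, but then I got an error saying that a string was expected.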
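
The "string expected" error is consistent with how a text explainer works: it takes the raw string, creates many variants with words removed, and hands those variant strings to the classifier function. The sketch below is a simplified illustration of that idea in plain Python, not LIME's actual implementation; perturb and its parameters are hypothetical names.

```python
import random


def perturb(text, num_samples, seed=0):
    """Return the original text plus variants with random words removed.

    Simplified illustration only; LIME's own sampling differs in detail.
    Assumes the text contains at least two words.
    """
    rng = random.Random(seed)
    words = text.split()
    samples = [text]  # the unmodified text comes first
    for _ in range(num_samples - 1):
        # Remove between 1 and len(words) - 1 words, chosen at random
        k = rng.randint(1, len(words) - 1)
        removed = set(rng.sample(range(len(words)), k))
        kept = [w for i, w in enumerate(words) if i not in removed]
        samples.append(" ".join(kept))
    return samples


text = "the senator denied the report on friday"
samples = perturb(text, num_samples=5)
for s in samples:
    print(repr(s))
```

Because the variants are strings, any tokenization or vectorization would have to happen inside the function given to the explainer rather than before it.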

I am a bit confused, please help.

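My predict function takes a URL as input, converts it into the article body, then into a DataFrame, tokenizes it, feeds it to the model, and then predicts the label using argmax.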
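
For reference, LimeTextExplainer.explain_instance expects classifier_fn to accept a list of raw strings and return an array of shape (number of texts, number of classes) holding class probabilities, not an argmax label. A minimal sketch of such a wrapper follows. score_text is a hypothetical stand-in for the tokenization and model forward pass described above (it returns toy logits here), and starting from text rather than from a URL is an assumption.

```python
import numpy as np

CLASS_NAMES = ['High probability fake ', 'High probability real', 'fake', 'real']


def score_text(text):
    """Hypothetical stand-in for: tokenize -> BERT + LSTM model -> logits.

    Returns toy logits derived from the word count so the sketch runs
    without the real model.
    """
    n = len(text.split())
    return np.array([0.1 * n, 0.2, 0.3, 0.05 * n], dtype=float)


def softmax(logits):
    # Subtract the maximum for numerical stability
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()


def classifier_fn(texts):
    """Map a list of strings to an (n_texts, n_classes) probability array."""
    return np.vstack([softmax(score_text(t)) for t in texts])


probs = classifier_fn(["first perturbed article text", "second one"])
print(probs.shape)

# With lime installed, the function object itself would be passed, e.g.:
# exp = explainer.explain_instance(str_to_predict, classifier_fn,
#                                  num_features=20, top_labels=1)
```

Note that explain_instance defaults to num_samples=5000, so a real wrapper around a BERT model would probably need to run the texts in batches, or be called with a smaller num_samples.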

Thanks in advance.


bert-language-model interpretation lime multiclass-classification python