Using eli5.show_prediction() - NLP Logistic Regression (scikit-learn) - X has 1 features per sample; expecting 13791

Problem description

I have a model pipeline in scikit-learn with a Tfidf vectorizer and a logistic regression.

I am trying to use the eli5.show_prediction function on my text (NLP).

# rand is just a random integer, and feat_ns is the list of all of my feature names.
# X_test is from my train/test split.
# Yes, the brackets around X_test[rand] are funky, but this is what the function asked for.

eli5.show_prediction(pipeline.named_steps['logr'], doc=[[X_test[rand]]], top=30, feature_names=feat_ns)

Error: X has 1 features per sample; expecting 13791

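Solution

I was able to answer my own question.

The error occurs because my X_test variable had not yet been transformed by my Tfidf vectorizer, so it did not have the dimensions the classifier expects (13791 features).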

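The function does not appear to run the data through my pipeline. Since only the classifier step (pipeline.named_steps['logr']) is passed in, the document has to be vectorized before it reaches the function.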

eli5 pandas python scikit-learn