UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples

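Problem description

I am running a linear SVC classifier on an imbalanced dataset. The target variable is binary, and I upsampled the minority class. Here is the code: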

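import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Separate input features and target
y = df.tickets_class
X = df.drop('tickets_class', axis=1)

# Set up testing and training sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=27)

# Rejoin training features and target so whole rows can be resampled
train_data = pd.concat([X_train, y_train], axis=1)

no_tickets = train_data[train_data.tickets_class == 0]
tickets = train_data[train_data.tickets_class == 1]

# Upsample the minority class to the size of the majority class
tickets_upsampled = resample(tickets, replace=True, n_samples=len(no_tickets), random_state=27)

# Combine majority and upsampled minority
upsampled = pd.concat([no_tickets, tickets_upsampled])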

After that, I define the new X_train and y_train from the upsampled data:

y_train = upsampled.tickets_class
X_train = upsampled.drop('tickets_class', axis=1)
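Before fitting, it may help to confirm that the resampled training set really is balanced and that the untouched test set still contains both classes. The following is only a sketch: the synthetic `df`, its two feature columns, and the 10% positive rate are assumptions standing in for the real data, while the split and upsampling steps mirror the code above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Hypothetical stand-in for the real df: two numeric features, ~10% positives
rng = np.random.default_rng(27)
df = pd.DataFrame({
    "feature_a": rng.normal(size=1000),
    "feature_b": rng.normal(size=1000),
    "tickets_class": (rng.random(1000) < 0.1).astype(int),
})

y = df.tickets_class
X = df.drop("tickets_class", axis=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=27
)

# Upsample the minority class within the training split only
train_data = pd.concat([X_train, y_train], axis=1)
no_tickets = train_data[train_data.tickets_class == 0]
tickets = train_data[train_data.tickets_class == 1]
tickets_upsampled = resample(
    tickets, replace=True, n_samples=len(no_tickets), random_state=27
)
upsampled = pd.concat([no_tickets, tickets_upsampled])

y_train = upsampled.tickets_class
X_train = upsampled.drop("tickets_class", axis=1)

# Class counts: training should be balanced, test keeps the original imbalance
train_counts = y_train.value_counts()
test_counts = y_test.value_counts()
print(train_counts)
print(test_counts)
```

Equal training counts only confirm that the resampling worked as intended; they say nothing about whether the features allow the two classes to be separated.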

Then I run LinearSVC in a very simple way:

from sklearn import svm

clf = svm.LinearSVC(max_iter=10000, dual=False)
clf.fit(X_train, y_train)
clf_pred = clf.predict(X_test)

Finally, when I compute and plot the model results, I get this warning:

 UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. 
 Use `zero_division` parameter to control this behavior. _warn_prf(average,modifier,msg_start,len(result))
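For reference, the `zero_division` parameter mentioned in the message can be passed to scikit-learn's metric functions. A minimal sketch follows, using made-up labels in which the positive class is never predicted. Note that this only controls the reported value and silences the warning; it does not change what the model predicts.

```python
from sklearn.metrics import precision_score, recall_score

# Toy labels (assumed for illustration): the positive class is never predicted
y_true_demo = [0, 1, 0, 1]
y_pred_demo = [0, 0, 0, 0]

# Precision is TP / (TP + FP); with no positive predictions this is 0 / 0,
# so zero_division decides the value that gets reported
precision = precision_score(y_true_demo, y_pred_demo, zero_division=0)

# Recall is TP / (TP + FN) = 0 / 2, which is well defined
recall = recall_score(y_true_demo, y_pred_demo, zero_division=0)

print(precision, recall)
```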

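I understand that the warning appears because the model predicts only zeros for the target variable. My question is: how is that possible when both classes have the same number of samples after upsampling?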
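One way to start investigating is to look at how the predictions are distributed rather than at the summary metrics alone. The sketch below is not a fix, and `summarize_predictions` is a hypothetical helper introduced here for illustration; the demo labels are made up as well.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def summarize_predictions(y_true, y_pred, labels=(0, 1)):
    """Return per-class prediction counts and the confusion matrix."""
    y_pred = np.asarray(y_pred)
    # How many samples were assigned to each class
    counts = {int(label): int(np.sum(y_pred == label)) for label in labels}
    # Rows are true classes, columns are predicted classes
    matrix = confusion_matrix(y_true, y_pred, labels=list(labels))
    return counts, matrix


# Toy example (assumed data): every sample is predicted as class 0
y_true_demo = [0, 1, 0, 1, 0]
y_pred_demo = [0, 0, 0, 0, 0]
counts, matrix = summarize_predictions(y_true_demo, y_pred_demo)
print(counts)
print(matrix)
```

Calling the helper on both `clf.predict(X_train)` and `clf_pred` would show whether the classifier already collapses to a single class on the balanced training data, or only on the test set. Either result narrows down where to look next, for example at feature scaling or at whether the features carry any signal for the minority class.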

python python-2.7 python-3.x