Getting the error "ValueError: discrete class variable expected"

Problem description

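Hi, I am trying to classify a dataset with SklTreeLearner. I preprocessed the data and saved it to a new file. However, when I run the learner on the newly saved file, I get an error.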

Code


#Importing Pandas library
import pandas as pd

#Reading file into data frame
csv_path = 'winequality-white-v3.csv'
df0 = pd.read_csv(csv_path) 

#Filtering missing values
missingchlorides = pd.isna(df0["chlorides"])
missingIndices = df0[missingchlorides].index


#Replacing missing values by mean
meanchlorides = float(df0["chlorides"].mean())
df0["chlorides"] = df0["chlorides"].fillna(meanchlorides)


#Deleting rows with missing values at the EOF
missing_alcohol = pd.isna(df0["alcohol"])
missingIndices = df0[missing_alcohol].index
df1 = df0.drop(missingIndices,axis=0)

#Saving into a csv file
df1.to_csv('filtered-winequality-white-v3.csv') 

#---------------------------------
#Importing Orange Library
from Orange.data import Table,Domain

#Importing SklTreeLearner
from Orange.classification import SklTreeLearner

#Reading the filtered data
Filtered_data = Table.from_file('filtered-winequality-white-v3.csv')

#Defining features
feature_vars = list(Filtered_data.domain.variables[1:6])
class_label_var = Filtered_data.domain.variables[7]

#Defining domain
winequality_domain = Domain(feature_vars,class_label_var)
Filtered_data= Table.from_table(domain=winequality_domain,source=Filtered_data)

print(Filtered_data.domain)
print(Filtered_data.domain.variables)
print(Filtered_data.domain.attributes)
print(Filtered_data.domain.class_var)

#Shuffling and splitting data for training and testing
Filtered_data.shuffle()
train_data_tab = Filtered_data[:1800]
test_data_tab = Filtered_data[1800:]

#creating tree learner and decision tree
tree_learner = SklTreeLearner()
decision_tree = tree_learner(train_data_tab)

Error

ValueError: discrete class variable expected.

I think I need to convert the continuous class variable into a discrete one, but I'm not sure how. Can anyone help?
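One possible direction, offered as a sketch rather than a confirmed fix: the error suggests that Orange read the class column as a continuous variable because it contains only numbers. Turning the scores into string labels in pandas before saving is one way to make the column look categorical. The column name `quality`, the helper `to_discrete_labels`, and the `q` prefix are assumptions made for illustration, and the small DataFrame stands in for the real file.

```python
import pandas as pd


def to_discrete_labels(df: pd.DataFrame, column: str, prefix: str = "q") -> pd.DataFrame:
    """Return a copy of df whose `column` holds string labels instead of numbers.

    Hypothetical helper: the name and the prefix are illustrative only.
    """
    out = df.copy()
    # Cast to int first so that 6.0 becomes "q6" rather than "q6.0".
    out[column] = prefix + out[column].astype(int).astype(str)
    return out


# Tiny stand-in for the wine data; the real column names may differ.
demo = pd.DataFrame({"alcohol": [9.4, 10.1, 12.8], "quality": [5.0, 6.0, 7.0]})
labelled = to_discrete_labels(demo, "quality")
print(labelled["quality"].tolist())
```

If the relabelled frame is then written with `to_csv` and loaded with `Table.from_file`, Orange should, assuming its reader treats a text column with few distinct values as categorical, report a discrete class variable; `print(Filtered_data.domain.class_var)` is a quick way to check. Note that the positional indices used to pick the features and the class (`variables[1:6]`, `variables[7]`) depend on the column layout of the saved file, so they are worth re-checking after any change to how the file is written. If the score is meant to stay numeric, a regression learner would be the alternative to a classifier.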
