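Problem description
Hi, I am trying to classify a dataset with SklTreeLearner. I preprocessed the data and saved it to a new file, but when I run the learner on the newly saved file I get an error.
Code: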
#Importing Pandas library
import pandas as pd
#Reading file into data frame
csv_path = 'winequality-white-v3.csv'
df0 = pd.read_csv(csv_path)
#Filtering missing values
missingchlorides = pd.isna(df0["chlorides"])
missingIndices = df0[missingchlorides].index
#Replacing missing values by mean
meanchlorides = float(df0["chlorides"].mean())
df0["chlorides"] = df0["chlorides"].fillna(meanchlorides)
#Deleting rows with a missing "alcohol" value at the end of the file
missing_winequality = pd.isna(df0["alcohol"])
missingIndices = df0[missing_winequality].index
df1 = df0.drop(missingIndices,axis=0)
#Saving into a csv file
df1.to_csv('filtered-winequality-white-v3.csv')
#---------------------------------
#Importing Orange Library
from Orange.data import Table,Domain
#Importing SklTreeLearner
from Orange.classification import SklTreeLearner
#Reading the filtered data
Filtered_data = Table.from_file('filtered-winequality-white-v3.csv')
#Defining features
feature_vars = list(Filtered_data.domain.variables[1:6])
class_label_var = Filtered_data.domain.variables[7]
#Defining domain
winequality_domain = Domain(feature_vars,class_label_var)
Filtered_data = Table.from_table(domain=winequality_domain,source=Filtered_data)
print(Filtered_data.domain)
print(Filtered_data.domain.variables)
print(Filtered_data.domain.attributes)
print(Filtered_data.domain.class_var)
#Shuffling and splitting data for training and testing
Filtered_data.shuffle()
train_data_tab = Filtered_data[:1800]
test_data_tab = Filtered_data[1800:]
#creating tree learner and decision tree
tree_learner = SklTreeLearner()
decision_tree = tree_learner(train_data_tab)
Error:
ValueError: discrete class variable expected.
I think I need to convert the continuous variable into a discrete one, but I am not sure how. Can anyone help?
Solution
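The error suggests that the column picked as class_label_var was read as a continuous (numeric) variable, while SklTreeLearner is a classifier and expects a discrete one. Below is a minimal sketch of one possible fix, not a definitive one. It assumes the class column holds a small set of numeric scores and has no missing values, and it rebuilds the table with a DiscreteVariable whose values are the distinct scores. The helper names encode_labels and with_discrete_class are hypothetical and exist only for this illustration.

```python
def encode_labels(values):
    """Map numeric target values to (labels, indices).

    labels  -- sorted distinct values rendered as strings
    indices -- position of each input value within `labels`
    Assumes `values` contains no missing values (NaN).
    """
    unique = sorted(set(values))
    position = {v: i for i, v in enumerate(unique)}
    labels = [format(v, "g") for v in unique]
    indices = [position[v] for v in values]
    return labels, indices


def with_discrete_class(data):
    """Return a new Orange Table whose class variable is discrete.

    `data` is assumed to be an Orange Table with a single
    continuous class variable.
    """
    # Imported here so encode_labels() stays usable without Orange installed.
    import numpy as np
    from Orange.data import DiscreteVariable, Domain, Table

    labels, indices = encode_labels(data.Y.tolist())
    class_var = DiscreteVariable(data.domain.class_var.name, values=labels)
    domain = Domain(data.domain.attributes, class_var)
    # Orange stores discrete values as float indices into `values`.
    return Table.from_numpy(domain, data.X, np.array(indices, dtype=float))
```

In the code from the question, this would go right after the Table.from_table(...) call, for example Filtered_data = with_discrete_class(Filtered_data), before shuffling and splitting. If the target column has many distinct values, binning it first or switching to a regression tree learner is probably a better fit than treating every value as its own class.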