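Error using sklearn2pmml with VotingClassifier

Problem description

I am new to programming and am having trouble saving a model to PMML. I have a dataset from which I need to select attributes, then apply majority voting, and finally save the result as PMML. The majority-voting part works, but when I save the model with sklearn2pmml on the last line, it raises an error.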


Code that triggers the error

from pandas import read_csv
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import VotingClassifier  # public import path, not the private _voting module
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
    
url = 'D:/treinamento.CSV'
df = read_csv(url,header=None)
data = df.values

url_test = 'D:/TESTE.CSV'
df_test = read_csv(url_test,header=None)
data_test = df_test.values

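# Assumption: the first file is the training set, the second is the test set,
# and the last column of each file holds the class label.
X_train, y_train = data[:, :-1], data[:, -1]
X_test, y_test = data_test[:, :-1], data_test[:, -1]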

#features selection
features1 = [2,5,7]
features2 = [0,1,4,7]
features3 = [0,6]
features4 = [1,4]

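numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
# Every ColumnTransformer entry must be (name, transformer, columns). The original
# left the transformer out for preprocessor2-4; numeric_transformer is assumed here.
preprocessor1 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features1)])
preprocessor2 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features2)])
preprocessor3 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features3)])
preprocessor4 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features4)])

# Each voter is a (name, estimator) pair: a pipeline that selects its own
# feature subset and then fits a decision tree on it.
pipeline = PMMLPipeline([
  ("classifier", VotingClassifier([
    ("pipe1", Pipeline(steps=[('preprocessor', preprocessor1), ('classifier', DecisionTreeClassifier(min_samples_split=2))])),
    ("pipe2", Pipeline(steps=[('preprocessor', preprocessor2), ('classifier', DecisionTreeClassifier(min_samples_split=2))])),
    ("pipe3", Pipeline(steps=[('preprocessor', preprocessor3), ('classifier', DecisionTreeClassifier(min_samples_split=2))])),
    ("pipe4", Pipeline(steps=[('preprocessor', preprocessor4), ('classifier', DecisionTreeClassifier(min_samples_split=2))]))
  ]))
])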

pipeline.fit(X_train,y_train)
yhat = pipeline.predict(X_test)
accuracy = accuracy_score(y_test,yhat)
print('Accuracy: %.3f' % (accuracy * 100))
print(yhat)
sklearn2pmml(pipeline,"D:/FOLD/eclf.pmml",with_repr = True)
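
For reference, the majority vote mentioned in the question can be sketched in plain Python. This is only an illustration of hard voting (the default `voting='hard'` mode of `VotingClassifier`), not scikit-learn's actual implementation. The function name `hard_vote` and the prediction values are made up for the example, and the tie rule (smallest label wins) is meant to mirror scikit-learn's documented ascending-order tie-breaking.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label among one sample's per-voter predictions.

    Ties go to the smallest label, mirroring scikit-learn's documented
    behaviour of choosing by ascending class order.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


# Hypothetical predictions from four voters for three samples.
per_voter = [
    [0, 1, 1],  # pipe1
    [0, 1, 0],  # pipe2
    [1, 1, 0],  # pipe3
    [0, 0, 1],  # pipe4
]

# zip(*per_voter) regroups the predictions sample by sample.
ensemble = [hard_vote(sample) for sample in zip(*per_voter)]
print(ensemble)
```

The third sample is a 2-2 tie, so the smaller label is chosen.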
