Problem description
I am trying to use a random forest together with grid search, but I get this error:
ValueError: Invalid parameter classifier for estimator Pipeline(steps=[('tfidf_vectorizer',TfidfVectorizer()),('rf_classifier',RandomForestClassifier())]).
Check the list of available parameters with `estimator.get_params().keys()`.
import numpy as np # linear algebra
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn import pipeline,ensemble,preprocessing,feature_extraction,metrics
train=pd.read_json('cleaned_data1')
#split dataset into X,Y
X=train.iloc[:,0]
Y=train.iloc[:,2]
estimators=pipeline.Pipeline([
    ('tfidf_vectorizer',feature_extraction.text.TfidfVectorizer(lowercase=True)),
    ('rf_classifier',ensemble.RandomForestClassifier())
])
print(estimators.get_params().keys())
params = {
    "classifier__max_depth": [3,None],
    "classifier__max_features": [1,3,10],
    "classifier__min_samples_split": [1],  # list truncated in the original post
    "classifier__min_samples_leaf": [1],  # list truncated in the original post
    # "bootstrap": [True,False],
    "classifier__criterion": ["gini","entropy"],
}
X_train,X_test,y_train,y_test=train_test_split(X,Y,test_size=0.2)
rf_classifier=GridSearchCV(estimators,params,cv=10,n_jobs=-1,scoring='accuracy',iid=True)
rf_classifier.fit(X_train,y_train)
y_pred=rf_classifier.predict(X_test)
metrics.confusion_matrix(y_test,y_pred)
print(metrics.accuracy_score(y_test,y_pred))
I have also tried passing these parameters instead:
param_grid = {
'n_estimators': [200,500],'max_features': ['auto','sqrt','log2'],'max_depth' : [4,5,6,7,8],'criterion' :['gini','entropy']
}
I still get the same error.
Solution
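The error message itself points at the likely cause. When the estimator is a Pipeline, every grid key must have the form `<step name>__<parameter>`, and the steps here are named tfidf_vectorizer and rf_classifier, so there is no step called classifier. Changing the prefix to rf_classifier__ should resolve it. The second attempt fails for the same reason: unprefixed keys such as n_estimators are looked up on the Pipeline itself, which has no such parameter. Below is a minimal sketch of the corrected setup. The toy texts and labels, cv=2, and the grid values are illustrative assumptions standing in for cleaned_data1, not the asker's data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for the asker's cleaned_data1 file.
texts = [
    "good movie", "great film", "nice plot", "loved it",
    "bad movie", "awful film", "boring plot", "hated it",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipe = Pipeline([
    ("tfidf_vectorizer", TfidfVectorizer(lowercase=True)),
    ("rf_classifier", RandomForestClassifier(n_estimators=10, random_state=0)),
])

# Keys are "<step name>__<parameter>", so the prefix must be "rf_classifier".
params = {
    "rf_classifier__max_depth": [3, None],
    "rf_classifier__min_samples_split": [2, 3],  # integer values must be >= 2
    "rf_classifier__criterion": ["gini", "entropy"],
}

search = GridSearchCV(pipe, params, cv=2, scoring="accuracy")
search.fit(texts, labels)
print(search.best_params_)
```

Applied to the original listing, this means replacing every classifier__ prefix with rf_classifier__. The print(estimators.get_params().keys()) call already present in the listing shows exactly which names are accepted.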
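To catch this kind of mismatch before fitting, the grid keys can be compared against `estimator.get_params()`, which is the check the error message suggests. A small sketch follows. Note that invalid_grid_keys is a hypothetical helper name introduced here, not part of scikit-learn.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline


def invalid_grid_keys(estimator, param_grid):
    """Return the grid keys the estimator would reject (hypothetical helper)."""
    valid = set(estimator.get_params(deep=True))
    return sorted(key for key in param_grid if key not in valid)


pipe = Pipeline([
    ("tfidf_vectorizer", TfidfVectorizer()),
    ("rf_classifier", RandomForestClassifier()),
])

# Keys as written in the question: wrong step prefix, or no prefix at all.
print(invalid_grid_keys(pipe, {"classifier__max_depth": [3, None]}))
print(invalid_grid_keys(pipe, {"n_estimators": [200, 500]}))
# Keys that use the real step name pass the check.
print(invalid_grid_keys(pipe, {"rf_classifier__n_estimators": [200, 500]}))
```

Depending on the installed scikit-learn version, a few unrelated issues may surface once the names are fixed: integer min_samples_split values must be at least 2, the iid argument of GridSearchCV was removed in release 0.24, and max_features='auto' is no longer accepted from release 1.3 onward.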