Problem description
The variable grid.best_estimator_ holds the decision tree model found by GridSearchCV.
for subset in range(len(smol_X_train)):
    temp_tree = grid.best_estimator_.fit(smol_X_train[subset], smol_y_train[subset])
    pred = temp_tree.predict(X_test)
    accuracy = accuracy_score(y_test, pred)
    print(accuracy)
Output:
0.827
0.7025
0.782
0.7205
..
..
0.8365
0.8395
With the models stored in a list:
tree_list = []
for subset in range(len(smol_X_train)):
    temp_tree = grid.best_estimator_.fit(smol_X_train[subset], smol_y_train[subset])
    tree_list.append(temp_tree)
for one_tree in tree_list:
    pred = one_tree.predict(X_test)
    accuracy = accuracy_score(y_test, pred)
    print(accuracy)
Output:
0.8395
0.8395
0.8395
0.8395
..
..
0.8395
0.8395
Every model in the list returns the same score (the score of the last model).
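The identical scores are consistent with every list entry being the same object. scikit-learn estimators return self from fit(), so appending the return value stores another reference to one estimator, not a new model. Below is a minimal sketch of that aliasing in plain Python. TinyEstimator is a hypothetical stand-in written only for illustration; it is not a scikit-learn class.

```python
class TinyEstimator:
    """Hypothetical stand-in that mimics the fit()-returns-self convention."""

    def fit(self, X, y):
        # Refitting overwrites the learned state in place.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


est = TinyEstimator()
models = []
for y in ([0, 0, 0], [10, 10, 10]):
    # fit() returns est itself, so the list collects the same object twice.
    models.append(est.fit([[1], [2], [3]], y))

same_object = all(m is est for m in models)
predictions = [m.predict([[0]])[0] for m in models]
print(same_object)   # True
print(predictions)   # [10.0, 10.0]
```

Both entries predict from the second fit, because the first fit was overwritten.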
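Solution
Clone the model, fit the clone, and then append it to the list, instead of appending the model directly. fit() returns the estimator itself, so without cloning every entry in the list refers to the same object, which holds only the last fit.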
from sklearn.base import clone
tree_list = []
for subset in range(len(smol_X_train)):
    # Clone before fitting: clone() returns a new, unfitted estimator with the same hyperparameters.
    temp_tree = clone(grid.best_estimator_).fit(smol_X_train[subset], smol_y_train[subset])
    tree_list.append(temp_tree)
    pred = temp_tree.predict(X_test)
    accuracy = accuracy_score(y_test, pred)
    print(accuracy)
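For anyone who wants to try this without the original data, here is a self-contained sketch of the clone-then-fit pattern. The synthetic dataset, the three-way split, and base_tree (standing in for grid.best_estimator_) are all assumptions made for illustration, not part of the original question.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the question's dataset (assumption).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical stand-in for grid.best_estimator_.
base_tree = DecisionTreeClassifier(max_depth=3, random_state=0)

# Three training subsets, playing the role of smol_X_train / smol_y_train.
smol_X_train = np.array_split(X_train, 3)
smol_y_train = np.array_split(y_train, 3)

tree_list = []
for X_sub, y_sub in zip(smol_X_train, smol_y_train):
    # Each iteration fits a fresh copy, so earlier models are not overwritten.
    tree_list.append(clone(base_tree).fit(X_sub, y_sub))

scores = [accuracy_score(y_test, tree.predict(X_test)) for tree in tree_list]
print(scores)
```

Each entry in tree_list is now a separate fitted object, and base_tree itself is left unfitted. copy.deepcopy on an already fitted model would be another way to keep a snapshot, but cloning before fitting avoids copying fitted state that is about to be replaced.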