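Problem description
I am trying to train about 15 machine learning models using recipes (for consistent preprocessing) and caret (for consistent training). Only 2 models, cforest and ctree from the partykit package, consistently give me the error "Something is wrong; all the Accuracy metric values are missing". Below I reproduce the error with the PimaIndiansDiabetes dataset from mlbench.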
library(caret)
library(recipes)
library(mlbench)
data(PimaIndiansDiabetes)

my_rec <- recipe(diabetes ~ ., data = PimaIndiansDiabetes) %>%
  step_dummy(all_nominal(), -diabetes) %>%
  step_nzv(all_predictors())
fitControl5 <- trainControl(summaryFunction = twoClassSummary, verboseIter = TRUE, savePredictions = TRUE, sampling = "smote", method = "repeatedcv", number = 5, repeats = 1, classProbs = TRUE)
dtree5 <- train(my_rec, data = PimaIndiansDiabetes, method = "cforest", metric = "Accuracy", tuneLength = 8, trainControl = fitControl5)
note: only 7 unique complexity parameters in default grid. Truncating the grid to 7 .
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :7     NA's   :7
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Below is the code for the ctree method:
dtree6 <- train(my_rec, method = "ctree", trainControl = fitControl5)
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :8     NA's   :8
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Any help is much appreciated!
Solution
The argument should be trControl =, not trainControl =. If I run the following, it works:
dtree5 <- train(my_rec, data = PimaIndiansDiabetes, method = "cforest", metric = "Accuracy", tuneLength = 3, trControl = fitControl5)
Output:
dtree5
Conditional Inference Random Forest
768 samples
8 predictor
2 classes: 'neg', 'pos'
Recipe steps: dummy, nzv
Resampling: Cross-Validated (5 fold, repeated 1 times)
Summary of sample sizes: 614, 615, 614, 614
Addtional sampling using SMOTE
Resampling results across tuning parameters:
  mtry  ROC        Sens   Spec
  2     0.8298281  0.788  0.7013277
  5     0.8256038  0.794  0.7013277
  8     0.8222572  0.798  0.7276031
ROC was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
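A likely reason the misspelling breaks only some models: train() accepts extra arguments through ... and passes them on to the underlying fitting function. A fitting function that has no ... of its own rejects the unknown trainControl argument, so every fit fails and every metric is NA. A function that accepts ... just ignores it. The base-R sketch below illustrates this mechanism with toy stand-ins. toy_train, strict_fit and lenient_fit are hypothetical names invented for this example, not caret or party functions, and the sketch assumes (rather than proves) that cforest and ctree behave like strict_fit.

# Toy stand-ins (hypothetical, not caret's real code) showing how a
# misspelled argument travels through `...`.

# A fitting function WITHOUT `...`: unknown arguments raise an error.
strict_fit <- function(x, controls = NULL) {
  "strict fit ok"
}

# A fitting function WITH `...`: unknown arguments are silently ignored.
lenient_fit <- function(x, ...) {
  "lenient fit ok"
}

# A train()-like wrapper: any argument that does not match a formal name
# lands in `...` and is forwarded to the fitting function.
toy_train <- function(x, fit_fun, ..., trControl = "default control") {
  result <- tryCatch(
    fit_fun(x, ...),
    error = function(e) NA_character_  # a failed fit becomes a missing value
  )
  list(control_used = trControl, fit = result)
}

# Misspelled argument: the control object never reaches trControl.
wrong_strict  <- toy_train(1:3, strict_fit,  trainControl = "my control")
wrong_lenient <- toy_train(1:3, lenient_fit, trainControl = "my control")

# Correct spelling: the control is used and the strict fit succeeds.
right_strict  <- toy_train(1:3, strict_fit,  trControl = "my control")

str(wrong_strict)
str(wrong_lenient)
str(right_strict)

In this toy setup the lenient case runs without complaint but still uses the default control, which suggests that the other models in the original run may also have been trained without fitControl5 even though they reported no error.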