Error "Something is wrong; all the Accuracy metric values are missing" when using partykit, caret, and recipes

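Problem description

I am trying to train about 15 machine learning models using recipes (for consistent preprocessing) and caret (for consistent training). Only 2 models consistently give me the error "Something is wrong; all the Accuracy metric values are missing", both from the partykit package: cforest and ctree. Below I reproduce the error using the PimaIndiansDiabetes dataset from mlbench.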

library(caret)
library(recipes)
library(mlbench)
data(PimaIndiansDiabetes)

my_rec <- recipe(diabetes ~ ., data = PimaIndiansDiabetes) %>%
  step_dummy(all_nominal(), -diabetes) %>%
  step_nzv(all_predictors())

fitControl5 <- trainControl(summaryFunction = twoClassSummary, verboseIter = TRUE, savePredictions = TRUE, sampling = "smote", method = "repeatedcv", number = 5, repeats = 1, classProbs = TRUE)

dtree5 <- train(my_rec, data = PimaIndiansDiabetes, method = "cforest", metric = "Accuracy", tuneLength = 8, trainControl = fitControl5)

note: only 7 unique complexity parameters in default grid. Truncating the grid to 7 .

Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :7     NA's   :7    
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Below is the code for method ctree:

dtree6 <- train(my_rec, data = PimaIndiansDiabetes, method = "ctree", trainControl = fitControl5)
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :8     NA's   :8    
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)

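Any help is much appreciated!

Solution

The argument should be trControl =, not trainControl =. train() has no argument named trainControl, so it does not pick up fitControl5 at all and falls back to its default control settings, which is why the error table reports Accuracy and Kappa rather than ROC, Sens, and Spec. The misnamed argument is instead forwarded through ... to the underlying model-fitting function, where it presumably makes every fit fail and leaves all the resampled metrics as NA. If I run the following, it works: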

dtree5 <- train(my_rec, data = PimaIndiansDiabetes, method = "cforest", metric = "Accuracy", tuneLength = 3, trControl = fitControl5)

Output:

dtree5
Conditional Inference Random Forest 

768 samples
  8 predictor
  2 classes: 'neg','pos' 

Recipe steps: dummy,nzv 
Resampling: Cross-Validated (5 fold,repeated 1 times) 
Summary of sample sizes: 614,615,614,614 
Addtional sampling using SMOTE

Resampling results across tuning parameters:

  mtry  ROC        Sens   Spec     
  2     0.8298281  0.788  0.7013277
  5     0.8256038  0.794  0.7013277
  8     0.8222572  0.798  0.7276031

ROC was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
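
As a rough illustration of why a misspelled argument name goes unnoticed by the wrapper itself, here is a minimal base-R sketch. The functions fit_model() and wrapper() are hypothetical stand-ins, not caret code; the sketch only assumes that the wrapper forwards ... to the fitting routine and converts fit errors into NA, which is similar in spirit to what happens during resampling.

```r
# Hypothetical stand-in for a model-fitting routine that accepts no extra arguments.
fit_model <- function(x, y) {
  mean(y)
}

# Hypothetical stand-in for a wrapper such as train(): it has a `trControl`
# argument and forwards everything captured by `...` to the fitting routine.
wrapper <- function(x, y, trControl = list(verbose = FALSE), ...) {
  # Any error raised by the fit is caught and reported as NA.
  tryCatch(fit_model(x, y, ...), error = function(e) NA_real_)
}

x <- 1:5
y <- c(2, 4, 6, 8, 10)

# Correctly named argument: matched by the wrapper, nothing extra is forwarded.
ok <- wrapper(x, y, trControl = list(verbose = TRUE))

# Misspelled argument: not matched by the wrapper, so it lands in `...`,
# reaches fit_model() as an unused argument, and the fit fails.
bad <- wrapper(x, y, trainControl = list(verbose = TRUE))

print(ok)
print(bad)
```

In this sketch the wrapper never complains about the unknown name; the only visible symptom is the missing result, which is why reading the output of warnings() is usually the quickest way to find the real cause.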