How to compute 95% CIs for accuracy and kappa in caret

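Problem description

I am using the caret package to run repeated k-fold cross-validation, and I would like to compute a confidence interval for my accuracy metric. This tutorial prints a caret train object that shows the accuracy/kappa metrics together with their SDs: https://machinelearningmastery.com/tune-machine-learning-algorithms-in-r/. However, when I do the same, only the metric means are listed.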

control <- trainControl(method="repeatedcv",number=10,repeats=3,search="grid")
set.seed(12345)
tunegrid <- expand.grid(.mtry=4)
rf_gridsearch <- train(as.factor(gear)~.,data=mtcars,method="rf",metric="Accuracy",tuneGrid=tunegrid,trControl=control)
print(rf_gridsearch)

> print(rf_gridsearch)
Random Forest 

32 samples
10 predictors
 3 classes: '3', '4', '5' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times) 
Summary of sample sizes: 29, 28, 30, 29, 27, ... 
Resampling results:

  Accuracy   Kappa    
  0.8311111  0.7021759

Tuning parameter 'mtry' was held constant at a value of 4

Solution

It looks like the SDs are stored in the results element of the returned train object.

> rf_gridsearch$results
  mtry  Accuracy     Kappa AccuracySD   KappaSD
1    4 0.7572222 0.6046465  0.2088411 0.3387574

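Using the critical z value of 1.96, an approximate 95% interval is the mean plus or minus 1.96 times the SD. Note that AccuracySD is the standard deviation across resamples, not a standard error, so the interval below describes the spread of per-resample accuracy and can extend past 1.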

> rf_gridsearch$results$Accuracy+c(-1,1)*1.96*rf_gridsearch$results$AccuracySD
[1] 0.3478936 1.1665509
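
For an interval on the mean accuracy itself, one common variation is to use the standard error, SD / sqrt(n), where n is the number of resamples (30 for 10 folds repeated 3 times). The following is a minimal sketch under assumptions, not part of the original answer: the helper name resample_ci is hypothetical, the example accuracies are made up for illustration, and it assumes the per-resample metrics are available in rf_gridsearch$resample (which caret keeps for the final model by default). Resamples from repeated CV overlap, so the result should be treated as approximate.

```r
# Hypothetical helper: normal-approximation CI for the mean of
# per-resample metric values (e.g. accuracy or kappa).
resample_ci <- function(x, level = 0.95) {
  x  <- x[!is.na(x)]                    # drop resamples where the metric is NA
  n  <- length(x)
  se <- sd(x) / sqrt(n)                 # standard error of the mean
  z  <- qnorm(1 - (1 - level) / 2)      # about 1.96 for level = 0.95
  ci <- mean(x) + c(-1, 1) * z * se
  c(lower = ci[1], mean = mean(x), upper = ci[2])
}

# Made-up per-resample accuracies, for illustration only
acc <- c(0.75, 1.00, 0.67, 1.00, 0.50, 0.75, 1.00, 0.67, 1.00, 0.75)
resample_ci(acc)

# With the fitted object from the question (assumes the resamples were kept):
# resample_ci(rf_gridsearch$resample$Accuracy)
# resample_ci(rf_gridsearch$resample$Kappa)
```

Because this interval is built from a normal approximation, it is not constrained to [0, 1]; with only a handful of resamples, swapping qnorm for qt with n - 1 degrees of freedom would give a slightly wider interval.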
