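Problem description
I am using the R programming language. I am trying to follow the tutorial here: https://cran.r-project.org/web/packages/lime/vignettes/Understanding_lime.html
I tried to create my own data to replicate this tutorial: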
#load libraries
library(MASS)
library(lime)
library(randomForest)
#create data
var_1 <- rnorm(100, 1, 4)
var_2 <- rnorm(100, 10, 5)
var_3 <- c("0", "2", "4")
var_3 <- sample(var_3, 100, replace = TRUE, prob = c(0.3, 0.6, 0.1))
response <- c("1", "0")
response <- sample(response, 100, replace = TRUE, prob = c(0.3, 0.7))
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)
#declare var_3 and response as factors
f$var_3 <- as.factor(f$var_3)
f$response <- as.factor(f$response)
#run random forest on all the data except the first observation
model <- randomForest(response ~ ., data = f[-1, ], mtry = 2, ntree = 100)
model <- as_classifier(model, labels = NULL)
#run the "lime" procedure
explainer <- lime(f[-1, ], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
#visualize the results - here is the error:
plot_features(explanation, ncol = 1)
Error in if (nrow(explanation) == 0) stop("No explanations to plot",call. = FALSE) :
argument is of length zero
Can someone tell me what I am doing wrong? Is it because this procedure is not meant to be run on a single observation?
Thanks
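Update: if I change this line of code:
model <- randomForest(response ~ ., data = f[-1, ], mtry = 2, ntree = 100)
to
model <- randomForest(response ~ ., data = f, mtry = 2, ntree = 100)
the code now seems to run (this is not a big deal, since I can write f = f[-1,] and f_new = f[1,] before running this step), but the plot is not displayed completely. Is this a problem with my graphics console? (Note: the tutorial from the website runs perfectly.)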
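For reference, here is a minimal sketch of the split mentioned above, using base R only. The name f_train and the seed are my own additions for illustration, and the data are regenerated here just so the snippet runs on its own. Note that f_new has to be taken before the first row is dropped; otherwise f[1,] would already be the second original observation.

```r
# Toy data in the same shape as the data frame "f" from the question
set.seed(123)  # assumption: seed added only to make the sketch reproducible
f <- data.frame(
  var_1 = rnorm(100, 1, 4),
  var_2 = rnorm(100, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), 100, replace = TRUE,
                        prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), 100, replace = TRUE,
                           prob = c(0.3, 0.7)))
)

# Take the held-out observation first, then drop it from the training rows
f_new <- f[1, ]      # the single observation to explain
f_train <- f[-1, ]   # hypothetical name: the remaining 99 rows for fitting

dim(f_new)
dim(f_train)
```

Subsetting a data frame this way keeps the factor levels of var_3 and response, so the held-out row stays compatible with a model fitted on f_train.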
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] randomForest_4.6-14 lime_0.5.1 MASS_7.3-53
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 lubridate_1.7.9 lattice_0.20-41 class_7.3-17 assertthat_0.2.1
[6] glmnet_4.0-2 digest_0.6.25 ipred_0.9-9 foreach_1.5.1 mime_0.9
[11] R6_2.4.1 plyr_1.8.6 stats4_4.0.2 ggplot2_3.3.2 pillar_1.4.6
[16] rlang_0.4.7 caret_6.0-86 rstudioapi_0.11 data.table_1.12.8 rpart_4.1-15
[21] Matrix_1.2-18 shinythemes_1.1.2 labeling_0.3 splines_4.0.2 gower_0.2.2
[26] stringr_1.4.0 htmlwidgets_1.5.2 munsell_0.5.0 tinytex_0.26 shiny_1.5.0
[31] compiler_4.0.2 httpuv_1.5.4 xfun_0.15 pkgconfig_2.0.3 shape_1.4.5
[36] htmltools_0.5.0 nnet_7.3-14 tidyselect_1.1.0 tibble_3.0.3 prodlim_2019.11.13
[41] codetools_0.2-16 crayon_1.3.4 dplyr_1.0.2 withr_2.3.0 later_1.1.0.1
[46] recipes_0.1.13 ModelMetrics_1.2.2.2 grid_4.0.2 nlme_3.1-149 xtable_1.8-4
[51] gtable_0.3.0 lifecycle_0.2.0 magrittr_1.5 pROC_1.16.2 scales_1.1.1
[56] stringi_1.4.6 farver_2.0.3 reshape2_1.4.4 promises_1.1.1 timeDate_3043.102
[61] ellipsis_0.3.1 generics_0.0.2 vctrs_0.3.2 xgboost_1.1.1.1 lava_1.6.8
[66] iterators_1.0.13 tools_4.0.2 glue_1.4.1 purrr_0.3.4 fastmap_1.0.1
Solution
I may have gotten it to work. Based on the original code I was using, here is the plot:
#load libraries
library(MASS)
library(lime)
library(randomForest)
#create data
var_1 <- rnorm(100, 1, 4)
var_2 <- rnorm(100, 10, 5)
var_3 <- c("0", "2", "4")
var_3 <- sample(var_3, 100, replace = TRUE, prob = c(0.3, 0.6, 0.1))
response <- c("1", "0")
response <- sample(response, 100, replace = TRUE, prob = c(0.3, 0.7))
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)
#declare var_3 and response as factors
f$var_3 <- as.factor(f$var_3)
f$response <- as.factor(f$response)
#run random forest on all the data
model <- randomForest(response ~ ., data = f, mtry = 2, ntree = 100)
model <- as_classifier(model, labels = NULL)
#run the "lime" procedure
explainer <- lime(f[-1, ], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
#visualize the results
plot_features(explanation, ncol = 1)
#same setup as above, plotting only the first four cases
plot_features(explanation, cases = 1:4, ncol = 1)
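I do not understand what changed, but at least the plot is displayed now. Suppose I am only interested in the first observation. I am still confused about whether these lines should be:
explainer <- lime(f[-1, ], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
or
explainer <- lime(f, model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f, explainer, n_labels = 1, n_features = 4)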
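I am also not sure what the difference is between "Probability" and "Explanation Fit". I assume "Probability" is the probability produced by the random forest model, while "Explanation Fit" measures how well the local LIME model explains that prediction.
(If anyone knows about this, could you comment below? Thanks)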
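One way to check the two numbers is to look at the table returned by explain() rather than at the plot. The sketch below rests on my understanding that the values shown in the plot header come from the label_prob and model_r2 columns of that table; treat this as an assumption. The seed, the features vector and the choice to drop the response column before calling lime() are my own additions for illustration, not part of the code above.

```r
library(lime)
library(randomForest)

set.seed(123)  # assumption: seed added only to make the sketch reproducible
f <- data.frame(
  var_1 = rnorm(100, 1, 4),
  var_2 = rnorm(100, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), 100, replace = TRUE,
                        prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), 100, replace = TRUE,
                           prob = c(0.3, 0.7)))
)
features <- c("var_1", "var_2", "var_3")  # hypothetical helper: predictors only

# Fit on everything except the first observation
model <- randomForest(response ~ ., data = f[-1, ], ntree = 100)
model <- as_classifier(model)

# Build the explainer on the training predictors, explain the held-out row
explainer <- lime(f[-1, features], model,
                  bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[1, features], explainer,
                       n_labels = 1, n_features = 3)

# label_prob: probability the random forest assigns to the explained label
# model_r2:   R^2 of the simple local model that lime fits around this case
unique(explanation[, c("case", "label", "label_prob", "model_r2")])
```

If the assumption holds, a low model_r2 would mean the local linear approximation does not track the forest well near that observation, so the feature weights in the plot should be read with caution.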