R naive Bayes classifier: predict() fails with invalid subscript type 'list'

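Problem description

My coursework requires me to build a financial fraud detection model in R using Udacity's Enron financial data.

I wrote a function to run the model (split_train_set simply splits the data into a 70-30 training and test set).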

library(e1071)
library(caret)

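nb_runner <- function(dataset, rm.na = FALSE) {
  # Split into training and test sets (features in x_*, labels in y_*)
  split_df <- split_train_set(dataset, rm.na)
  # Fit naive Bayes on the training features against the poi label
  nb <- naiveBayes(x = split_df$x_train_set, y = split_df$y_train_set$poi)
  # Predict class labels for the held-out test features
  nb_predict <- predict(nb, newdata = split_df$x_test_set, type = 'class')
  # Compare predictions with the true labels, treating 'True' as the positive class
  cm <- confusionMatrix(nb_predict, split_df$y_test_set$poi, positive = 'True')
  return(cm)
}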

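It worked well at first. However, after I cleaned the data by removing rows with more than 15 NAs, using the code below, and re-ran the same nb_runner(), it failed with the error shown after the code.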

remove_high_na <- function(dataset, threshold = 0.7) {
  # The number of NAs per row ranges from 2 to 17.
  # With only 22 features, a row that is mostly NA carries little information,
  # so rows whose NA count exceeds floor(ncol(dataset) * threshold) are removed
  # (with 22 columns and threshold = 0.7, that is more than 15 NAs).
  threshold_cols <- floor(ncol(dataset) * threshold)
  df <- subset(dataset, rowSums(is.na(dataset)) <= threshold_cols)
  # df <- dataset[-which(rowSums(is.na(dataset)) > threshold_cols),]
  return(df)
}

Error in object$levels[apply(L,2,which.max)] : 
  invalid subscript type 'list' 
The code failed, and the traceback is as follows:
4. factor(object$levels[apply(L, 2, which.max)], levels = object$levels)
3. predict.naiveBayes(nb, newdata = split_df$x_test_set, type = "class")
2. predict(nb,type = "class") at POI_helpers.R#38
1. nb_runner(df_1)

I'm not sure what I did wrong, because the same dataset works fine with other classifiers. Thanks in advance for your help.
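
One way this error can arise, offered as a possibility rather than a confirmed diagnosis: the failing expression in the traceback indexes object$levels with the result of apply(L, 2, which.max). If every class score in some column of L is NA, which.max returns integer(0) for that column, apply can no longer simplify its result to a vector and returns a list, and subscripting with a list raises exactly this message. The base-R sketch below reproduces that mechanism on a made-up score matrix, then counts the non-NA training values per class, since a class with fewer than two observed values for a numeric feature has an undefined standard deviation. The matrix L, the toy data, and the helper non_na_counts are illustrative assumptions, not part of e1071 or of the original code.

```r
# Made-up class scores: rows are classes, columns are test rows.
# The second test row has NA scores for every class.
L <- matrix(c(-1.2, -0.3, NA, NA), nrow = 2)
class_levels <- c("False", "True")

idx <- apply(L, 2, which.max)   # which.max(c(NA, NA)) is integer(0)
idx_is_list <- is.list(idx)     # apply() cannot simplify results of unequal length

# Subscripting with a list fails; capture the message instead of stopping.
msg <- tryCatch(class_levels[idx], error = function(e) conditionMessage(e))

# Hypothetical helper: for each numeric feature, count the non-NA values
# available in each class of the training labels.
non_na_counts <- function(x, y) {
  numeric_cols <- x[sapply(x, is.numeric)]
  sapply(numeric_cols, function(col) tapply(!is.na(col), y, sum))
}

# Toy training data (not the Enron data): 'salary' is never observed for "True".
x_train <- data.frame(salary = c(100, 200, NA, NA), bonus = c(1, 2, 3, 4))
y_train <- factor(c("False", "False", "True", "True"))
counts <- non_na_counts(x_train, y_train)
print(counts)
```

With the data from the question, the same check would presumably be called as non_na_counts(split_df$x_train_set, split_df$y_train_set$poi). Any count below 2 points to a feature that may need imputing or dropping before fitting, though whether that is what happens here has not been verified.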
