Error in UseMethod("accuracy"): no applicable method for 'accuracy' applied to an object of class "c('double', 'numeric')"

Problem description

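I am trying to predict the price of used cars in R. I have finished all the preprocessing and split the data into training and test sets. I am using a regression tree here. When I try to compute the accuracy, I get this error: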

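library(rpart)
library(tidyverse)  # also attaches dplyr
library(dplyr)

# Sample of the training data; the dput() output is shown after the code
dput(head(train.df, 5))

# Fit a regression tree (method = "anova") on all predictors
reg_tree <- rpart(price ~ ., data = train.df, method = "anova",
                  minbucket = 1, maxdepth = 30, cp = 0.001)

# This call raises the error shown below
accuracy(predict(reg_tree, train.df), train.df$price)

# Output of dput(head(train.df, 5)), as posted: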
structure(list(price = 33990,year = 2018L,manufacturer = structure(1L,.Label = c("acura","alfa-romeo","aston-martin","audi","bmw","buick","cadillac","chevrolet","chrysler","datsun","dodge","ferrari","fiat","ford","gmc","harley-davidson","honda","hyundai","infiniti","jaguar","jeep","kia","land rover","lexus","lincoln","mazda","mercedes-benz","mercury","mini","mitsubishi","nissan","pontiac","porsche","ram","rover","saturn","subaru","tesla","toyota","volkswagen","volvo"),class = "factor"),condition = structure(4L,.Label = c("excellent","fair","good","like new","new","salvage"),cylinders = structure(6L,.Label = c("10 cylinders","12 cylinders","3 cylinders","4 cylinders","5 cylinders","6 cylinders","8 cylinders","other"),fuel = structure(3L,.Label = c("diesel","electric","gas","hybrid",odometer = 22267,title_status = structure(1L,.Label = c("clean","lien","missing","parts only","rebuilt",transmission = structure(1L,.Label = c("automatic","manual",drive = structure(2L,.Label = c("4wd","fwd","rwd"),size = structure(3L,.Label = c("compact","full-size","mid-size","sub-compact"),type = structure(4L,.Label = c("bus","convertible","coupe","hatchback","mini-van","offroad","other","pickup","sedan","SUV","truck","van","wagon"),paint_color = structure(10L,.Label = c("black","blue","brown","custom","green","grey","orange","purple","red","silver","white","yellow"),class = "factor")),row.names = 31113L,class = "data.frame")


Error in UseMethod("accuracy") : 
  no applicable method for 'accuracy' applied to an object of class "c('double','numeric')"

Can anyone help me?

Thanks in advance.

Solution

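I don't know which package your accuracy function comes from (possibly MLmetrics::Accuracy?), but the error stems from using the wrong type of metric. The message itself says that accuracy is an S3 generic with no method for a plain numeric vector. Accuracy is meant for classification problems, where the outcome is usually a factor that can only take certain values (a discrete variable). Because your outcome is numeric, you have fitted a regression model here: the price of a car varies continuously over a range. For regression problems, one of the most commonly used metrics for evaluating model performance is the root mean squared error (RMSE). An RMSE function is implemented in the caret package. Here is an example using the built-in dataset car.test.frame from the rpart package:
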
library(rpart)
library(caret)

# car.test.frame ships with rpart
data("car.test.frame")

# Hold out roughly 20% of the rows as a test set
ind <- createDataPartition(car.test.frame$Price, p = 0.8, list = FALSE)
train.df <- car.test.frame[ind, ]
test.df <- car.test.frame[-ind, ]

# Regression tree on all predictors
reg_tree <- rpart(Price ~ ., data = train.df, method = "anova",
                  minbucket = 1, maxdepth = 30, cp = 0.001)

# RMSE on the test set, in price units and as a percentage of the mean price
rmse <- RMSE(predict(reg_tree, test.df), test.df$Price)
rmse_perc <- rmse / mean(test.df$Price) * 100

The RMSE can be reported as a percentage of the mean car price. Because it is simple to compute, you can also implement your own rmse function:

rmse <- function (y_pred,y_true) 
{
  RMSE <- sqrt(mean((y_true - y_pred)^2))
  return(RMSE)
}
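
As a quick check of what such a function returns, here is a minimal sketch on made-up numbers. The vectors y_true and y_pred are hypothetical values chosen for illustration, not data from the question.

```r
# Hand-written RMSE, same definition as above
rmse <- function(y_pred, y_true) {
  sqrt(mean((y_true - y_pred)^2))
}

# Hypothetical prices and predictions (illustration only)
y_true <- c(10000, 12000, 15000, 20000)
y_pred <- c(11000, 11000, 16000, 18000)

# Errors are -1000, 1000, -1000, 2000, so the mean squared error is 1.75e6
err <- rmse(y_pred, y_true)           # about 1323, in price units
err_perc <- err / mean(y_true) * 100  # about 9.3% of the mean price

print(err)
print(err_perc)
```

The percentage form is handy when comparing models across datasets with different price levels, since the raw RMSE is expressed in the units of the outcome.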

Note, however, that the rmse function above is equivalent to the RMSE function in the caret package.
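
To make the earlier distinction between classification and regression metrics concrete, the following base R sketch compares the two on toy data. All vectors here are hypothetical, and accuracy is computed by hand as the share of exact matches rather than with any particular package's accuracy function.

```r
# Classification: the outcome is a factor, so the share of exact matches is meaningful
truth_class <- factor(c("clean", "lien", "clean", "rebuilt"))
pred_class <- factor(c("clean", "clean", "clean", "rebuilt"),
                     levels = levels(truth_class))
acc <- mean(pred_class == truth_class)  # 3 of 4 labels match

# Regression: the outcome is numeric, so predictions rarely hit the true value exactly
truth_price <- c(10000, 12000, 15000, 20000)
pred_price <- c(10250.5, 11800.2, 15400.9, 19100.0)
exact_match <- mean(pred_price == truth_price)  # no exact matches here

# A distance-based metric such as RMSE measures how far off the predictions are
rmse_price <- sqrt(mean((truth_price - pred_price)^2))

print(acc)
print(exact_match)
print(rmse_price)
```

Counting exact matches gives 0 for the numeric predictions even though they are all fairly close, which is why a distance-based metric is the more informative choice for a price model.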