Problem description
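Hello, I would like to develop a multiple linear regression equation for my dataset. I have 2 continuous independent variables and 24 data points (each with a result for the dependent variable). I have already created a regression equation with the lm(y ~ x1 + x2, data = df) function, but I would like to use machine learning and split the data into training and test sets. I then want to use the best model I obtain to predict PM concentration. When I look at websites, I always find examples with categorical variables, but I have not seen any that use continuous variables.
How can I find a good example, and what path should I follow?
Thanks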
Edit: Here is one of the things I tried.
> library(caret)
> dim(df)
[1] 24 3
> head(df)
# A tibble: 6 x 3
     PM   SO2 Wind_speed
  <dbl> <dbl>      <dbl>
1 108.   7.57        0.9
2 115.   9.45        0.9
3  74.5 13.4         1.2
4  77.2  7.73        1.3
5  57.0  5.08        1.3
6  52.6  8.59        1.2
> summary(df)
       PM              SO2           Wind_speed
 Min.   : 52.64   Min.   : 4.090   Min.   :0.600
 1st Qu.: 76.84   1st Qu.: 7.397   1st Qu.:0.800
 Median :105.69   Median : 9.265   Median :1.000
 Mean   :118.62   Mean   :17.089   Mean   :1.004
 3rd Qu.:158.02   3rd Qu.:15.070   3rd Qu.:1.200
 Max.   :261.84   Max.   :75.270   Max.   :1.500
> control <- trainControl(method="cv",number=10)
> summary(control)
                  Length Class  Mode
method            1      -none- character
number            1      -none- numeric
repeats           1      -none- logical
search            1      -none- character
p                 1      -none- numeric
initialWindow     0      -none- NULL
horizon           1      -none- numeric
fixedWindow       1      -none- logical
skip              1      -none- numeric
verboseIter       1      -none- logical
returnData        1      -none- logical
returnResamp      1      -none- character
savePredictions   1      -none- logical
classProbs        1      -none- logical
summaryFunction   1      -none- function
selectionFunction 1      -none- character
preProcOptions    6      -none- list
sampling          0      -none- NULL
index             0      -none- NULL
indexOut          0      -none- NULL
indexFinal        0      -none- NULL
timingSamps       1      -none- numeric
predictionBounds  2      -none- logical
seeds             1      -none- logical
adaptive          4      -none- list
trim              1      -none- logical
allowParallel     1      -none- logical
> metric <- "Accuracy"
> set.seed(7)
> fit.lda <- train(PM~.,data=df,method="lda",metric=metric,trControl=control)
Error: wrong model type for regression
> set.seed(7)
> fit.cart <- train(PM~.,data= df,method="rpart",trControl=control)
Error: Metric Accuracy not applicable for regression models
Solution
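Both error messages in the attempt point at the outcome type: PM is numeric, so caret treats the task as regression, whereas "lda" is a classification method and "Accuracy" is a classification metric. A minimal sketch of a regression-style call follows, assuming method = "lm" and metric = "RMSE". The data frame here is a synthetic stand-in that only mimics the column names and rough ranges from the question; it is not the asker's data, and the coefficients used to generate it are made up.

```r
# Sketch only: `df` is synthetic stand-in data, not the measurements from the question.
library(caret)

set.seed(7)
n  <- 24
df <- data.frame(
  SO2        = runif(n, 4, 75),     # range loosely based on summary(df)
  Wind_speed = runif(n, 0.6, 1.5)
)
# Hypothetical relationship, chosen only so that the example has a signal to fit
df$PM <- 150 + 1.0 * df$SO2 - 60 * df$Wind_speed + rnorm(n, sd = 15)

# 10-fold cross-validation, as in the original attempt
control <- trainControl(method = "cv", number = 10)

# "lm" is a regression method and "RMSE" is a regression metric,
# so both are compatible with a numeric outcome such as PM
set.seed(7)
fit.lm <- train(PM ~ ., data = df, method = "lm",
                metric = "RMSE", trControl = control)

# Cross-validated error estimates for the linear model
print(fit.lm$results)
```

With only 24 rows, each of the 10 folds holds out two or three rows, so the per-fold estimates are likely to be noisy. Repeated cross-validation (method = "repeatedcv" in trainControl) is one option worth considering.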
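For the train/test split described in the question, a plain hold-out split needs nothing beyond base R. The sketch below is an illustration under stated assumptions: the data are synthetic stand-ins that reuse the question's column names, and the 18/6 split is an arbitrary choice rather than a recommendation.

```r
# Sketch only: synthetic stand-in data with the same column names as the question's df
set.seed(7)
n  <- 24
df <- data.frame(
  SO2        = runif(n, 4, 75),
  Wind_speed = runif(n, 0.6, 1.5)
)
# Hypothetical relationship used only to generate example values
df$PM <- 150 + 1.0 * df$SO2 - 60 * df$Wind_speed + rnorm(n, sd = 15)

# Hold out 6 of the 24 rows (25%) as a test set
test_idx <- sample(seq_len(n), size = 6)
train_df <- df[-test_idx, ]
test_df  <- df[test_idx, ]

# Fit the linear model on the training rows only
fit <- lm(PM ~ SO2 + Wind_speed, data = train_df)

# Predict PM for the held-out rows and measure the error
pred <- predict(fit, newdata = test_df)
rmse <- sqrt(mean((test_df$PM - pred)^2))
print(rmse)
```

Because both predictors are continuous, no dummy coding or factor handling is involved; the formula is the same one used in the original lm() call, applied to the training subset.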