Number of selected terms in multivariate adaptive regression splines

Problem description

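I am regressing soil bulk density on 6 predictors. I tried multivariate adaptive regression splines (MARS) through the caret package. The output says the final values for the optimal model are nprune = 8 and degree = 1. However, when I extract the model coefficients, only 7 terms (including the intercept) have been selected. Can anyone explain how the two MARS tuning parameters lead to this result? The final values of nprune and degree do not seem to match the number of selected terms and the interaction complexity that R reports. The code and results are shown below: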

library(caret)
model.bulk <- train(BD ~ ., data = data.bulk, trControl = trainControl(method = "repeatedcv", number = 10, repeats = 10), method = "earth", metric = "RMSE")

The final values for the optimal model are nprune = 8 and degree = 1:

Multivariate Adaptive Regression Spline 

86 samples
 6 predictor

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times) 
Summary of sample sizes: 77, 78, 76, 77, ... 
Resampling results across tuning parameters:

  nprune  RMSE       Rsquared   MAE       
   2      0.1236698  0.3078943  0.10330518
   8      0.1080978  0.4794419  0.08786858
  14      0.1087380  0.4707099  0.08853226

Tuning parameter 'degree' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nprune = 8 and degree = 1.
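
The line "Tuning parameter 'degree' was held constant at a value of 1" suggests that only nprune was varied in this run, so degree = 1 is the value that was passed through rather than a tuned choice. If both parameters should be tuned, one option is to pass an explicit grid through tuneGrid. The following is a minimal sketch, not the setup used above: it assumes the caret and earth packages are installed, and it uses the ozone1 data set that ships with earth as a stand-in for data.bulk. The grid values and the 5-fold CV are arbitrary illustrative choices.

```r
library(caret)
library(earth)

data(ozone1)   # example data from the earth package (stand-in for data.bulk)
set.seed(1)    # make the CV folds reproducible

# Illustrative grid: vary both tuning parameters instead of only nprune
grid <- expand.grid(degree = 1:2, nprune = c(2, 6, 10, 14))

fit <- train(O3 ~ ., data = ozone1,
             method    = "earth",
             metric    = "RMSE",
             tuneGrid  = grid,
             trControl = trainControl(method = "cv", number = 5))

print(fit$bestTune)   # the (nprune, degree) pair with the lowest CV RMSE
```

With a grid like this, the summary should report a resampled RMSE for every combination of degree and nprune, and the "held constant" message for degree should no longer appear.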

Yet the final model has only 7 selected terms?

Call: earth(x=matrix[86,6], y=c(1.405,1.596,1...), keepxy=TRUE, degree=1, nprune=8)

              coefficients
(Intercept)     1.20609922
h(1.7-OC)       0.09059255
h(2.50917-Iw)  -0.08033033
h(SAND-43.2)    0.00483245
h(CLAY-5.6)     0.17138133
h(CLAY-6.71)   -0.17448152
h(SSQ-2.5)      0.07798563

Selected 7 of 16 terms, and 5 of 6 predictors
Termination condition: Reached nk 21
Importance: Iw, CLAY, SSQ, SAND, OC, D-unused
Number of terms at each degree of interaction: 1 6 (additive model)
GCV 0.01027685    RSS 0.6368061    GRSq 0.4947044    RSq 0.6273052

Thank you very much!
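
One reading that would reconcile the numbers: the earth documentation describes nprune as the maximum number of terms (including the intercept) in the pruned model, not an exact count. Under that reading, the pruning pass may keep fewer terms when a smaller subset has a better GCV, so "Selected 7 of 16 terms" is compatible with nprune = 8. Likewise, "Number of terms at each degree of interaction: 1 6 (additive model)" reads as 1 intercept plus 6 degree-1 terms, which agrees with degree = 1. The sketch below shows how the counts can be inspected on a fitted earth object. It assumes the earth package is installed and uses its bundled ozone1 data purely as a stand-in for the soil data, so the counts it prints will differ from those above.

```r
library(earth)

data(ozone1)   # example data from the earth package (stand-in for data.bulk)

# Same settings as the final model above: additive model, at most 8 terms
fit <- earth(O3 ~ ., data = ozone1, degree = 1, nprune = 8)

n_forward  <- nrow(fit$dirs)              # terms created by the forward pass
n_selected <- length(fit$selected.terms)  # terms kept after pruning (incl. intercept)

cat("terms after forward pass:", n_forward, "\n")
cat("terms kept after pruning:", n_selected, "(cap from nprune: 8)\n")

# The pruned model never exceeds the cap, but it is allowed to be smaller
stopifnot(n_selected <= 8)
```

For a caret fit such as model.bulk, the underlying earth object should be available as model.bulk$finalModel, so the same two counts can be read from it.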
