Extracting coefficients from a glmnet model returns NULL or a 'type must be either "raw" or "prob"' error

Problem description

I am fitting a glmnet model with caret on the built-in infert dataset, e.g.

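library(caret)     # train(), trainControl(), createFolds(), twoClassSummary
library(magrittr)  # provides %>%
infert_y <- factor(infert$case) %>% plyr::revalue(c("0" = "control", "1" = "case"))
infert_x <- subset(infert, select = -case)
new.x <- model.matrix(~ ., infert_x)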
    
# Create cross-validation folds:
myFolds <- createFolds(infert_y,k = 10)


# Create reusable trainControl object:
myControl_categorical <- trainControl(
  summaryFunction = twoClassSummary, classProbs = TRUE, # IMPORTANT!
  verboseIter = TRUE, savePredictions = TRUE, index = myFolds
)


model_glmnet_pca <- train(
  x = new.x, y = infert_y,
  metric = "ROC", method = "glmnet",
  preProcess = c("zv", "nzv", "medianImpute", "center", "scale", "pca"),
  trControl = myControl_categorical,
  tuneGrid = expand.grid(alpha = seq(0, 1, length = 20), lambda = seq(0.0001, 1, length = 100))
)

But when I try to get the coefficients:

bestlambda <- model_glmnet_pca$results$lambda[model_glmnet_pca$results$ROC == max(model_glmnet_pca$results$ROC)]

coef(model_glmnet_pca, s = bestlambda)

it returns:

NULL

I also tried:

coef.glmnet(model_glmnet_pca, s = bestlambda)

which returns:

Error in predict.train(object,s = s,type = "coefficients",exact = exact,: 
  type must be either "raw" or "prob"

But surely my "type" argument is already set to "coefficients" when I call coef()? If I try

coef.glmnet(model_glmnet_pca, s = bestlambda, type = "prob")

it returns:

Error in predict.train(object,: 
  formal argument "type" matched by multiple actual arguments

I'm confused. Can anyone point out what I'm doing wrong?

Solution

To get the coefficients from the best model, you can use:

coef(model_glmnet_pca$finalModel, model_glmnet_pca$finalModel$lambdaOpt)
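
As for why this works: train() returns a wrapper object of class "train", and the fitted glmnet object sits in its finalModel element. coef() has no method for the wrapper, so it falls back to the default method, which looks for a "coefficients" element, finds none, and returns NULL. coef.glmnet() in turn calls predict(object, type = "coefficients", ...), which dispatches to predict.train() and is rejected there, as the error message in the question shows. The sketch below imitates that dispatch in base R with hypothetical toy classes (toyfit, toywrap) and a hypothetical helper (coef_toyfit); it is not caret's or glmnet's actual code.

```r
# Toy stand-ins (hypothetical names, for illustration only).
inner_fit <- structure(list(beta = c(a = 1.5, b = -0.5)), class = "toyfit")
wrapper   <- structure(list(finalModel = inner_fit), class = "toywrap")

# predict method for the inner fit: knows how to return coefficients.
predict.toyfit <- function(object, type = "coefficients", ...) {
  if (type == "coefficients") object$beta else stop("unsupported type")
}

# predict method for the wrapper: accepts only "raw" or "prob".
predict.toywrap <- function(object, type = "raw", ...) {
  if (!type %in% c("raw", "prob")) stop("type must be either \"raw\" or \"prob\"")
  "predictions"
}

# Written with the inner class in mind, but it goes through the predict()
# generic, so the class of `object` decides which method actually runs.
coef_toyfit <- function(object, ...) predict(object, type = "coefficients", ...)

# 1. No coef method for the wrapper: coef.default reads wrapper$coefficients.
res_default <- coef(wrapper)

# 2. Wrapper passed to the inner-class helper: the wrapper's predict rejects it.
res_wrapper <- tryCatch(coef_toyfit(wrapper), error = function(e) conditionMessage(e))

# 3. Adding type = "prob" means `type` is now supplied twice.
res_dup <- tryCatch(coef_toyfit(wrapper, type = "prob"), error = function(e) conditionMessage(e))

# 4. Passing the inner fit directly works.
res_inner <- coef_toyfit(wrapper$finalModel)
```

Cases 1 to 3 correspond to the three attempts in the question; case 4 corresponds to passing model_glmnet_pca$finalModel to coef().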

See, for example, this link on using regularised regression models with caret
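
A minimal end-to-end sketch of the same idea follows. It assumes the caret and glmnet packages are installed, and it reads the selected penalty from the model's bestTune element, which holds a single row even if several grid points tie on ROC. The seed, the 5-fold cross-validation, and the small tuning grid are assumptions chosen to keep the run short; they are not the settings from the question. Because preProcess includes "pca", the returned coefficients refer to principal components rather than to the original infert variables.

```r
library(caret)   # train(), trainControl(), twoClassSummary
library(glmnet)  # fitting engine behind method = "glmnet"

set.seed(42)  # arbitrary seed (assumption), only for reproducible folds

infert_y <- factor(infert$case, levels = c(0, 1), labels = c("control", "case"))
# Drop the intercept column; glmnet adds its own intercept.
infert_x <- model.matrix(~ ., subset(infert, select = -case))[, -1]

ctrl <- trainControl(
  method = "cv", number = 5,  # assumption: plain 5-fold CV
  summaryFunction = twoClassSummary, classProbs = TRUE
)

fit <- train(
  x = infert_x, y = infert_y,
  metric = "ROC", method = "glmnet",
  preProcess = c("center", "scale", "pca"),
  trControl = ctrl,
  # Small illustrative grid (assumption), not the grid from the question.
  tuneGrid = expand.grid(alpha = c(0, 0.5, 1), lambda = c(0.001, 0.01, 0.1))
)

# The tuned penalty chosen by caret, and the coefficients of the
# underlying glmnet fit at that penalty.
best_lambda <- fit$bestTune$lambda
coefs <- coef(fit$finalModel, s = best_lambda)
print(coefs)
```

Here fit$bestTune$lambda and fit$finalModel$lambdaOpt should hold the same value, so either can be passed as s.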