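Lasso regression problem: lambda and confusion matrix

Problem description

I am trying to run a lasso regression on employee turnover, following the code I found at this link: https://www.kaggle.com/acasalan/predict-bank-turnover-lasso-regression

In doing so, two things in my results seem strange to me:

  1. lambda.min and lambda.1se are equal;
  2. the confusion matrix shows no positives.

The code is as follows: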

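# Packages assumed by the code below (not shown in the original post)
library(magrittr) # %>%
library(caret)    # createDataPartition()
library(glmnet)   # glmnet(), cv.glmnet()

# Split the data into training and test sets
set.seed(123) # note to self: look up the meaning of this value
# Randomly split the data: 70% for building the model, 30% for evaluating it
training.samples <- Dati3$Dimissioni %>%
  createDataPartition(p = 0.7, list = FALSE)
train.data <- Dati3[training.samples, ]
test.data <- Dati3[-training.samples, ] # not defined in the post, but used below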

x <- model.matrix(Dimissioni~.,train.data)[,-1]
# Outcome (class) variable, used as stored in the data (no numeric conversion is applied here)
y <- train.data$Dimissioni
# glmnet() from the glmnet package fits penalized logistic regression

glmnet(x,y,family = "binomial",alpha = 1,lambda = NULL)

# Find the best lambda using cross-validation
set.seed(123)
cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial") # y was missing from the posted call
# In the plot, the left dashed vertical line marks log(lambda.min), approximately -5 here,
# which is the value that minimizes the cross-validated prediction error
plot(cv.lasso)

cv.lasso$lambda.min # lambda that minimizes the cross-validated error
cv.lasso$lambda.1se # largest lambda whose error is within one standard error of the minimum (simplest such model)
# Both return the same value: 0.008018156
# Using lambda.min as the best lambda gives the following regression coefficients
coef(cv.lasso, s = cv.lasso$lambda.min)

# Final model with lambda.min (lambda.1se gives the same model here)
# y, alpha and family were missing from the posted call
lasso.model2 <- glmnet(x, y, alpha = 1, family = "binomial", lambda = cv.lasso$lambda.min)

# Make predictions on the test data
x.test <- model.matrix(Dimissioni ~ ., test.data)[, -1]
# No type argument is given, so predict() returns glmnet's default scale (type = "link")
probabilities2 <- lasso.model2 %>% predict(newx = x.test)
predicted.classes2 <- ifelse(probabilities2 > 0.5, "pos", "neg")

# Model accuracy
observed.classes2 <- test.data$Dimissioni
mean(predicted.classes2 == observed.classes2)

# Confusion matrix (rows: predicted, columns: observed)
second <- table(predicted.classes2, observed.classes2)
second

# Precision: share of predicted positives (row 2) that are observed positives (column 2)
round(second[2, 2] / (second[2, 2] + second[2, 1]), 4)
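
On the first symptom, it may help to recall how the two values are chosen. Below is a minimal base R sketch of the one-standard-error rule, using made-up cross-validation numbers. The vectors `lambda`, `cvm` and `cvsd` are hypothetical stand-ins for the components of the same names in a `cv.glmnet` result, and `one_se_rule` is an illustrative helper, not a glmnet function.

```r
# Hypothetical helper: pick lambda.min and lambda.1se from CV summaries.
one_se_rule <- function(lambda, cvm, cvsd) {
  i_min <- which.min(cvm)                 # index of the smallest mean CV error
  threshold <- cvm[i_min] + cvsd[i_min]   # minimum error plus one standard error
  list(
    lambda.min = lambda[i_min],
    lambda.1se = max(lambda[cvm <= threshold])  # largest lambda still under the threshold
  )
}

# Made-up values, not taken from the question's data
lambda <- c(0.100, 0.050, 0.020, 0.008, 0.004)
cvm    <- c(1.30, 1.10, 0.95, 0.80, 0.85)

# Small standard errors: no larger lambda is within one SE, so both values coincide
narrow <- one_se_rule(lambda, cvm, cvsd = rep(0.03, 5))

# Larger standard errors: a larger lambda qualifies, so lambda.1se moves up
wide <- one_se_rule(lambda, cvm, cvsd = rep(0.20, 5))
```

In this sketch the two values coincide whenever the error at the next larger lambda is already more than one standard error above the minimum, so identical values are not necessarily a sign of a bug. Inspecting `cv.lasso$cvm` and `cv.lasso$cvsd` would show whether that is what happens here.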

These are the confusion matrix results:

[Image: confusion matrix output, not included]

Thank you for your help.
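
A possible explanation for the missing positives, offered as a sketch rather than a confirmed diagnosis: for `family = "binomial"`, `predict()` on a glmnet model returns log-odds by default (`type = "link"`), so the `> 0.5` cutoff is applied to log-odds rather than to probabilities, while `type = "response"` returns probabilities. The base R example below uses made-up log-odds values to show the difference. It also builds the confusion matrix from factors with fixed levels, so the table stays 2 x 2 even when one class is never predicted. The "neg"/"pos" labels are an assumption, since the actual levels of `Dimissioni` are not shown in the question.

```r
# Made-up linear-predictor values (log-odds), standing in for predict(..., type = "link")
log_odds <- c(-2.0, -0.5, 0.2, 0.4, 1.5)
probs <- plogis(log_odds)  # inverse logit, the scale that type = "response" returns

# The same 0.5 cutoff applied on each scale
by_log_odds <- ifelse(log_odds > 0.5, "pos", "neg")  # requires log-odds above 0.5
by_prob     <- ifelse(probs > 0.5, "pos", "neg")     # equivalent to log-odds above 0

# Confusion matrices with fixed levels (assumed labels)
lvls <- c("neg", "pos")
observed <- factor(c("neg", "neg", "pos", "pos", "pos"), levels = lvls)
cm_link <- table(predicted = factor(by_log_odds, levels = lvls), observed = observed)
cm_prob <- table(predicted = factor(by_prob, levels = lvls), observed = observed)

# Even if nothing is predicted positive, the table keeps an all-zero "pos" row
cm_none <- table(predicted = factor(rep("neg", 5), levels = lvls), observed = observed)

# Precision: predicted positives that are observed positives
precision <- function(cm) cm["pos", "pos"] / sum(cm["pos", ])
```

With the objects from the question, the analogous call would be `predict(lasso.model2, newx = x.test, type = "response")`. The comparison `predicted.classes2 == observed.classes2` is also only meaningful if `Dimissioni` uses the same labels as the predicted classes.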
