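Problem description
I am trying to run a lasso regression on employee turnover, following the code I found at this link: https://www.kaggle.com/acasalan/predict-bank-turnover-lasso-regression
Two things in my results seem odd to me:
- lambda.min and lambda.1se are equal;
- the confusion matrix shows no positive predictions.
The code is as follows: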
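library(tidyverse) # provides the %>% pipe
library(caret)     # createDataPartition()
library(glmnet)    # glmnet(), cv.glmnet()
# Split the data into training and test set
set.seed(123) # fixed seed for reproducibility (note to self: look up the meaning of this value)
# Randomly split the data: 70% training set (for building the predictive model), 30% test set (for evaluating it)
training.samples <- Dati3$Dimissioni %>%
  createDataPartition(p = 0.7, list = FALSE)
train.data <- Dati3[training.samples, ]
test.data <- Dati3[-training.samples, ]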
x <- model.matrix(Dimissioni~.,train.data)[,-1]
# Outcome (class) variable; it is used as-is here, without conversion to a numerical variable
y <- train.data$Dimissioni
# glmnet() [glmnet package] computes the penalized logistic regression (alpha = 1 is the lasso penalty)
glmnet(x,y,family = "binomial",alpha = 1,lambda = NULL)
# Find the best lambda using cross-validation
set.seed(123)
cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial")
# The left dashed vertical line marks the log of the optimal lambda (approximately -5), the value that minimizes the prediction error
plot(cv.lasso)
cv.lasso$lambda.min # exact value of lambda
cv.lasso$lambda.1se # value of lambda that gives the simplest model but also lies within one standard error of the optimal value of lambda
# Both give the same value: 0.008018156
# Using lambda.min as the best lambda gives the following regression coefficients
coef(cv.lasso,cv.lasso$lambda.min)
# Final model with lambda.min (the same will be with lambda.1se)
lasso.model2 <- glmnet(x, y, alpha = 1, family = "binomial", lambda = cv.lasso$lambda.min)
# Make prediction on test data
x.test <- model.matrix(Dimissioni ~.,test.data)[,-1]
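# type = "response" returns probabilities; the predict() default for glmnet is the link scale (log-odds)
probabilities2 <- lasso.model2 %>% predict(newx = x.test, type = "response")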
predicted.classes2 <- ifelse(probabilities2 > 0.5,"pos","neg")
# Model accuracy
observed.classes2 <- test.data$Dimissioni
mean(predicted.classes2 == observed.classes2)
#confusion matrix
table(predicted.classes2,observed.classes2)
second <- table(predicted.classes2,observed.classes2)
# Precision: share of predicted employee turnover cases that are correct
round(second[2,2]/ (second[2,2]+second[2,1]),4)
These are the confusion matrix results:
Thank you for your help.