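Problem description
I am trying to use conditional inference trees in R to obtain a kind of counterfactual distribution based on the types/splits predicted by ctree.
I am using the following code: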
# First, try ctree on a single country (Italy)
de_fact <- subset(ess, cntry == "IT")
# Keep only the variables needed
xvars <- c("fisei", "misei", "edu_father", "edu_mother", "gender", "emprf14", "emprm14")
yvar <- "isei_respondent"
# Drop rows with a missing response; the predictors may still contain NAs
de_fact <- de_fact[!is.na(de_fact$isei_respondent), c(yvar, xvars)]
# Split the data into training (about 70%) and test (about 30%) sets
set.seed(123)
ind <- sample(2, nrow(de_fact), replace = TRUE, prob = c(0.7, 0.3))
train <- de_fact[ind == 1, ]
test <- de_fact[ind == 2, ]
A summary of the data frame is as follows:
 isei_respondent     fisei           misei              edu_father
 Min.   :16.00   Min.   :16.00   Min.   :16.00   <= Primary     :800
 1st Qu.:30.00   1st Qu.:26.00   1st Qu.:23.00   Lower II       :315
 Median :40.00   Median :36.00   Median :39.00   Upper II       :173
 Mean   :41.86   Mean   :37.67   Mean   :39.44   Post-II non-III:  0
 3rd Qu.:52.00   3rd Qu.:47.50   3rd Qu.:49.00   Tertiary       : 81
 Max.   :90.00   Max.   :88.00   Max.   :80.00   NA's           : 34
                 NA's   :177     NA's   :959

           edu_mother     gender            emprf14
 <= Primary     :926   Female:645   Employee     :857
 Lower II       :272   Male  :758   Self-employed:484
 Upper II       :148                Not work     :  9
 Post-II non-III:  0                Dead/Absent  : 21
 Tertiary       : 26                NA's         : 32
 NA's           : 31

          emprm14
 Employee     :297
 Self-employed:186
 Not work     :867
 Dead/Absent  : 21
 NA's         : 32
I fit the ctree on the training data and predict on the test data as follows:
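library(partykit)  # provides ctree(), ctree_control(), node_party(), info_node()
# Named `fit` rather than `try` to avoid masking base::try()
fit <- ctree(isei_respondent ~ ., data = train, control = ctree_control(maxsurrogate = 3, mincriterion = 0.99))
fit
info_node(node_party(fit))
predict(fit, newdata = test)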
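However, for country IT the number of predictions does not match the number of rows in the test data. Specifically, for a test set of 405 observations I get only 108 predictions. Any idea what I am doing wrong and what causes this mismatch?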
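Given how many missing values the summary shows (959 of the 1403 rows lack misei), one thing that may be worth checking is whether the 108 predictions correspond to the test rows that have no missing predictors. The sketch below uses only base R and a small made-up data frame (toy_test and its values are hypothetical, not the ESS data). It shows that model.frame() with the default na.action silently drops incomplete rows, whereas na.action = na.pass keeps them. Whether predict() on the fitted tree goes through such a step depends on the partykit version, so this is a hypothesis to test, not a confirmed diagnosis.

```r
# Toy stand-in for the test set (hypothetical values, not the ESS data)
toy_test <- data.frame(
  fisei  = c(30, NA, 45, 52, 38),
  misei  = c(NA, NA, 39, NA, 41),
  gender = factor(c("Female", "Male", "Male", "Female", "Male"))
)

# Number of rows with no missing predictor values
n_complete <- sum(complete.cases(toy_test))

# Default na.action (na.omit unless options("na.action") was changed):
# rows containing any NA are dropped
mf_default <- model.frame(~ fisei + misei + gender, data = toy_test)

# na.pass keeps every row, NAs included
mf_pass <- model.frame(~ fisei + misei + gender, data = toy_test,
                       na.action = na.pass)

# Compare the row counts side by side
c(rows = nrow(toy_test), complete = n_complete,
  default = nrow(mf_default), pass = nrow(mf_pass))
```

On the real data, comparing sum(complete.cases(test[, xvars])) with 108 would show whether dropped incomplete rows account for the gap.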
Thanks for the support!