How do I recode an ordinal variable?

Problem description

I am working with survey data from the World Values Survey, and I used the code below to convert my variable from numeric to an ordered factor:

renameddata$Education= ordered(renameddata$Education,levels =c(-2,-1,840001,840002,840003,840004,840005,840006,840007,840008,840009),labels = c("NA","NA","LessHighSchool","SomeHighSchool","GED","SomeCollege","Associates","Bachelors","Masters","Professional","Doctorate"))

Now, however, I want to recode the Education variable so that LessHighSchool and SomeHighSchool are merged into one level, e.g. "NO GED", and SomeCollege, Associates, and Bachelors become "Undergraduate", and so on.

Solution

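How about this (applied to the original numeric codes, i.e. before the column is converted to an ordered factor):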

library(dplyr)

# Map the raw WVS codes to the collapsed groups, then convert to a factor
renameddata <- renameddata %>%
  mutate(
    Education = case_when(
      Education %in% c(840001, 840002) ~ "No GED",
      Education == 840003 ~ "GED",
      Education %in% c(840004, 840005, 840006) ~ "Undergraduate",
      Education %in% c(840007, 840008, 840009) ~ "Graduate",
      TRUE ~ NA_character_  # everything else, including -2 and -1, becomes missing
    ),
    Education = factor(Education, levels = c("No GED", "GED", "Undergraduate", "Graduate"))
  )
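The same grouping can also be sketched in base R with a named lookup vector. This is only an illustration, not part of the original answer: the names `codes`, `lookup`, `grouped`, and `education` are made up for the example, and `ordered = TRUE` is added on the assumption that the result should stay ordinal.

```r
# Toy input: one value per WVS education code used in the question
codes <- c(-2L, -1L, 840001:840009)

# Hypothetical lookup: names are the raw codes, values are the collapsed groups
lookup <- c(
  "840001" = "No GED",        "840002" = "No GED",
  "840003" = "GED",
  "840004" = "Undergraduate", "840005" = "Undergraduate", "840006" = "Undergraduate",
  "840007" = "Graduate",      "840008" = "Graduate",      "840009" = "Graduate"
)

# Codes that are not in the lookup (-2 and -1) come back as NA
grouped <- unname(lookup[as.character(codes)])

# Build an ordered factor so that comparisons such as < and > are meaningful
education <- factor(
  grouped,
  levels  = c("No GED", "GED", "Undergraduate", "Graduate"),
  ordered = TRUE
)

print(table(education, useNA = "ifany"))
```

A lookup vector keeps the mapping in one place, which can be easier to maintain than a long `case_when()` when there are many codes.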


Alternatively, if you want to recode the factor variable you have already created, you can use fct_collapse from the forcats package.

Input:

renameddata <- data.frame(Education = c(-2, -1, 840001:840009))

renameddata$Education = ordered(renameddata$Education, levels = c(-2, -1, 840001:840009), labels = c("NA","NA","LessHighSchool","SomeHighSchool","GED","SomeCollege","Associates","Bachelors","Masters","Professional","Doctorate"))

Recode:

library(forcats)
renameddata$Education <- fct_collapse(renameddata$Education, "NO GED" = c("LessHighSchool","SomeHighSchool"), "Undergraduate" = c("SomeCollege","Associates","Bachelors"))

Gives:

       Education
1             NA
2             NA
3         NO GED
4         NO GED
5            GED
6  Undergraduate
7  Undergraduate
8  Undergraduate
9        Masters
10  Professional
11     Doctorate
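One thing worth checking in the output above: the first two rows show a level that is literally named "NA", which is not the same as a missing value. The sketch below is base R only. It rebuilds a factor shaped like the printed result (the name `collapsed` is made up for the example) and shows one way to turn that level into real `NA`s while keeping the factor ordered.

```r
# Rebuild a factor that looks like the collapsed result shown above
collapsed <- factor(
  c("NA", "NA", "NO GED", "NO GED", "GED",
    "Undergraduate", "Undergraduate", "Undergraduate",
    "Masters", "Professional", "Doctorate"),
  levels  = c("NA", "NO GED", "GED", "Undergraduate",
              "Masters", "Professional", "Doctorate"),
  ordered = TRUE
)

# "NA" is an ordinary level name here, so nothing counts as missing yet
n_missing_before <- sum(is.na(collapsed))

# Renaming that level to NA turns the matching entries into real missing values
levels(collapsed)[levels(collapsed) == "NA"] <- NA
n_missing_after <- sum(is.na(collapsed))

# The factor is still ordered, so comparisons between levels keep working
still_ordered <- is.ordered(collapsed)
no_ged_below_ged <- collapsed[3] < collapsed[5]

print(levels(collapsed))
```

Whether the -2 and -1 codes should be treated as missing depends on the analysis, so this step is optional.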