Is there a function in R to change several similar factor levels at once?

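Question

This is only my second question on Stack Overflow, so all tips are welcome :)

For a clinical study I have to recode many dichotomous baseline characteristics that contain several variations of "yes" and "no".

Currently I am recoding these variables one by one, but that takes many lines of code, and the recoding is very similar across all the variables. Values that are unknown or not applicable should be recoded to 0.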

Example

library(dplyr)

A <- c("Yes","y","no","n","UK")
B <- c("yes","Yes","no")
C <- c("Y","uk")

#attempt 1 was to recode all variables one by one

A <- recode(A,"Yes" = "yes","y" = "yes","n" = "no","UK" = "no")
B <- recode(B,"Yes" = "yes")
C <- recode(C,"Y" = "yes","uk" = "no")

#attempt 2 was to use a list option on all vectors.

levels(A) <- list("yes"=c("Likely","Y","yes"),"no" = c("","No","UK","N","n"))

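I was wondering whether there is a way to apply this list approach to a list/vector containing all of A, B, and C? Or is there perhaps another way to recode these variables more easily and efficiently?

Any help would be great :)

Solution

If the vectors have the same length, you can put them in a data frame; if they have different lengths, put them in a list. Then use lapply to apply the same function to all of them. You can use forcats::fct_collapse to collapse multiple levels into one.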

list_vec <- list(A,B,C)

list_vec <- lapply(list_vec,function(x) forcats::fct_collapse(x,"yes"=c("Likely","y","Y","Yes","yes"),"no" = c("","No","UK","no","N","n","uk")))
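For the equal-length case mentioned above, a minimal sketch with a data frame might look like the following. The data frame df, its column names, and the helper collapse_yes_no are made up for illustration. suppressWarnings() is used on the assumption that fct_collapse only warns (rather than errors) when a listed level does not occur in a column.

```r
library(forcats)

# Hypothetical data frame with equal-length yes/no columns (illustrative data)
df <- data.frame(
  smoker   = c("Yes", "y", "no", "n", "UK"),
  diabetes = c("yes", "Yes", "no", "N", "uk"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse all spellings into "yes" / "no"
collapse_yes_no <- function(x) {
  # Assumption: levels listed here but absent from x only trigger a warning
  suppressWarnings(fct_collapse(
    x,
    yes = c("Likely", "y", "Y", "Yes", "yes"),
    no  = c("", "No", "UK", "no", "N", "n", "uk")
  ))
}

# df[] keeps the data frame structure while replacing every column
df[] <- lapply(df, collapse_yes_no)
```

Each column is now a factor whose levels are drawn from "yes" and "no"; spellings that are not listed in the helper would keep their original level.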

You can also use grepl to pick "yes" or "no" (or "0" when neither matches) from a lookup vector:

c("0","no","yes")[1 + grepl("^no?",A,TRUE) + 2*grepl("^ye?s?",A,TRUE)]
#[1] "yes" "yes" "no"  "no"  "0"  

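To do this for many vectors, you can use a loop like this:

for(x in c("A","B","C")) {
  # look each vector up by name, recode it, and assign it back
  assign(x,c("0","no","yes")[1 + grepl("^no?",get(x),TRUE) +
                              2*grepl("^ye?s?",get(x),TRUE)])
}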
A
#[1] "yes" "yes" "no"  "no"  "0"  
B
#[1] "yes" "yes" "no" 
C
#[1] "yes" "0"
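
The loop above changes variables by name through assign() and get(). As an alternative sketch, the same prefix matching can be wrapped in a function and applied to a list with lapply. The names to_yes_no, vecs, and recoded are hypothetical and not part of the original answer.

```r
# Hypothetical helper: same prefix matching as the grepl approach above
to_yes_no <- function(x) {
  is_no  <- grepl("^no?", x, ignore.case = TRUE)    # value starts with "n"
  is_yes <- grepl("^ye?s?", x, ignore.case = TRUE)  # value starts with "y"
  # index 1 = neither matched, 2 = "no" matched, 3 = "yes" matched
  c("0", "no", "yes")[1 + is_no + 2 * is_yes]
}

# The vectors from the question, collected in a named list
vecs <- list(
  A = c("Yes", "y", "no", "n", "UK"),
  B = c("yes", "Yes", "no"),
  C = c("Y", "uk")
)

recoded <- lapply(vecs, to_yes_no)
recoded$A
#[1] "yes" "yes" "no"  "no"  "0"
```

Note that these patterns only test the first letter, so any value starting with "n" or "y" is treated as "no" or "yes"; if that is too permissive for the real data, stricter patterns would be needed.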