How can I combine across() with mutate() and case_when() to change values in multiple columns based on a condition?

Problem description

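I have a demographic dataset that includes the ages of household members. It was collected through a survey that allowed participants to decline to give their age.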

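The result is a dataset with one household per row (each household has a household ID code) and the household characteristics, such as the ages, in columns. Refused responses are coded as "R". You can recreate an example with the following code: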

library(dplyr)  # provides as_tibble(), mutate(), across() and case_when()

df <- list(Household_ID = c("1A","1B","1C","1D","1E"),AGE1 = c("25","47","39","50","R"),AGE2 = c("66","23","71","R","16"),AGE3 = c("28","17","R","R","80"),AGE4 = c("81","22","48","59","R"))

df <- as_tibble(df)

> df
# A tibble: 5 x 5
  Household_ID AGE1  AGE2  AGE3  AGE4 
  <chr>        <chr> <chr> <chr> <chr>
1 1A           25    66    28    81   
2 1B           47    23    17    22   
3 1C           39    71    R     48   
4 1D           50    R     R     59   
5 1E           R     16    80    R 

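For our purposes, we recode "R" as "-9" so that the AGE columns can then be converted to integer and analyzed. We usually do this in other software, and my goal is to replicate the process in R.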

I managed to do this with the following code:

df <- df %>% mutate(AGE1 = case_when(AGE1 == "R" ~ "-9",TRUE ~ as.character(AGE1)))
df <- df %>% mutate(AGE2 = case_when(AGE2 == "R" ~ "-9",TRUE ~ as.character(AGE2)))
df <- df %>% mutate(AGE3 = case_when(AGE3 == "R" ~ "-9",TRUE ~ as.character(AGE3)))
df <- df %>% mutate(AGE4 = case_when(AGE4 == "R" ~ "-9",TRUE ~ as.character(AGE4)))

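Since this feels clunky, I looked for a solution using mutate_if() and similar functions, but read that they have been superseded by across(). So I tried to replicate the operation with across():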

df <- df %>%
  mutate(across(AGE1:AGE4),~ (case_when(. == "R" ~ "-9")))

But I get the following error:

Error: Problem with `mutate()` input `..2`.
x Input `..2` must be a vector,not a `formula` object.
i Input `..2` is `~(case_when(. == "R" ~ "-9"))`.

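I have been struggling with this and searching for a while, but I can't figure out what I'm missing. I would really appreciate any input on how to get this working. Thanks.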

Edit: Solved!

df <- df %>%
  mutate(across(AGE1:AGE4,~ (case_when(.x == "R" ~ "-9",TRUE ~ as.character(.x)))))
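For anyone hitting the same error: in the first attempt the closing parenthesis right after the column range ends across() early, so the formula is handed to mutate() as a second argument instead of to across(). The TRUE ~ branch matters as well, because case_when() returns NA for values that match no condition. A minimal sketch on a made-up vector x (hypothetical, not part of the data above):

```r
library(dplyr)

# x is a hypothetical example vector, not taken from the dataset above
x <- c("25", "R", "47")

# Without a fallback branch, non-matching values become NA
no_fallback <- case_when(x == "R" ~ "-9")

# With TRUE ~ x, non-matching values are kept as they are
with_fallback <- case_when(x == "R" ~ "-9", TRUE ~ x)

print(no_fallback)
print(with_fallback)
```

Inside across(), the same fallback is what keeps the genuine ages from being overwritten with NA.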

Solution

Why not simply:

df[,2:5][df[,2:5] == 'R'] <- '-9'

# A tibble: 5 x 5
  Household_ID AGE1  AGE2  AGE3  AGE4 
  <chr>        <chr> <chr> <chr> <chr>
1 1A           25    66    28    81   
2 1B           47    23    17    22   
3 1C           39    71    -9    48   
4 1D           50    -9    -9    59   
5 1E           -9    16    80    -9
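The same idea can be written with column names instead of positions, which may be less fragile if the column order changes. This is only a sketch: toy is a hypothetical two-row stand-in for df, built as a base data.frame, and age_cols is an assumed helper name.

```r
# toy is a hypothetical stand-in for the real data
toy <- data.frame(
  Household_ID = c("1A", "1B"),
  AGE1 = c("25", "R"),
  AGE2 = c("R", "16"),
  stringsAsFactors = FALSE
)

# Select the age columns by name rather than by position
age_cols <- c("AGE1", "AGE2")

# toy[age_cols] == "R" is a logical matrix marking the refused responses
toy[age_cols][toy[age_cols] == "R"] <- "-9"

# Convert the recoded columns to integer
toy[age_cols] <- lapply(toy[age_cols], as.integer)

print(toy)
```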

Or perhaps this, which is not very different from dear @TarJae's answer:

library(dplyr)
library(stringr)


df %>%
  mutate(across(AGE1:AGE4,~ str_replace(.,"R","-9")),across(AGE1:AGE4,as.integer))

# A tibble: 5 x 5
  Household_ID  AGE1  AGE2  AGE3  AGE4
  <chr>        <int> <int> <int> <int>
1 1A              25    66    28    81
2 1B              47    23    17    22
3 1C              39    71    -9    48
4 1D              50    -9    -9    59
5 1E              -9    16    80    -9
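One caveat, offered as a sketch rather than a correction: str_replace() treats "R" as a regular expression and replaces the first "R" found anywhere in a string. With ages that does no harm, but if a column could hold other text, anchoring the pattern with ^ and $ limits the match to values that are exactly "R". The vector codes below is invented for illustration:

```r
library(stringr)

# codes is a hypothetical vector; "NR" stands for any other text containing an R
codes <- c("R", "25", "NR")

# Unanchored pattern: also rewrites the R inside "NR"
unanchored <- str_replace(codes, "R", "-9")

# Anchored pattern: only values that are exactly "R" are replaced
anchored <- str_replace(codes, "^R$", "-9")

print(unanchored)
print(anchored)
```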

Data:

df <- list(Household_ID = c("1A","1B","1C","1D","1E"),AGE1 = c("25","47","39","50","R"),AGE2 = c("66","23","71","R","16"),AGE3 = c("28","17","R","R","80"),AGE4 = c("81","22","48","59","R"))

df <- as_tibble(df)

You can use across() together with replace().

  1. Convert the list to a tibble with as_tibble()
  2. Replace "R" with "-9"
  3. Convert the AGE columns to integer
df %>% 
  as_tibble() %>% 
  mutate(across(everything(),~replace(.,. ==  "R","-9"))) %>% 
  type.convert(as.is=TRUE)

Output:

  Household_ID  AGE1  AGE2  AGE3  AGE4
  <chr>        <int> <int> <int> <int>
1 1A              25    66    28    81
2 1B              47    23    17    22
3 1C              39    71    -9    48
4 1D              50    -9    -9    59
5 1E              -9    16    80    -9
