Can I use factors as search-and-replace values in mutate?

Question

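I have a list of values that I want to replace with values from another list wherever they occur. For example, wherever "white" appears as a hair color I want to use "light", and wherever "auburn" appears I want to use "brown". I can use: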

find_text = c("white","auburn","none")
replace_text = c("light","brown","bald")

starwars %>%
  filter(gender == "feminine") %>% 
  select(c("name","hair_color","species")) %>%
  mutate(hair_color = str_replace(hair_color, find_text[1], replace_text[1]),
         hair_color = str_replace(hair_color, find_text[2], replace_text[2]),
         hair_color = str_replace(hair_color, find_text[3], replace_text[3]))

I thought I could use fct_recode(), but that also seems to need named strings. Is there a cleaner way to do this?

Answers

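We can pass a named vector to str_replace_all, where the names are the patterns to find and the values are the replacements. This assumes we want substring replacement, i.e. the pattern is matched and replaced anywhere inside the string, rather than replacing only values that match exactly.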

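library(dplyr)
library(stringr)
library(purrr)   # provides set_names()

starwars %>%
   filter(gender == "feminine") %>% 
   select(c("name","hair_color","species")) %>%
   # names of the vector are the patterns, values are the replacements
   mutate(hair_color_new = str_replace_all(hair_color, set_names(replace_text, find_text)))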
# A tibble: 17 x 4
#   name               hair_color species    hair_color_new
#   <chr>              <chr>      <chr>      <chr>         
# 1 Leia Organa        brown      Human      brown         
# 2 Beru Whitesun lars brown      Human      brown         
# 3 Mon Mothma         auburn     Human      brown         
# 4 Shmi Skywalker     black      Human      black         
# 5 Ayla Secura        none       Twi'lek    bald          
# 6 Adi Gallia         none       Tholothian bald          
# 7 Cordé              brown      Human      brown         
# 8 Luminara Unduli    black      Mirialan   black         
# 9 Barriss Offee      black      Mirialan   black         
#10 Dormé              brown      Human      brown         
#11 Zam Wesell         blonde     Clawdite   blonde        
#12 Taun We            none       Kaminoan   bald          
#13 Jocasta Nu         white      Human      light         
#14 R4-P17             none       Droid      bald          
#15 Shaak Ti           none       Togruta    bald          
#16 Rey                brown      Human      brown         
#17 Padmé Amidala      brown      Human      brown    

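If we want fixed matching instead, where only values that are exactly equal to a search term are replaced, recode is also useful.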

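starwars %>%
    filter(gender == "feminine") %>% 
    # hair_color must be kept, otherwise mutate() cannot find it
    select(c("name","hair_color","species")) %>% 
    # !!! splices the named vector into old = new arguments
    mutate(hair_color_new = recode(hair_color, !!!set_names(replace_text, find_text)))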
# A tibble: 17 x 4
#   name               hair_color species    hair_color_new
#   <chr>              <chr>      <chr>      <chr>         
# 1 Leia Organa        brown      Human      brown         
# 2 Beru Whitesun lars brown      Human      brown         
# 3 Mon Mothma         auburn     Human      brown         
# 4 Shmi Skywalker     black      Human      black         
# 5 Ayla Secura        none       Twi'lek    bald          
# 6 Adi Gallia         none       Tholothian bald          
# 7 Cordé              brown      Human      brown         
# 8 Luminara Unduli    black      Mirialan   black         
# 9 Barriss Offee      black      Mirialan   black         
#10 Dormé              brown      Human      brown         
#11 Zam Wesell         blonde     Clawdite   blonde        
#12 Taun We            none       Kaminoan   bald          
#13 Jocasta Nu         white      Human      light         
#14 R4-P17             none       Droid      bald          
#15 Shaak Ti           none       Togruta    bald          
#16 Rey                brown      Human      brown         
#17 Padmé Amidala      brown      Human      brown      
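
The difference between the substring approach and the fixed-match approach is easier to see on a small vector. The sketch below uses a made-up vector `x` (not taken from starwars) and base R's setNames(), which plays the same role here as set_names(). The last call shows that fct_recode() from forcats, mentioned in the question, also accepts a spliced named vector, assuming forcats is installed; note that it expects new = old, so the names and values are swapped.

```r
library(dplyr)
library(stringr)
library(forcats)

find_text    <- c("white", "auburn", "none")
replace_text <- c("light", "brown", "bald")

# Hypothetical toy input; "auburn, white" contains two search terms
x <- c("white", "auburn", "auburn, white", "none", "blonde")

# Substring replacement: every occurrence inside each string is replaced
sub_result <- str_replace_all(x, setNames(replace_text, find_text))
sub_result
# "light" "brown" "brown, light" "bald" "blonde"

# Fixed matching: only values equal to a search term are replaced
fixed_result <- recode(x, !!!setNames(replace_text, find_text))
fixed_result
# "light" "brown" "auburn, white" "bald" "blonde"

# fct_recode() takes new = old pairs and returns a factor
fct_result <- fct_recode(x, !!!setNames(find_text, replace_text))
as.character(fct_result)
# "light" "brown" "auburn, white" "bald" "blonde"
```

So the choice between the two depends on whether combined values such as "auburn, white" should be rewritten piece by piece or left alone.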

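Here is an option using a join: put the search and replacement values in a lookup table, join it onto the data, and fall back to the original value where there is no match.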

library(tidyverse)

# Lookup table: one row per search/replacement pair
replacetext <- data.frame(
  find_text = c("white","auburn","none"),
  replace_text = c("light","brown","bald"),
  stringsAsFactors = FALSE)

starwars %>%
  filter(gender == "feminine") %>% 
  # hair_color must be kept, it is the join key
  select(c("name","hair_color","species")) %>% 
  left_join(replacetext, by = c("hair_color" = "find_text")) %>% 
  # rows without a match get NA from the join; keep their original value
  mutate(replace_text = coalesce(replace_text, hair_color)) %>% 
  select(-hair_color) %>% 
  rename(hair_color = replace_text)
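
To see why the coalesce() step is needed, here is a minimal sketch on made-up data (the `people` and `lookup` tables are hypothetical, not part of starwars). After the left join, rows with no entry in the lookup table come back as NA, and coalesce() fills them with the original value. This variant overwrites hair_color directly instead of dropping and renaming columns.

```r
library(dplyr)

# Hypothetical toy data for illustration
people <- tibble(
  name       = c("A", "B", "C"),
  hair_color = c("white", "black", "none")
)
lookup <- tibble(
  find_text    = c("white", "auburn", "none"),
  replace_text = c("light", "brown", "bald")
)

# Step 1: the join adds replace_text; "black" has no match, so it gets NA
joined <- left_join(people, lookup, by = c("hair_color" = "find_text"))
joined$replace_text
# "light" NA "bald"

# Step 2: take the replacement where present, else the original value
result <- joined %>%
  mutate(hair_color = coalesce(replace_text, hair_color)) %>%
  select(-replace_text)
result$hair_color
# "light" "black" "bald"
```

Like recode, the join matches whole values only, so a combined value such as "auburn, white" would pass through unchanged.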