Replace elements of a character vector based on a string in R

Problem description

I have a data frame like this:

Occupation
ELECTRICIAN
ROAD ELECTRICIAN
ELECTRICIAN
FARMER
GRASS ELECTRICIAN
POLE ELECTRICIANS
ELECTRICIAN
INSPECTOR

I want every cell that contains ELECTRICIAN to read just ELECTRICIAN, regardless of anything else in the cell.

So the final result should be:

Occupation
ELECTRICIAN
ELECTRICIAN
ELECTRICIAN
FARMER
ELECTRICIAN
ELECTRICIAN
ELECTRICIAN
INSPECTOR

I tried the following, but it does not work:

ifelse(grep('CONDUCTOR',df$Occupation,value=TRUE),"CONDUCTOR",df$Occupation) 

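Solution

I suggest the following approach. grepl() is the better choice here because it returns a logical vector, which is what ifelse() expects as its test: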

#Data
df <- structure(list(Occupation = c("ELECTRICIAN","ROAD ELECTRICIAN","ELECTRICIAN","FARMER","GRASS ELECTRICIAN","POLE ELECTRICIANS","ELECTRICIAN","INSPECTOR")),row.names = c(NA,-8L),class = "data.frame")

Code:

#Code
df$Occupation <- ifelse(grepl('ELECTRICIAN',df$Occupation),'ELECTRICIAN',df$Occupation) 

Output:

   Occupation
1 ELECTRICIAN
2 ELECTRICIAN
3 ELECTRICIAN
4      FARMER
5 ELECTRICIAN
6 ELECTRICIAN
7 ELECTRICIAN
8   INSPECTOR
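
To see why the attempt in the question fails, it may help to compare what grep() and grepl() return. The following is a minimal sketch on a shortened vector; the names occ, matched, is_elec and result are hypothetical and not from the question:

```r
# Shortened example vector (hypothetical, for illustration only)
occ <- c("ELECTRICIAN", "ROAD ELECTRICIAN", "FARMER", "POLE ELECTRICIANS")

# grep(value = TRUE) returns only the matching strings, so its length
# differs from the input and it is not a logical test for ifelse()
matched <- grep("ELECTRICIAN", occ, value = TRUE)

# grepl() returns one TRUE/FALSE per element, the same length as the input
is_elec <- grepl("ELECTRICIAN", occ)

# Replace where the test is TRUE, keep the original value otherwise
result <- ifelse(is_elec, "ELECTRICIAN", occ)
print(result)
```

Because is_elec lines up element by element with occ, ifelse() can pick the replacement or the original value at each position.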

Here is a tidyverse solution using the stringr package.

    library(stringr)      
    df$Occupation<- str_replace_all(df$Occupation,".*ELECTRICIAN.*","ELECTRICIAN")

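There is some debate about whether tidyverse solutions should be preferred, but I personally like them. I find the function names more intuitive, both for you and for anyone who may read your code later. I also think this version is concise and does exactly what you asked for.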
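
The pattern in this answer relies on the ".*" on both sides of the keyword, so that the match covers the whole string rather than just the keyword. As a rough illustration, here is the same idea sketched in base R with sub() instead of stringr (a swapped-in function, chosen only so the example has no package dependency); the names occ, partial and whole are hypothetical:

```r
# Shortened example vector (hypothetical, for illustration only)
occ <- c("ROAD ELECTRICIAN", "FARMER", "POLE ELECTRICIANS")

# Without the surrounding ".*", only the matched substring is replaced,
# so the extra words stay in place and nothing changes
partial <- sub("ELECTRICIAN", "ELECTRICIAN", occ)

# With ".*" on both sides the pattern consumes the whole string,
# so the entire element is replaced; non-matching elements are untouched
whole <- sub(".*ELECTRICIAN.*", "ELECTRICIAN", occ)
print(whole)
```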


With grep() you can get the indices of the elements of Occupation that contain "ELECTRICIAN" and replace those values directly.

df$Occupation[grep('ELECTRICIAN',df$Occupation)] <- 'ELECTRICIAN'
df

#   Occupation
#1 ELECTRICIAN
#2 ELECTRICIAN
#3 ELECTRICIAN
#4      FARMER
#5 ELECTRICIAN
#6 ELECTRICIAN
#7 ELECTRICIAN
#8   INSPECTOR
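
If the real data mixes upper and lower case (an assumption; the question only shows upper case), the same subset-assignment idea can be sketched with grepl() and ignore.case = TRUE. The vector occ and the name hits below are hypothetical:

```r
# Mixed-case example vector (hypothetical, not from the question)
occ <- c("Electrician", "ROAD ELECTRICIAN", "farmer", "pole electricians")

# Logical index of elements containing the keyword, ignoring case
hits <- grepl("electrician", occ, ignore.case = TRUE)

# Overwrite only the matching elements; the others keep their value
occ[hits] <- "ELECTRICIAN"
print(occ)
```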