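Problem description
My task
I need to wrap every occurrence of a specific word in an R string in HTML tags, while preserving the original capitalization of each match.
My attempt
My first approach finds every occurrence, but because the replacement string contains the word in lowercase, every match comes out lowercase in the result: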
x <- "Some random text with some,issues"
gsub(pattern = "some", replacement = "<>some<>", x = x, ignore.case = TRUE)
[1] "<>some<> random text with <>some<>,issues"
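I found another approach somewhere that uses a function. It preserves capitalization, but it does not recognize words that are directly followed by a comma or period (in this example, the tag is added only to the first "some"):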
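tagger <- function(text, word, tag) {
  # Split on single spaces, so punctuation stays attached to its word
  x <- unlist(strsplit(text, split = " ", fixed = TRUE))
  # Wrap only the tokens that equal the word exactly (ignoring case)
  x[tolower(x) == tolower(word)] <- paste0(tag, x[tolower(x) == tolower(word)], tag)
  paste(x, collapse = " ")
}
tagger(text = x, word = "some", tag = "<>")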
[1] "<>Some<> random text with some,issues"
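To see why the second "some" is skipped, it can help to look at the tokens that strsplit() produces. The sketch below only inspects the intermediate values of the function above; the variable name tokens is introduced here for illustration.

```r
x <- "Some random text with some,issues"

# Splitting on spaces leaves the comma glued to the neighbouring words
tokens <- unlist(strsplit(x, split = " ", fixed = TRUE))
tokens
#> [1] "Some"        "random"      "text"        "with"        "some,issues"

# Only an exact (case-insensitive) match is tagged, so "some,issues" is missed
tolower(tokens) == "some"
#> [1]  TRUE FALSE FALSE FALSE FALSE
```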
Desired result
How can I get a string that looks like either 1 or 2?
[1] "<>Some<> random text with <>some<>,issues"
[2] "<>Some<> random text with <>some,<> issues"
Solution
Maybe this is what you are looking for:
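tagger <- function(text, word, tag) {
  # Capture the word (group 1) and an optional trailing period or comma (group 2),
  # then reinsert both between the tags so the original capitalization is kept.
  gsub(pattern = paste0("(", word, ")(\\.|,)?"), replacement = paste0(tag, "\\1\\2", tag), x = text, ignore.case = TRUE)
}
x <- "Some random text with some,issues"
tagger(x, "some", "<>")
#> [1] "<>Some<> random text with <>some,<>issues"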
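If the first desired form is preferred (tags directly around the word, punctuation left outside), a variant along the following lines might work. This is a minimal sketch, not part of the answer above: the function name tag_word is hypothetical, and it assumes that word contains no regex metacharacters. The \\b word boundaries are an addition that should keep the word from being tagged inside longer words such as "awesome".

```r
# Hypothetical helper: wrap whole-word matches in a tag, keeping their case
tag_word <- function(text, word, tag) {
  # \\b restricts matches to whole words; \\1 reinserts the matched text as found
  gsub(pattern = paste0("\\b(", word, ")\\b"),
       replacement = paste0(tag, "\\1", tag),
       x = text, ignore.case = TRUE, perl = TRUE)
}

x <- "Some random text with some,issues"
tag_word(x, "some", "<>")
#> [1] "<>Some<> random text with <>some<>,issues"

tag_word("awesome some", "some", "<>")
#> [1] "awesome <>some<>"
```

Because the match is reinserted through the backreference rather than typed into the replacement, the capital "S" of the first match survives.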