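Identifying repeated digits in a string in R

Problem description

I'm trying to identify column values in a data frame that contain a repeated digit sequence. For example

I want to return 66046 and 66861 because the digit 6 appears consecutively. I have tried the following...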

df %>% filter(str_detect(as.String(df[1]),"[66]"))  #with and without the squared brackets.
df[unlist(gregexpr("[6]{2}[[:digit:]]",df[1])),][1]

Obviously, this doesn't work. Any help is appreciated.

Thanks

Solution

We can use str_detect with a regex quantifier that specifies the count; "6{2,}" matches two or more consecutive 6s:

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(ColA,"6{2,}"))

-output

#   ColA
#1 66046
#5 66861
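The difference between the character class tried in the question and the quantifier used here can be sketched in base R. The data frame below is hypothetical (the original data is not shown in the post); only the two values 66046 and 66861 come from the question.

```r
# Hypothetical data: only 66046 and 66861 appear in the original post.
df <- data.frame(ColA = c(66046, 12345, 60606, 98765, 66861))

# "[66]" is a character class and is equivalent to "[6]":
# it matches any value that contains at least one 6.
single_six <- grepl("[66]", df$ColA)

# "6{2,}" requires two or more 6s in a row.
double_six <- grepl("6{2,}", df$ColA)

# Keep only the rows with consecutive 6s.
result <- df[double_six, , drop = FALSE]
print(result)
```

With this made-up data, the character class also flags 60606 and 98765, while the quantifier keeps only the first and last rows.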

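To match any digit that is immediately repeated, not just 6, use a capture group with a backreference: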

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(ColA,"(\\d)\\1"))

See proof.

NODE	EXPLANATION
`(`	group and capture to \1:
`\d`	digits (0-9)
`)`	end of \1
`\1`	what was matched by capture \1
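As a rough illustration of how the backreference behaves, the sketch below runs the pattern over a few hypothetical strings and also pulls out the repeated run itself. Passing perl = TRUE and adding the `+` after `\\1` are choices made for this example, not part of the answer above.

```r
# Hypothetical inputs chosen to show matches and non-matches.
x <- c("66046", "12234", "12345", "60606")

# TRUE when some digit is directly followed by the same digit.
has_repeat <- grepl("(\\d)\\1", x, perl = TRUE)

# Extract the first run of a repeated digit from each matching string;
# "\\1+" extends the match to cover runs longer than two.
runs <- regmatches(x, regexpr("(\\d)\\1+", x, perl = TRUE))

print(has_repeat)
print(runs)
```

Note that "60606" is not matched: its 6s are never adjacent, so the backreference has nothing to repeat.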

Late to the party, but here is a base R solution:

df[which(grepl("(\\d)\\1",df$ColA)),]
