Using split to break up strings of varying lengths

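Problem description

I want to split a string variable into parts, but the substrings I need have different lengths and there is no delimiter such as . , | and so on. I start with a data frame like this: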

df <- data.frame(x=c("bigApe","smallApe","bigDog","smallDog"),y=c(1,2,5,3))
x         y
bigApe    1
smallApe  2
bigDog    5
smallDog  3

I would like it to end up like this:

  size  anim  y
1 big   Ape   1
2 small Ape   2
3 big   Dog   5
4 small Dog   3

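I have looked at approaches built on split(), and it seems they should be able to do this, but they all appear to expect a predictable delimiter or whitespace, or a fixed substring length. I can use a regular expression to find the capital letter, but the letter it matches is not kept: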

df %>% separate(x,c("size","anim"),sep="[A-Z]")
   size anim y
1   big   pe   1
2 small   pe   2
3   big   og   5
4 small   og   3

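That is not the output I am looking for. I thought I might find something in stringr, but even there everything I found seems to need a specified string length. I could of course cobble together a horrible for loop, but surely there is a faster way than that!

Thanks!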

Solution

You need this:

df %>% separate(x,c("size","anim"),sep = "(?!^)(?=[[:upper:]])")
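
To see why this pattern keeps the capital letter, it can help to look at where it matches. The sketch below uses base R's regexpr() with perl = TRUE (an illustration chosen here, not part of the answer) on a small made-up vector. The lookahead (?=[[:upper:]]) matches the empty position just before a capital letter, so nothing is consumed, and (?!^) rules out a match at the very start of the string.

```r
# Illustrative vector; "Ape" is added to show the effect of (?!^).
x <- c("bigApe", "smallDog", "Ape")

# Zero-width pattern: a position that is not the start of the string
# and is immediately followed by an upper-case letter.
pattern <- "(?!^)(?=[[:upper:]])"

# regexpr() returns the position of the first match, or -1 if none.
m <- regexpr(pattern, x, perl = TRUE)

positions <- as.integer(m)                    # where the split would happen
widths    <- as.integer(attr(m, "match.length"))  # 0 means nothing is consumed
```

For "bigApe" the match sits at position 4 with length 0, so a split there yields "big" and "Ape" with no characters lost. For "Ape" there is no match, so the string would be left in one piece.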

I'm not sure whether separate can keep the delimiter... However, you can use stringr::str_locate() to find where the capital letter starts, and then use substr (plus a bit of dplyr magic).
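
The code for the str_locate() approach did not survive in this copy of the answer, so the following is only a sketch of what it might look like, not the answerer's original. It assumes the column names x and y from the question, and the helper column pos is a name introduced here for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(x = c("bigApe", "smallApe", "bigDog", "smallDog"),
                 y = c(1, 2, 5, 3),
                 stringsAsFactors = FALSE)

result <- df %>%
  mutate(
    # start position of the first capital letter in each string
    pos  = str_locate(x, "[A-Z]")[, "start"],
    # everything before the capital letter
    size = substr(x, 1, pos - 1),
    # the capital letter and everything after it
    anim = substr(x, pos, nchar(x))
  ) %>%
  select(size, anim, y)
```

This sketch assumes every value contains a capital letter; if one does not, str_locate() returns NA for that row and both new columns become NA.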

You can also use the base R function gsub with regular-expression capture groups to parse the original column.

df$size <- gsub("([a-z]*)([A-Z]?[a-z]*)","\\1",df$x)
df$animal <- gsub("([a-z]*)([A-Z]?[a-z]*)","\\2",df$x)
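
The same capture-group idea can also be written with tidyr::extract(), which fills one new column per group in a single call. This is a different function from the gsub approach above, shown only as a possible alternative. The pattern here is slightly stricter than the one in the answer, since it assumes each value is lower-case letters followed by one capitalised word.

```r
library(tidyr)

df <- data.frame(x = c("bigApe", "smallApe", "bigDog", "smallDog"),
                 y = c(1, 2, 5, 3),
                 stringsAsFactors = FALSE)

# Group 1 becomes "size", group 2 becomes "anim"; the source column x
# is dropped by default (remove = TRUE).
result <- extract(df, x, into = c("size", "anim"),
                  regex = "^([a-z]+)([A-Z][a-z]*)$")
```

Rows that do not match the pattern get NA in both new columns, which makes unexpected values easy to spot.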
