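Problem description
I am trying to use nested for loops in R to print the combinations of 3 consecutive characters in a string. The code does print combinations, but I get a warning, and it produces them for only one of the elements rather than for every row in the data frame.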
library(stringr)  # provides str_length()
x <- data.frame(Pattern = c("abcdef","hijklmnop"),id = 1:2)
output <- vector("character",length(x$Pattern))
for (i in 1:nrow(x)) {
file <- x$Pattern[i]
for (j in 1:(str_length(x$Pattern))) {
output[j] <- substr(file,j,j+2)
}
}
numerical expression has 2 elements: only the first used
numerical expression has 2 elements: only the first used
>
> output
[1] "hij" "ijk" "jkl" "klm" "lmn" "mno"
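Two things are going wrong here. The first is that the loop filling the variable output uses the length of the first pattern (length = 6) and prints combinations based on that length, whereas the output I am looking for should be based on the length of the current string (length = 9). The intended output is shown below, produced without the nested for loop.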
for (j in 1:9) {
output[j] <- substr(file,j,j+2)
}
output
[1] "hij" "ijk" "jkl" "klm" "lmn" "mno" "nop" "op" "p"
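The first issue can be reproduced in isolation. The following is a minimal sketch, assuming base R's nchar() as a stand-in for str_length(); the variable names patterns, lens, bound_all and bound_second are made up for illustration. It shows that `:` takes only the first element when given a vector of lengths, and that indexing a single element gives the intended bound.

```r
patterns <- c("abcdef", "hijklmnop")
lens <- nchar(patterns)   # one length per string

# Passing the whole vector to `:` uses only lens[1] and emits the
# "only the first used" warning, which is suppressed here.
bound_all <- suppressWarnings(1:lens)

# Indexing the current string's length gives the bound that was intended.
bound_second <- 1:lens[2]

length(bound_all)
length(bound_second)
```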
I narrowed it down further so that each string only has a list of combinations of 3 consecutive characters.
list(output[1:(length(output)-3)])
[[1]]
[1] "hij" "ijk" "jkl" "klm" "lmn" "mno"
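Note that dropping the last 3 elements also removes "nop", which is itself a complete 3-character combination. If the intent is to keep every complete combination, one possible alternative (a sketch, not what the original post does) is to filter by string length instead. The name full is hypothetical.

```r
output <- c("hij", "ijk", "jkl", "klm", "lmn", "mno", "nop", "op", "p")

# Keep only the entries that are exactly 3 characters long
full <- output[nchar(output) == 3]
full
```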
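The second problem I have is that the output only contains the combinations for the second string in the list. I tried changing 1:nrow(a) to seq_along and length(a), as suggested in other posts, but that did not work. The expected output is below.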
a$combo <- output
a$combo
[1] c("abc","bcd","cde","def") c("hij","ijk","jkl","klm","lmn","mno")
Solution
library(stringr)

x <- data.frame(Pattern = c("abcdef","hijklmnop"),id = 1:2)
# number of additional letters after the first one in each substring
add_letters <- 2
output <- list()
for (i in 1:nrow(x)) {
  file <- x$Pattern[i]
  l <- list()
  # use the length of this row's string only, so the bound is a single number
  for (j in 1:(str_length(file) - add_letters)) {
    l[j] <- substr(file, j, j + add_letters)
  }
  # store this row's combinations as one list element
  output[[i]] <- l
}
x$combo <- output
A solution using a list, as suggested by Gregor Thomas.
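For comparison, the same list column can probably be built without explicit loops. The sketch below is an alternative, not the method from the answer: it assumes base R only, and the helper name windows and the parameter width are hypothetical. It relies on substring() being vectorized over its start and end positions.

```r
x <- data.frame(Pattern = c("abcdef", "hijklmnop"), id = 1:2,
                stringsAsFactors = FALSE)

# Hypothetical helper: all substrings of `width` consecutive characters in `s`
windows <- function(s, width = 3) {
  n <- nchar(s)
  if (n < width) return(character(0))   # string too short for any window
  starts <- seq_len(n - width + 1)      # every valid start position
  substring(s, starts, starts + width - 1)
}

# One character vector per row, stored as a list column
x$combo <- lapply(x$Pattern, windows, width = 3)
x$combo
```

Using seq_len() rather than 1:n avoids counting backwards when a string is shorter than the window, which the early return also guards against.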