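Problem description
I have searched Stack Overflow for examples similar to the problem I am facing, but I am stuck, so any help would be appreciated! I have a data frame similar to the following one: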
df <- data.frame( "ID" = c(rep(1,6),rep(2,rep(3,5),rep(4,5)),"A" = c(0,rep(0,4),1,3),2),3)),"count" = NA)
I would like to edit the `count` variable so that the data frame looks like this:
df2 <- data.frame( "ID" = c(rep(1,"count" = c(NA,NA,-3:-1,1:2,-1,1:3,NA ))
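Within each `df$ID`, I need `df$count` to be 1 wherever `df$A` is 1. In addition, I need `df$count` to count forward from that row as 1:3 and backward before it as -1:-3, skipping zero, which produces `df2`. Any help is appreciated!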
Solution
You can write a function that gives you the required sequence:
library(dplyr)

add_num <- function(x) {
  # Get the positions where x equals 1
  inds <- which(x == 1)
  # For each such position, build a sequence that is 0 at that position
  num <- lapply(inds, function(i) {
    num <- seq_along(x) - i
    # Add 1 to values greater than or equal to 0, so that zero is skipped
    num[num >= 0] <- num[num >= 0] + 1
    # Keep only the window -3..-1 and 1..3; everything else becomes NA
    num[num < -3 | num > 3] <- NA
    num
  })
  # At each position, take the first non-NA value across the sequences
  do.call(coalesce, num)
}
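To make the offset step easier to follow, here is a minimal sketch of what happens for a single position of 1. The helper name `offsets_around` and its `width` parameter are hypothetical and not part of the answer; the body just repeats the arithmetic used inside `add_num`.

```r
# Hypothetical helper: offsets around one position i in a vector of length n
offsets_around <- function(n, i, width = 3) {
  num <- seq_len(n) - i                  # 0 at position i, negative before, positive after
  num[num >= 0] <- num[num >= 0] + 1     # shift so position i becomes 1 and zero is skipped
  num[abs(num) > width] <- NA            # keep only -width..-1 and 1..width
  num
}

offsets_around(9, 6)
# NA NA -3 -2 -1  1  2  3 NA
```

With several 1s in the same group, `add_num` builds one such vector per 1 and `coalesce` keeps the first non-NA value at each position.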
Now apply this function to each `ID`:
df %>% group_by(ID) %>% mutate(count = add_num(A))
# ID A count
#1 1 0 NA
#2 1 0 NA
#3 1 0 -3
#4 1 0 -2
#5 1 0 -1
#6 1 1 1
#7 1 0 2
#8 1 0 3
#9 1 0 NA
#...
#...
#46 4 0 NA
#47 4 0 NA
#48 4 0 -3
#49 4 0 -2
#50 4 0 -1
#51 4 1 1
#52 4 0 2
#53 4 0 3
#54 4 0 NA
#55 4 0 -3
#56 4 0 -2
#57 4 0 -1
#58 4 1 1
#59 4 0 2
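One case the output above does not cover is a group that contains no 1 at all. As far as I can tell, `do.call(coalesce, num)` would then be called with no arguments and fail. The following is a rough base R sketch of the same idea that returns all `NA` for such a group. The name `add_num_base`, the `width` parameter, and the `toy` data are assumptions made for illustration, not part of the original answer or the question's data.

```r
# Hypothetical base R variant of add_num that tolerates groups without a 1
add_num_base <- function(x, width = 3) {
  out <- rep(NA_real_, length(x))
  for (i in which(x == 1)) {
    num <- seq_along(x) - i
    num[num >= 0] <- num[num >= 0] + 1
    num[abs(num) > width] <- NA
    # Fill only positions that are still NA, so the earliest 1 wins on overlap
    fill <- is.na(out) & !is.na(num)
    out[fill] <- num[fill]
  }
  out
}

# Toy data (not the data from the question): group 2 has no 1
toy <- data.frame(ID = c(1, 1, 1, 1, 1, 2, 2, 2),
                  A  = c(0, 0, 1, 0, 0, 0, 0, 0))
toy$count <- ave(toy$A, toy$ID, FUN = add_num_base)
toy$count
# -2 -1  1  2  3 NA NA NA
```

The same function could also be used inside `mutate()` after `group_by(ID)` in place of `add_num`, if staying within dplyr is preferred.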