Problem description
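I was given a coding exercise: write a function that uses a for loop to compute cumulative values by ID and date. I may only use cumsum(), so no packages are allowed.
For example, I created the data frame below: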
df = data.frame(
  "date" = c("1/1/2020","1/1/2020","1/1/2020","2/1/2020","2/1/2020","2/1/2020","3/1/2020","3/1/2020","3/1/2020"),
  "id" = c("A","B","C","A","B","C","A","B","C"),
  "val" = c(5,6,7,8,4,5,6,3,4)
)
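Solution
I downvoted your question because SO is not a free coding service. That said, your problem is simple and there are many ways to solve it. I had to make a few fixes to your DF: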
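df = data.frame(
  # Parse the day/month/year strings into Date objects
  "date" = as.Date(c(
    "1/1/2020","1/1/2020","1/1/2020",
    "2/1/2020","2/1/2020","2/1/2020",
    "3/1/2020","3/1/2020","3/1/2020"
  ), format = "%d/%m/%Y"),
  "id" = c("A","B","C","A","B","C","A","B","C"),
  "val" = c(5,6,7,8,4,5,6,3,4),
  stringsAsFactors = FALSE
)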
Then (a dplyr example, just one of many ways):
library(dplyr)
summary_df <- df %>%
  group_by(date, id) %>%
  summarise(sum = cumsum(val))
Result:
> summary_df
# A tibble: 9 x 3
# Groups:   date [3]
  date       id      sum
  <date>     <chr> <dbl>
1 2020-01-01 A         5
2 2020-01-01 B         6
3 2020-01-01 C         7
4 2020-01-02 A         8
5 2020-01-02 B         4
6 2020-01-02 C         5
7 2020-01-03 A         6
8 2020-01-03 B         3
9 2020-01-03 C         4
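Since the exercise only allows a function, a for loop and cumsum(), here is a minimal base R sketch with no packages. It assumes that "cumulative value by ID and date" means a running total of val within each id, taken in date order. The helper name cumulative_by_id and the column name cum_val are made up for illustration, and the sample data is the nine rows from the result table above.

```r
# Sample data: the nine rows from the result table (dates are day/month/year).
df <- data.frame(
  date = as.Date(rep(c("1/1/2020", "2/1/2020", "3/1/2020"), each = 3),
                 format = "%d/%m/%Y"),
  id = rep(c("A", "B", "C"), times = 3),
  val = c(5, 6, 7, 8, 4, 5, 6, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: running total of `val` per id, in date order.
# Assumes columns named `date`, `id` and `val`.
cumulative_by_id <- function(data) {
  # Sort by id, then date, so cumsum() accumulates in date order
  data <- data[order(data$id, data$date), ]
  data$cum_val <- NA_real_
  for (current_id in unique(data$id)) {
    rows <- data$id == current_id
    data$cum_val[rows] <- cumsum(data$val[rows])
  }
  # Return rows ordered by date, then id, with clean row names
  data <- data[order(data$date, data$id), ]
  rownames(data) <- NULL
  data
}

result <- cumulative_by_id(df)
print(result)
```

Under that assumption, id A accumulates to 5, 13, 19 across the three dates. If the exercise instead wants a running total across ids within each date, swapping the roles of id and date in the loop should do it.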