Problem description
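I want to analyze monthly stock returns for multiple companies (panel data). However, I am struggling to compute the standard deviation over each company's most recent X months.
Basically, I want to add another column to my existing data.frame that shows the standard deviation over a rolling X-month window, computed separately for each company. Below is a simplified example of my data and of what I hope to achieve.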
#My data:
company = rep(c("1","2","3","4"), times = c(5,7,4,3))
return = c(0.01,0.015,-0.01,0.02,0.023,-0.04,-0.02,-0.01,0.05,0.06,0.03,-0.09,0.2,0.3,-0.04,-0.02,-0.01,0.023,-0.04)
stock = data.frame(company,return)
Given this starting point, I would like to compute, in a new column, the standard deviation over the last 3 observations of each company.
#Column to be filled with the respective value
stock["std_3obs"] = NA
#However,I do not manage to fill this column accordingly. The following result for a given row is expected:
#row 1 = Not possible,as there are not enough prior observations available
#row 2 = Not possible,as there are not enough prior observations available
#row 3 = sd(c(0.01,0.015,-0.01)) = 0.01322876
#row 7 = Not possible,as there are not enough prior observations available
#row 8 = sd(c(-0.040,-0.020,-0.010)) = 0.01527525
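Thank you very much! Any help is greatly appreciated. Please be gentle, as I am still new to R.
Side note: researching this problem and adapting other solutions always led to this error: "replacement has X rows, data has Y", where X >>> Y.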
Solution
You can use the rolling functions from the zoo package, applied within each company group:
library(dplyr)
stock %>%
group_by(company) %>%
mutate(std_3obs = zoo::rollapplyr(return,3,sd,fill = NA))
# company return std_3obs
# <chr> <dbl> <dbl>
# 1 1 0.01 NA
# 2 1 0.015 NA
# 3 1 -0.01 0.0132
# 4 1 0.02 0.0161
# 5 1 0.023 0.0182
# 6 2 -0.04 NA
# 7 2 -0.02 NA
# 8 2 -0.01 0.0153
# 9 2 0.05 0.0379
#10 2 0.06 0.0379
#11 2 0.03 0.0153
#12 2 -0.09 0.0794
#13 3 0.2 NA
#14 3 0.3 NA
#15 3 -0.04 0.175
#16 3 -0.02 0.191
#17 4 -0.01 NA
#18 4 0.023 NA
#19 4 -0.04 0.0315
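To make it clearer what the rolling call computes, here is a minimal base R sketch of the same right-aligned 3-observation window, applied per company with ave(). The helper name roll_sd and the variable ret are hypothetical (they are not part of zoo or dplyr), and the data is rebuilt from the 19 rows shown in the output above.

```r
# Hypothetical helper: right-aligned rolling standard deviation.
# The first (width - 1) positions stay NA because the window is incomplete.
roll_sd <- function(x, width = 3) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= width) {
    for (i in width:n) {
      out[i] <- sd(x[(i - width + 1):i])
    }
  }
  out
}

# Example data, taken from the 19-row output shown above.
company <- rep(c("1", "2", "3", "4"), times = c(5, 7, 4, 3))
ret <- c(0.01, 0.015, -0.01, 0.02, 0.023,
         -0.04, -0.02, -0.01, 0.05, 0.06, 0.03, -0.09,
         0.2, 0.3, -0.04, -0.02,
         -0.01, 0.023, -0.04)
stock <- data.frame(company = company, return = ret)

# ave() applies roll_sd within each company and keeps the original row order.
stock$std_3obs <- ave(stock$return, stock$company, FUN = roll_sd)
```

Because the window restarts for every company, the first two rows of each group remain NA, which matches the pattern in the output above.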
Here is a data.table approach:
library(data.table)
setDT(stock)[,std_3obs := frollapply(return,3,sd),by = company]
Output
> stock[]
company return std_3obs
1: 1 0.010 NA
2: 1 0.015 NA
3: 1 -0.010 0.01322876
4: 1 0.020 0.01607275
5: 1 0.023 0.01824829
6: 2 -0.040 NA
7: 2 -0.020 NA
8: 2 -0.010 0.01527525
9: 2 0.050 0.03785939
10: 2 0.060 0.03785939
11: 2 0.030 0.01527525
12: 2 -0.090 0.07937254
13: 3 0.200 NA
14: 3 0.300 NA
15: 3 -0.040 0.17473790
16: 3 -0.020 0.19078784
17: 4 -0.010 NA
18: 4 0.023 NA
19: 4 -0.040 0.03151190
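On the error mentioned in the question's side note: one plausible cause, offered here only as an assumption about what went wrong, is assigning a rolling result that is shorter or longer than the data.frame it goes into. The base R sketch below reproduces that situation on the first company's five returns. The names one_company, unpadded, and padded are made up for illustration.

```r
# Toy data: the five returns of company 1 from the example.
one_company <- data.frame(return = c(0.01, 0.015, -0.01, 0.02, 0.023))
width <- 3
n <- nrow(one_company)

# Rolling sd without padding: only n - width + 1 = 3 values come back.
unpadded <- sapply(width:n, function(i) {
  sd(one_company$return[(i - width + 1):i])
})

# Assigning 3 values to a 5-row data.frame fails; capture the error message.
msg <- tryCatch({
  one_company$std_3obs <- unpadded
  NULL
}, error = function(e) conditionMessage(e))

# Padding the front with NA restores the expected length, so this works.
padded <- c(rep(NA_real_, width - 1), unpadded)
one_company$std_3obs <- padded
```

Both answers above avoid the mismatch in the same way: the computation runs inside each company group and the incomplete leading windows are filled with NA, so the result always has one value per row.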