How do I use the ccf method in the statsmodels library?

Question

I'm having trouble using the ccf() method from the statsmodels library (Python). The equivalent operation works fine in R.

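In my example, ccf produces the cross-correlation function between two variables, A and B. I want to understand to what extent A is a leading indicator of B.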

I'm using the following imports:

import pandas as pd
import numpy as np
import statsmodels.tsa.stattools as smt

I can simulate A and B as follows:

np.random.seed(123)
test = pd.DataFrame(np.random.randint(0,25,size=(79,2)),columns=list('AB'))

When I run ccf, I get the following:

ccf_output = smt.ccf(test['A'],test['B'],unbiased=False)
ccf_output    
array([ 0.09447372,-0.12810284,0.15581492,-0.05123683,0.23403344,0.0771812,0.01434263,0.00986775,-0.23812752,-0.03996113,-0.14383829,0.0178347,0.23224969,0.0829421,0.14981321,-0.07094772,-0.17713121,0.15377192,-0.19161986,0.08006699,-0.01044449,-0.04913098,0.06682942,-0.02087582,0.06453489,0.01995989,-0.08961562,0.02076603,0.01085041,-0.01357792,0.17009109,-0.07586774,-0.0183845,-0.0327533,-0.19266634,-0.00433252,-0.00915397,0.11568826,-0.02069836,-0.03110162,0.08500599,0.01171839,-0.04837527,0.10352341,-0.14512205,-0.00203772,0.13876788,-0.20846099,0.30174408,-0.05674962,-0.03824093,0.04494932,-0.21788683,0.00113469,0.07381456,-0.04039815,0.06661601,-0.04302084,0.01624429,-0.00399155,-0.0359768,0.10264208,-0.09216649,0.06391548,0.04904064,-0.05930197,0.11127125,-0.06346119,-0.08973581,0.06459495,-0.09600202,0.02720553,0.05152299,-0.0220437,0.04818264,-0.02235086,-0.05485135,-0.01077366,0.02566737])

This is the result I want to achieve (produced in R):

[Image: cross-correlation plot produced in R]

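The problem is that ccf_output only gives me the correlation values at lag 0 and to the right of lag 0. Ideally, I'd like the full set of lag values (lag -60 to lag 60) so that I can produce a plot like the one above.

Is there a way to do this?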

Answer

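The statsmodels ccf function only produces forward lags, i.e. Corr(x_[t+k], y_[t]) for k >= 0. However, one way to compute the backward lags is to reverse the order of both input series and then reverse the output.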

backwards = smt.ccf(test['A'][::-1],test['B'][::-1],adjusted=False)[::-1]
forwards = smt.ccf(test['A'],test['B'],adjusted=False)
ccf_output = np.r_[backwards[:-1],forwards]

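Note that backwards and forwards both contain lag 0, so we have to drop it from one of them when combining. Here backwards[:-1] drops the last element of backwards, which is lag 0 after the reversal.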
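To make the lag indexing explicit, here is a minimal NumPy-only sketch of the same idea. The helper name full_ccf and its max_lag parameter are illustrative, not part of statsmodels. It assumes the non-adjusted normalisation (divide by n and by the population standard deviations), which should correspond to adjusted=False, and it simulates the data through a RandomState instead of a DataFrame.

```python
import numpy as np

def full_ccf(x, y, max_lag=None):
    """Cross-correlation Corr(x[t+k], y[t]) for negative and positive lags k.

    Hypothetical helper for illustration; uses the non-adjusted estimator
    (sum of demeaned products divided by n * std(x) * std(y)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xd = x - x.mean()
    yd = y - y.mean()
    # mode="full" returns 2n-1 values; index n-1 corresponds to lag 0
    cc = np.correlate(xd, yd, mode="full") / (n * x.std() * y.std())
    lags = np.arange(-(n - 1), n)
    if max_lag is not None:
        keep = np.abs(lags) <= max_lag
        lags, cc = lags[keep], cc[keep]
    return lags, cc

# Simulated data with the same shape and seed as in the question
rng = np.random.RandomState(123)
data = rng.randint(0, 25, size=(79, 2))
lags, values = full_ccf(data[:, 0], data[:, 1], max_lag=60)
```

The returned lags array can be used directly as the x-axis of a stem or bar plot, with values on the y-axis.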

Edit: another way is to swap the order of the arguments and reverse the output:

backwards = smt.ccf(test['B'],test['A'],adjusted=False)[::-1]
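When reading the combined result, the sign convention matters for the original question of whether A leads B. The sketch below is a sanity check on synthetic data, not part of the original answer. It assumes B is a noisy copy of A delayed by a made-up delay of 3 steps, and uses np.correlate directly with the Corr(x_[t+k], y_[t]) convention described above. Under that convention, a series A that leads B should produce its peak at a negative lag.

```python
import numpy as np

rng = np.random.default_rng(0)
n, delay = 200, 3                      # illustrative values, not from the question

base = rng.normal(size=n + delay)
A = base[delay:]                       # A[t] = base[t + delay]
B = base[:n] + 0.1 * rng.normal(size=n)  # B[t] ~ base[t] = A[t - delay], so A leads B

# Non-adjusted cross-correlation at every lag from -(n-1) to n-1
cc = np.correlate(A - A.mean(), B - B.mean(), mode="full") / (n * A.std() * B.std())
lags = np.arange(-(n - 1), n)

peak_lag = lags[np.argmax(cc)]
print(peak_lag)                        # the peak should land at lag -delay
```

So with ccf(A, B), strong correlations to the left of lag 0 suggest that A moves first, and strong correlations to the right suggest that B moves first.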