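Problem description
I am trying to calculate a weighted moving average with a window size of 7 over all the time-series values in a pandas DataFrame. For some reason it returns NaN for the first 6 observations and only starts returning values at the 7th observation. The 7th value is correct, but there must be something wrong with my approach. Any suggestions or ideas on how to fix this? Thanks in advance!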
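import numpy as np
import pandas as pd

df = pd.DataFrame(data)  # data: the time series being analysed (not shown in the question)

def calc_wma(df, wd_size, weights=1):
    """
    Takes in a series and calculates the WMA with a window size of wd_size
    """
    wma = None
    if isinstance(weights, int):
        weights = np.full(wd_size, weights)
    assert len(weights) == wd_size, "Q4: The size of the weights must be the same as the window size. "
    # Note: this line overwrites whatever weights were passed in with linear weights 1..wd_size
    weights = np.arange(1, wd_size + 1)
    # Use wd_size rather than a hard-coded 7 so the window always matches the weights
    wma = df.rolling(wd_size).apply(lambda cases: np.dot(cases, weights) / weights.sum(), raw=True)
    return wma

calc_wma(df, 7)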
Out:
2020-01-23 NaN
2020-01-24 NaN
2020-01-25 NaN
2020-01-26 NaN
2020-01-27 NaN
2020-01-28 NaN
2020-01-29 1034.107143
2020-01-30 1350.714286
2020-01-31 1503.250000
2020-02-01 1710.071429
2020-02-02 2518.607143
2020-02-03 2769.714286
2020-02-04 3166.750000
2020-02-05 3448.714286
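Solution
You cannot compute a 7-day moving average for the first 6 days of the sample, because fewer than 7 observations are available at that point. That is why the first 6 rows of the output are NaN.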
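If partial-window values are acceptable for those first rows, one possible workaround is to pass min_periods=1 to rolling and trim the weights to the length of each window. The sketch below is only an illustration under that assumption: wma_partial and the example series are hypothetical names that do not appear in the question, and the linear weights 1..wd_size mirror what calc_wma ends up using.

```python
import numpy as np
import pandas as pd

def wma_partial(series, wd_size):
    """Linearly weighted moving average that also fills the first wd_size - 1 rows.

    Hypothetical helper for illustration; not part of the original question.
    """
    weights = np.arange(1, wd_size + 1)

    def weighted_mean(window):
        # Early windows are shorter than wd_size, so keep only the most recent weights
        w = weights[-len(window):]
        return np.dot(window, w) / w.sum()

    # min_periods=1 lets pandas evaluate windows that are not yet full
    return series.rolling(wd_size, min_periods=1).apply(weighted_mean, raw=True)

# Small made-up series, just to show that no NaN rows remain
example = pd.Series([1.0, 2.0, 3.0, 4.0])
result = wma_partial(example, 3)
print(result)
```

Note that the early values are averages over fewer than wd_size observations, so they are not directly comparable to the full-window values. Whether that trade-off is appropriate depends on the analysis.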