Resampling data with forward fill at fixed intervals in Python

Problem description

I want to resample the column data at 10-second intervals, filling each interval from the previous value (i.e., a forward fill, ffill).

The DataFrame df looks like this:

        Timestamp               data
850812  2011-01-26 17:53:39.250 28.5
394354  2011-01-26 17:53:42.250 NaN
554123  2011-01-26 17:54:09.400 NaN
1187196 2011-01-26 17:54:19.400 NaN
1067598 2011-01-26 17:54:21.400 NaN
463998  2011-01-26 17:55:34.030 29.5
231116  2011-01-26 17:56:26.030 30.5
567915  2011-01-26 17:56:35.030 30.5
839526  2011-01-26 17:56:37.030 30.5
174655  2011-01-26 17:56:41.590 29.0

Reproducible example:

import pandas as pd
from pandas import Timestamp
from numpy import nan

df = pd.DataFrame({'Timestamp': {850812: Timestamp('2011-01-26 17:53:39.250000'),394354: Timestamp('2011-01-26 17:53:42.250000'),554123: Timestamp('2011-01-26 17:54:09.400000'),1187196: Timestamp('2011-01-26 17:54:19.400000'),1067598: Timestamp('2011-01-26 17:54:21.400000'),463998: Timestamp('2011-01-26 17:55:34.030000'),231116: Timestamp('2011-01-26 17:56:26.030000'),567915: Timestamp('2011-01-26 17:56:35.030000'),839526: Timestamp('2011-01-26 17:56:37.030000'),174655: Timestamp('2011-01-26 17:56:41.590000')},'data': {850812: 28.5,394354: nan,554123: nan,1187196: nan,1067598: nan,463998: 29.5,231116: 30.5,567915: 30.5,839526: 30.5,174655: 29.0}}
)

I tried:

df1 = (df.set_index('Timestamp')['data']
                .resample('10S')
                .last()
                .ffill()
                .reset_index())
df1

This returns:

    Timestamp           data
0   2011-01-26 17:53:30 28.5
1   2011-01-26 17:53:40 28.5
2   2011-01-26 17:53:50 28.5
3   2011-01-26 17:54:00 28.5
4   2011-01-26 17:54:10 28.5
5   2011-01-26 17:54:20 28.5
6   2011-01-26 17:54:30 28.5
7   2011-01-26 17:54:40 28.5
8   2011-01-26 17:54:50 28.5
9   2011-01-26 17:55:00 28.5
10  2011-01-26 17:55:10 28.5
11  2011-01-26 17:55:20 28.5
12  2011-01-26 17:55:30 29.5  # Should be 28.5
13  2011-01-26 17:55:40 29.5
14  2011-01-26 17:55:50 29.5
15  2011-01-26 17:56:00 29.5
16  2011-01-26 17:56:10 29.5
17  2011-01-26 17:56:20 30.5  # Should be 29.5
18  2011-01-26 17:56:30 30.5
19  2011-01-26 17:56:40 29.0  # Should be 30.5

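In the comments to the right of the table, I have marked the boundary values that should be different. When resampling, I want each row to carry forward the last known value, not the next nearest one. Why is the next nearest value being taken?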
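
A likely explanation: by default, resampling at this frequency groups rows into left-closed bins such as [17:55:30, 17:55:40) and labels each bin with its left edge. The reading at 17:55:34.030 is therefore reported under 17:55:30, even though it was taken after that instant. Below is a minimal sketch of two possible workarounds, assuming the goal is the last value known at or before each 10-second grid point. The names s, grid, asof and binned are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same observations as in the question, rebuilt as a Series indexed by time.
timestamps = pd.to_datetime([
    "2011-01-26 17:53:39.250", "2011-01-26 17:53:42.250",
    "2011-01-26 17:54:09.400", "2011-01-26 17:54:19.400",
    "2011-01-26 17:54:21.400", "2011-01-26 17:55:34.030",
    "2011-01-26 17:56:26.030", "2011-01-26 17:56:35.030",
    "2011-01-26 17:56:37.030", "2011-01-26 17:56:41.590",
])
values = [28.5, np.nan, np.nan, np.nan, np.nan, 29.5, 30.5, 30.5, 30.5, 29.0]
s = pd.Series(values, index=timestamps, name="data")

# Option A: forward-fill the raw rows, then look up the last row at or
# before each point of a regular 10-second grid.
grid = pd.date_range(s.index.min().floor("10s"),
                     s.index.max().floor("10s"),
                     freq="10s")
asof = s.ffill().reindex(grid, method="ffill")

# Option B: keep resample(), but close and label each bin on its right edge,
# so a label only ever sees readings taken at or before it.
binned = s.resample("10s", closed="right", label="right").last().ffill()

# Back to a two-column frame like df1 in the question.
df1 = asof.rename_axis("Timestamp").reset_index()
print(df1)
```

With either option, the rows at 17:55:30, 17:56:20 and 17:56:40 come out as 28.5, 29.5 and 30.5. In option A the first grid point (17:53:30) precedes every observation and stays NaN. Option B starts at 17:53:40 instead and ends with an extra 17:56:50 row.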
