How do I create a DataFrame of time intervals from the values in a time series?

Problem description

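I have a CSV file containing a long list of blood glucose (BG) values and their associated timestamps. I am trying to use the BG values to build intervals of time during which the glucose is below a threshold. The data looks like this: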

      Timestamp            glucose
0     2020-02-24 17:45:23  4.7
1     2020-02-24 17:50:23  4.9
2     2020-02-24 17:55:22  4.9
3     2020-02-24 18:00:22  4.8
4     2020-02-24 18:05:21  4.7
...   ...                  ...
2348  2020-03-03 19:25:38  4.8
2349  2020-03-03 19:30:38  4.7
2350  2020-03-03 19:35:38  4.7
2351  2020-03-03 19:40:38  4.5
2352  2020-03-03 19:45:38  4.2

2353 rows × 2 columns

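I then use the code below to try to generate the intervals. However, it only gives me intervals that are 5 minutes long (the length of a single reading). I think this is because I use index+1 to close my current_interval, whereas what I need is a loop that runs from index+1 to index+len(time_series), but I don't know how to write that. Any help is appreciated. The code: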

from collections import namedtuple

import pandas as pd

THRESHOLD = 3.5

IntervalRow = namedtuple(
    'IntervalRow', ['start_time', 'start_bg', 'end_time', 'end_bg', 'lowest_bg']
)

def is_hypo(value):
    return value < THRESHOLD

def calculate_hypo_intervals(time_series):
    intervals = []
    current_interval = None

    for index in range(len(time_series)):
        if is_hypo(time_series['glucose'][index]):
            # Open a new interval at this hypo reading if none is open.
            if not current_interval:
                current_interval = IntervalRow(
                    start_time=time_series['Timestamp'][index],
                    start_bg=time_series['glucose'][index],
                    end_time=None,
                    end_bg=None,
                    lowest_bg=time_series['glucose'][index],
                )

            # Update the lowest value if the next reading is lower.
            if index+1 < len(time_series) and current_interval.lowest_bg > time_series['glucose'][index+1]:
                current_interval = current_interval._replace(
                    lowest_bg=time_series['glucose'][index+1],
                )

            # Close the interval if the next reading is no longer hypo.
            if index+1 < len(time_series) and not is_hypo(time_series['glucose'][index+1]):
                intervals.append(
                    current_interval._replace(
                        end_time=time_series['Timestamp'][index+1],
                        end_bg=time_series['glucose'][index+1],
                    )
                )

            # I appreciate this bit is probably not very elegant; it is only there for
            # the final data point. Suggestions for merging it with the if block above
            # are welcome. I separated it because, with the earlier version that read
            # "if index == len(time_series) and not is_hypo", either every interval had
            # to end with a value that is still hypo or I got an IndexError.
            if index+1 == len(time_series):
                intervals.append(
                    current_interval._replace(
                        end_time=time_series['Timestamp'][index],
                        end_bg=time_series['glucose'][index],
                    )
                )

            # This reset runs after every hypo reading, whether or not the interval
            # was closed, so each interval ends up covering a single reading.
            current_interval = None

    df2 = pd.DataFrame(intervals, columns=['Start Time', 'Start BG', 'End Time', 'End BG', 'Lowest BG'])

    return df2

This gives me the output below, but it does not include (for example) BG values earlier than the first interval:


    Start Time           Start BG  End Time             End BG  Lowest BG
0   2020-02-25 10:10:23  3.1       2020-02-25 10:15:24  3.6     3.1
1   2020-02-25 11:05:23  3.4       2020-02-25 11:10:23  3.7     3.4
2   2020-02-25 14:35:25  3.1       2020-02-25 14:40:25  3.5     3.1
3   2020-02-25 18:25:26  3.3       2020-02-25 18:30:26  3.9     3.3
4   2020-02-27 09:45:20  3.4       2020-02-27 09:50:20  3.6     3.4
5   2020-02-27 12:50:19  3.4       2020-02-27 12:55:19  3.6     3.4
6   2020-02-27 17:35:20  3.4       2020-02-27 17:40:19  3.6     3.4
7   2020-02-28 10:05:22  3.4       2020-02-28 10:10:22  3.5     3.4
8   2020-02-28 18:35:23  3.4       2020-02-28 18:40:24  3.6     3.4
9   2020-02-29 11:15:26  3.4       2020-02-29 11:20:26  3.5     3.4
10  2020-02-29 16:15:27  3.4       2020-02-29 16:20:27  3.5     3.4
11  2020-02-29 21:10:28  3.4       2020-02-29 21:15:27  3.5     3.4
12  2020-03-01 13:55:31  3.4       2020-03-01 14:00:30  3.6     3.4
13  2020-03-01 17:45:29  3.4       2020-03-01 17:50:31  3.5     3.4
14  2020-03-02 12:45:34  3.3       2020-03-02 12:50:34  3.6     3.3
15  2020-03-02 16:30:34  3.4       2020-03-02 16:35:34  3.5     3.4
16  2020-03-03 17:50:38  3.4       2020-03-03 17:55:38  3.5     3.4
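
One possible way to get intervals that span a whole run of low readings is to keep the interval open until the first reading at or above the threshold, instead of resetting it on every pass through the loop. The following is only a minimal sketch under that assumption, not a confirmed answer to the question: the function name find_hypo_intervals, the helper variables start, lowest and last, and the sample readings are all made up for illustration. It is written against plain sequences so that it runs without pandas.

```python
from collections import namedtuple

THRESHOLD = 3.5  # same cutoff as in the question

IntervalRow = namedtuple(
    "IntervalRow", ["start_time", "start_bg", "end_time", "end_bg", "lowest_bg"]
)


def find_hypo_intervals(timestamps, values, threshold=THRESHOLD):
    """Return one IntervalRow per run of consecutive readings below threshold.

    Hypothetical helper: an interval starts at the first low reading and ends
    at the first reading that is back at or above the threshold.
    """
    intervals = []
    start = None   # (timestamp, value) of the first low reading in the open run
    lowest = None  # lowest value seen so far in the open run
    last = None    # most recent reading, used if the data ends mid-run

    for ts, bg in zip(timestamps, values):
        if bg < threshold:
            if start is None:
                # First low reading: open a new run.
                start, lowest = (ts, bg), bg
            else:
                # Still low: only the running minimum changes.
                lowest = min(lowest, bg)
        elif start is not None:
            # First reading back at or above the threshold closes the run.
            intervals.append(IntervalRow(start[0], start[1], ts, bg, lowest))
            start = None
        last = (ts, bg)

    # The data ended while still low: close the run at the final reading.
    if start is not None:
        intervals.append(IntervalRow(start[0], start[1], last[0], last[1], lowest))
    return intervals


# Made-up readings, five minutes apart, purely for illustration.
times = ["10:00", "10:05", "10:10", "10:15", "10:20", "10:25", "10:30"]
bgs = [4.0, 3.4, 3.1, 3.3, 3.6, 4.0, 3.2]

for row in find_hypo_intervals(times, bgs):
    print(row)
```

With the data from the question, the two columns could be passed in as time_series['Timestamp'] and time_series['glucose'], and the returned list wrapped in pd.DataFrame(...) to get one row per episode, with the namedtuple field names as column names.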

