Converting an hourly DatetimeIndex to integers

Problem

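I have a DataFrame that I grouped with groupby. To do that, I had to use a DatetimeIndex. However, I would like to convert my DatetimeIndex to integers so that I can use it as the index of a dynamic optimization model. I can convert the datetime index to floats, but not to an integer number of hours.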


# My data look like this:

[                           Date  Hour  MktDemand   HOEP  hour  
Datetime                                                        
2019-01-01 01:00:00  2019-01-01     1      16231   0.00     0   
2019-01-01 02:00:00  2019-01-01     2      16051   0.00     1   
2019-01-01 03:00:00  2019-01-01     3      15805  -0.11     2   
2019-01-01 04:00:00  2019-01-01     4      15580  -1.84     3   
2019-01-01 05:00:00  2019-01-01     5      15609  -0.47     4   
...



import datetime as dt
import pandas as pd

df['Datetime'] = pd.to_datetime(df.Date) + pd.to_timedelta(df.Hour,unit='h')
df['datetime'] = pd.to_datetime(df.Date) + pd.to_timedelta(df.Hour,unit='h')
grouped = df.set_index('Datetime').groupby(pd.Grouper(freq="15d"))


for name,group in grouped:
    print(pd.to_numeric(group.index,downcast='integer'))

# It returns this:
Int64Index([1546304400000000000,1546308000000000000,1546311600000000000,1546315200000000000,1546318800000000000,1546322400000000000,1546326000000000000,1546329600000000000,1546333200000000000,1546336800000000000,...

# However,I would like to have integers in this format:

20190523
20190524

# I tried this but it doesn't work:
for name,group in grouped:
    print(pd.to_timedelta(group.index).dt.total_hours().astype(int))


ERROR: dtype datetime64[ns] cannot be converted to timedelta64[ns]

Solution

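The integers you expect are a formatted representation of the date; they are not the actual numeric representation of a datetime. The numeric representation is what pd.to_numeric gives you: nanoseconds since 1970-01-01 UTC.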
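To make the distinction concrete, here is a minimal sketch. The five sample timestamps are an assumption, rebuilt from the first rows of the question's data. It shows the nanosecond value behind a timestamp (the first value matches the first entry in the question's to_numeric output) and, as an alternative that is not part of the original answer, one way to get an integer hour count by subtracting a reference timestamp first. That subtraction is also why the pd.to_timedelta attempt fails: a timestamp is a point in time, not a duration.

```python
import pandas as pd

# Assumed sample: the first five hourly stamps from the question's data.
idx = pd.to_datetime("2019-01-01") + pd.to_timedelta([1, 2, 3, 4, 5], unit="h")

# Timestamp.value is the count of nanoseconds since 1970-01-01 UTC.
first_ns = idx[0].value

# Subtracting a reference timestamp gives durations (a TimedeltaIndex);
# floor division by one hour then yields whole hours as integers.
elapsed_hours = (idx - idx[0]) // pd.Timedelta(hours=1)

print(first_ns)
print(list(elapsed_hours))
```

Whether the reference should be the first timestamp of the whole data set or of each group depends on how the optimization model indexes its time steps.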

So you need to format the index as strings and then convert those strings to integers.

For example:

import pandas as pd
# some synthetic example data...
dti = pd.date_range("2015","2016",freq='d')
df = pd.DataFrame({'some_value': [i for i in range(len(dti))]})
grouped = df.set_index(dti).groupby(pd.Grouper(freq="15d"))

for name,group in grouped:
    print(group.index.strftime('%Y%m%d').astype(int))
    
# gives you e.g.
Int64Index([20150101,20150102,20150103,20150104,20150105,20150106,20150107,20150108,20150109,20150110,20150111,20150112,20150113,20150114,20150115],dtype='int64')
...

You can also extend the format string passed to strftime with additional directives, such as hour or minute.
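A short sketch of that extension, assuming three hourly sample timestamps (not taken from the answer) and the standard %H directive for a zero-padded hour. It casts to int64 explicitly because longer formats, for example with minutes appended, no longer fit in a 32-bit integer.

```python
import pandas as pd

# Assumed sample: three consecutive hourly timestamps.
dti = pd.to_datetime("2019-01-01") + pd.to_timedelta([1, 2, 3], unit="h")

# %Y%m%d%H appends the zero-padded hour to the date digits.
with_hour = dti.strftime("%Y%m%d%H").astype("int64")

print(list(with_hour))
```

Note that integers built this way sort correctly but are not evenly spaced: the step from hour 23 of one day to hour 00 of the next is not 1.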