Assigning a time frame based on a datetime with pandas

Problem description

I need to look up the time frame for each input time from a master table.

   cust_id            starttime
0        1  2000-01-01 09:00:03
1        2  2000-01-01 18:01:03

The output I need is:

   cust_id            starttime timeframe
0        1  2000-01-01 09:00:03   morning
1        2  2000-01-01 18:01:03   evening

Code to create the master time frame table:

mastdf = {'timeframe':['morning','latemorning','midnoon','evening'],'start_time':['8:00:00','11:00:00','13:00:00','17:00:00'],'end_time':['10:59:59','13:59:59','16:59:59','7:59:59']}

Code to create the input data frame:

inputdf = {'cust_id':[1,2],'starttime':['2000-01-01 09:00:03','2000-01-01 18:01:03']}

Solution

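Use cut for binning. First convert the start_time values to timedeltas with to_timedelta and append 24 hours as the closing edge to build the bins. Times between 00:00:00 and 8:00:00 fall outside every bin, so fill them with fillna, using the last value of the timeframe column: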

import pandas as pd

mastdf={'timeframe':['morning','latemorning','midnoon','evening'],'start_time':['8:00:00','11:00:00','13:00:00','17:00:00'],'end_time':['10:59:59','13:59:59','16:59:59','7:59:59']}
mastdf = pd.DataFrame(mastdf)
print (mastdf)
     timeframe start_time  end_time
0      morning    8:00:00  10:59:59
1  latemorning   11:00:00  13:59:59
2      midnoon   13:00:00  16:59:59
3      evening   17:00:00   7:59:59

inputdf={'cust_id':[1,2],'starttime':['2000-01-01 09:00:03','2000-01-01 18:01:03']}
inputdf = pd.DataFrame(inputdf)
inputdf['starttime'] = pd.to_datetime(inputdf['starttime'])

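# bin edges: each start_time as a timedelta, plus 24 hours as the closing edge
start =  pd.to_timedelta(mastdf['start_time']).tolist() + [pd.Timedelta(24,unit='h')]
# time of day of each input row, as a timedelta
s = pd.to_timedelta(inputdf['starttime'].dt.strftime('%H:%M:%S'))
# times before the first edge get no bin (NaN), so they take the last timeframe
last = mastdf['timeframe'].iat[-1]
inputdf['timeframe'] = pd.cut(s,bins=start,labels=mastdf['timeframe'],right=False).fillna(last)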
print (inputdf)
   cust_id           starttime timeframe
0        1 2000-01-01 09:00:03   morning
1        2 2000-01-01 18:01:03   evening
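
The sample input above never hits the early-morning case, so the fillna step is not exercised. Below is a minimal sketch that wraps the same approach in a function and runs it on a few extra times, including one before 8:00:00. The function name assign_timeframe and the demo timestamps are made up for illustration, and the time of day is computed by subtracting each timestamp's midnight (dt.normalize) rather than going through strftime; that is a swapped-in alternative, not what the answer above uses.

```python
import pandas as pd


def assign_timeframe(times, master):
    """Label each timestamp with the timeframe whose start_time bin contains it.

    Hypothetical helper: `times` is a datetime64 Series, `master` a DataFrame
    with 'timeframe' and 'start_time' columns sorted by start_time.
    """
    # Bin edges: the start times as timedeltas, plus a closing edge at 24 hours.
    edges = pd.to_timedelta(master["start_time"]).tolist() + [pd.Timedelta(days=1)]
    # Time of day as a timedelta: each timestamp minus its own midnight.
    time_of_day = times - times.dt.normalize()
    # Left-closed bins, so a time equal to a start_time belongs to that timeframe.
    labels = pd.cut(
        time_of_day,
        bins=edges,
        labels=master["timeframe"].tolist(),
        right=False,
    )
    # Times earlier than the first edge fall outside every bin (NaN); they
    # belong to the last timeframe, which wraps past midnight.
    return labels.fillna(master["timeframe"].iat[-1])


master = pd.DataFrame({
    "timeframe": ["morning", "latemorning", "midnoon", "evening"],
    "start_time": ["8:00:00", "11:00:00", "13:00:00", "17:00:00"],
})

# Made-up demo timestamps, chosen to cover every bin and the wrap-around case.
demo = pd.DataFrame({
    "starttime": pd.to_datetime([
        "2000-01-01 03:15:00",
        "2000-01-01 08:00:00",
        "2000-01-01 09:00:03",
        "2000-01-01 12:00:00",
        "2000-01-01 13:30:00",
        "2000-01-01 18:01:03",
    ])
})
demo["timeframe"] = assign_timeframe(demo["starttime"], master)
result = demo["timeframe"].astype(str).tolist()
print(demo)
```

The 03:15:00 row is the interesting one: cut leaves it unlabelled, and fillna assigns it to evening. Note that only start_time is used to build the bins, so the end_time column of the master table plays no part in the result.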