Problem description
Example:
dates = [
    ('2021-03-01', '2021-03-31'),
    ('2021-04-01', '2021-05-15'),  # overlaps
    ('2021-07-01', '2021-11-31'),
    ('2021-01-01', '2021-02-28'),
    ('2021-05-01', '2021-05-31'),  # overlaps
]
Expected result:
overlapped_dates = [
    ('2021-04-01', '2021-05-15'),
    ('2021-05-01', '2021-05-31'),
]
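Solution
November 31 is not a real date, so assume it is corrected to a valid one such as November 30. With that fix, Pandas can do this: sort the ranges by start date, then keep every row whose start date is earlier than the previous row's end date, or whose end date is later than the next row's start date.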
import pandas as pd

dates = [
    ('2021-03-01', '2021-03-31'),
    ('2021-04-01', '2021-05-15'),
    ('2021-07-01', '2021-11-30'),  # corrected from 2021-11-31
    ('2021-01-01', '2021-02-28'),
    ('2021-05-01', '2021-05-31'),
]
df = pd.DataFrame(dates, columns=['start', 'end'])
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df = df.sort_values(by='start')
# Keep rows that start before the previous row ends, or end after the next row starts
df.loc[(df['start'].lt(df['end'].shift())) | (df['end'].gt(df['start'].shift(-1)))].astype(str).values
Output
array([['2021-04-01', '2021-05-15'],
       ['2021-05-01', '2021-05-31']], dtype=object)
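One caveat: the check above only compares each row with its immediate neighbours, so a short range nested inside a much earlier, long range can slip through. The sketch below is one possible variation, not part of the original answer. It compares each start date with the latest end date seen so far (via `cummax`) instead of just the previous row's end. The function name `find_overlaps` and the nested sample data are hypothetical, chosen only for illustration.

```python
import pandas as pd


def find_overlaps(dates):
    """Return the (start, end) pairs that overlap at least one other pair."""
    df = pd.DataFrame(dates, columns=['start', 'end'])
    df['start'] = pd.to_datetime(df['start'])
    df['end'] = pd.to_datetime(df['end'])
    df = df.sort_values(by='start')

    # Latest end date among all earlier rows (NaT for the first row).
    prev_max_end = df['end'].cummax().shift()
    # Rows are sorted by start, so the next row holds the earliest later start.
    next_start = df['start'].shift(-1)

    # Comparisons against NaT evaluate to False, so edge rows are handled.
    mask = df['start'].lt(prev_max_end) | df['end'].gt(next_start)
    hits = df.loc[mask]
    return list(zip(hits['start'].dt.strftime('%Y-%m-%d'),
                    hits['end'].dt.strftime('%Y-%m-%d')))


# Hypothetical data: two short ranges nested inside one long range.
nested = [
    ('2021-01-01', '2021-12-31'),
    ('2021-02-01', '2021-02-10'),
    ('2021-03-01', '2021-03-10'),
]
print(find_overlaps(nested))
```

On the nested sample, all three ranges are reported, whereas the neighbour-only check would miss the March range. Like the original, the comparison is strict, so ranges that merely share an endpoint are not counted as overlapping.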