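Problem description
I have a Pandas DataFrame of customer transactions, shown below, and I want to create a column named "Label" that takes one of 2 values:
- New transaction performed before the end date of the previous transaction
- New transaction performed after the end date of the previous transaction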
Input
Transaction ID Transaction Start Date Transaction End Date
1 23-jun-2014 15-Jul-2014
2 14-jul-2014 8-Aug-2014
3 13-Aug-2014 22-Aug-2014
4 21-Aug-2014 28-Aug-2014
5 29-Aug-2014 05-Sep-2014
6 06-Sep-2014 15-Sep-2014
Expected output
Transaction ID Transaction Start Date Transaction End Date Label
1 23-jun-2014 15-Jul-2014
2 14-jul-2014 8-Aug-2014 New Transaction performed before end date of previous transaction
3 13-Aug-2014 22-Aug-2014 New Transaction after the end date of previous transaction.
4 21-Aug-2014 28-Aug-2014 New Transaction performed before the end date of previous transaction.
5 29-Aug-2014 05-Sep-2014 New Transaction after the end date of previous transaction.
6 06-Sep-2014 15-Sep-2014 New Transaction after the end date of previous transaction.
Solution
Use numpy.where and Series.shift:
import numpy as np
df['Label'] = np.where(df['Transaction Start Date'].lt(df['Transaction End Date'].shift()),'New Transaction performed before end date of previous transaction','New Transaction after the end date of previous transaction.')
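To see why the comparison works, it may help to look at what Series.shift produces on its own. The following is a minimal sketch using three of the end dates from the question; the variable names end and prev_end and the explicit format string are assumptions made for illustration.

```python
import pandas as pd

# Three end dates taken from the sample data in the question.
end = pd.to_datetime(
    pd.Series(["15-Jul-2014", "8-Aug-2014", "22-Aug-2014"]),
    format="%d-%b-%Y",
)

# shift() moves every value down one row, so each row sees the
# end date of the previous transaction. The first row has no
# predecessor and receives NaT.
prev_end = end.shift()

print(pd.DataFrame({"end": end, "prev_end": prev_end}))
```

Because any comparison against NaT evaluates to False, the first row falls into the "after" branch of numpy.where unless its label is overwritten separately.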
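If the date columns are stored as strings, first convert them with to_datetime so that the comparison is chronological rather than lexicographic. Then use numpy.where with Series.lt to test whether each start date is less than the end date shifted down one row by Series.shift, and finally set the first value to an empty string: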
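import pandas as pd
# Convert both date columns to datetime64 so they compare chronologically.
df['Transaction End Date'] = pd.to_datetime(df['Transaction End Date'])
df['Transaction Start Date'] = pd.to_datetime(df['Transaction Start Date'])
# True where a transaction starts before the previous transaction has ended.
df['Label'] = np.where(df['Transaction Start Date'].lt(df['Transaction End Date'].shift()),'New Transaction performed before end date of previous transaction','New Transaction after the end date of previous transaction.')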
df.loc[0,'Label'] = ''
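Alternative solution:
m = df['Transaction Start Date'].lt(df['Transaction End Date'].shift())
# Drop the label computed for the first row and prepend an empty string instead.
df['Label'] = [''] + np.where(m,'New Transaction performed before end date of previous transaction','New Transaction after the end date of previous transaction.')[1:].tolist()
print(df)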
Transaction ID Transaction Start Date Transaction End Date \
0 1 2014-06-23 2014-07-15
1 2 2014-07-14 2014-08-08
2 3 2014-08-13 2014-08-22
3 4 2014-08-21 2014-08-28
4 5 2014-08-29 2014-09-05
5 6 2014-09-06 2014-09-15
Label
0
1 New Transaction performed before end date of p...
2 New Transaction after the end date of previous...
3 New Transaction performed before end date of p...
4 New Transaction after the end date of previous...
5 New Transaction after the end date of previous...
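For readers who want to run the whole thing end to end, here is a self-contained sketch that rebuilds the sample data and applies the same numpy.where and Series.shift approach. The function name label_transactions, the constants BEFORE and AFTER, and the explicit format string "%d-%b-%Y" are assumptions introduced for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Label texts as used in the answer above.
BEFORE = "New Transaction performed before end date of previous transaction"
AFTER = "New Transaction after the end date of previous transaction."


def label_transactions(df):
    """Return a copy of df with a 'Label' column (hypothetical helper)."""
    out = df.copy()
    # Parse into local Series so the input columns are left unchanged.
    start = pd.to_datetime(out["Transaction Start Date"], format="%d-%b-%Y")
    end = pd.to_datetime(out["Transaction End Date"], format="%d-%b-%Y")
    # True where this transaction starts before the previous one ended.
    overlaps = start.lt(end.shift())
    out["Label"] = np.where(overlaps, BEFORE, AFTER)
    # The first row has no previous transaction, so blank its label.
    out.iloc[0, out.columns.get_loc("Label")] = ""
    return out


df = pd.DataFrame({
    "Transaction ID": [1, 2, 3, 4, 5, 6],
    "Transaction Start Date": ["23-jun-2014", "14-jul-2014", "13-Aug-2014",
                               "21-Aug-2014", "29-Aug-2014", "06-Sep-2014"],
    "Transaction End Date": ["15-Jul-2014", "8-Aug-2014", "22-Aug-2014",
                             "28-Aug-2014", "05-Sep-2014", "15-Sep-2014"],
})

labeled = label_transactions(df)
print(labeled["Label"].tolist())
```

Working on a copy and using positional indexing for the first row keeps the helper independent of whatever index the input frame happens to carry, whereas df.loc[0, 'Label'] assumes the first row is labelled 0.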