I have a DataFrame like this:
    Variable   Date     Value
0  Variable1  Date1  Valeur 1
1  Variable1  Date2  Valeur 2
2  Variable1  Date3  Valeur 3
3  Variable2  Date4  Valeur 4
4  Variable2  Date5  Valeur 5
I would like to reshape it like this:
    Date Variable1 Variable2
0  Date1  Valeur 1      None
1  Date2  Valeur 2      None
2  Date3  Valeur 3      None
3  Date4      None  Valeur 4
4  Date5      None  Valeur 5
How can I do this transformation in Python using pandas or numpy?
Thanks for your help.
Solution:
I think you need pivot, followed by rename_axis (new in pandas 0.18.0) and reset_index:
print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())
    Date Variable1 Variable2
0  Date1  Valeur 1      None
1  Date2  Valeur 2      None
2  Date3  Valeur 3      None
3  Date4      None  Valeur 4
4  Date5      None  Valeur 5
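To make the role of each call clearer, here is a minimal sketch that splits the chain into separate steps, using the frame from the question. The intermediate names wide, flat and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

# The frame from the question, rebuilt by hand.
df = pd.DataFrame({
    'Variable': ['Variable1', 'Variable1', 'Variable1', 'Variable2', 'Variable2'],
    'Date': ['Date1', 'Date2', 'Date3', 'Date4', 'Date5'],
    'Value': ['Valeur 1', 'Valeur 2', 'Valeur 3', 'Valeur 4', 'Valeur 5'],
})

# Step 1: pivot turns each distinct 'Variable' into its own column.
# 'Date' becomes the index and the columns axis is named 'Variable'.
wide = df.pivot(index='Date', columns='Variable', values='Value')

# Step 2: rename_axis(None, axis=1) drops the leftover 'Variable' label
# from the columns axis.
flat = wide.rename_axis(None, axis=1)

# Step 3: reset_index moves 'Date' back into a regular column.
result = flat.reset_index()

print(result)
```

Cells with no matching (Date, Variable) pair are filled with a missing value, which is why each row has one empty column.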
Sample:
import pandas as pd

# Build a small long-format frame: one row per (Variable, Date) pair.
df = pd.DataFrame({'Variable': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b'},
                   'Date': {0: pd.Timestamp('2016-02-05 00:00:00'),
                            1: pd.Timestamp('2016-02-06 00:00:00'),
                            2: pd.Timestamp('2016-02-07 00:00:00'),
                            3: pd.Timestamp('2016-02-08 00:00:00'),
                            4: pd.Timestamp('2016-02-09 00:00:00')},
                   'Value': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}},
                  columns=['Variable', 'Date', 'Value'])
print(df)
  Variable       Date  Value
0        a 2016-02-05      1
1        a 2016-02-06      2
2        a 2016-02-07      3
3        b 2016-02-08      4
4        b 2016-02-09      5
print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())
        Date    a    b
0 2016-02-05  1.0  NaN
1 2016-02-06  2.0  NaN
2 2016-02-07  3.0  NaN
3 2016-02-08  NaN  4.0
4 2016-02-09  NaN  5.0
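One caveat that may be worth knowing: pivot raises a ValueError when the same (Date, Variable) pair appears more than once. A possible workaround, sketched below on made-up data, is pivot_table with an aggregation function. The frame dup and the choice of 'mean' are assumptions for illustration only; the question's data has no duplicates, so the plain pivot above is enough there.

```python
import pandas as pd

# Hypothetical data: 'a' appears twice on the same date.
dup = pd.DataFrame({
    'Variable': ['a', 'a', 'b'],
    'Date': ['2016-02-05', '2016-02-05', '2016-02-06'],
    'Value': [1, 3, 5],
})

# pivot cannot decide which of the duplicate values to keep.
try:
    dup.pivot(index='Date', columns='Variable', values='Value')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table resolves duplicates by aggregating them (here: the mean).
wide = (dup.pivot_table(index='Date', columns='Variable',
                        values='Value', aggfunc='mean')
           .rename_axis(None, axis=1)
           .reset_index())

print(pivot_failed)
print(wide)
```

Which aggregation is appropriate ('mean', 'first', 'sum', and so on) depends on what the duplicated rows mean in your data.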