Problem description
I am trying to build a DataFrame with multi-index columns.
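import pandas as pd
# 8 rows per column; the rows match the output shown in the answer below
data = pd.DataFrame({"name":["a","a","b","b","b","b","c","c"],"month":[1,1,2,3,1,1,1,3],"buy_sell":["sell","buy","sell","buy","buy","sell","sell","buy"],"value":[10,20,30,40,80,20,50,60]})
data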
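I want to convert it to wide format. The index should be name, and for the columns I want to use month and buy_sell. Alternatively, if multi-index columns are not an option, I would like to pivot the DataFrame so that the columns get descriptive names such as sell_1, buy_1, sell_2, buy_2, and so on.
Any help would be greatly appreciated. Thanks!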
Solution
You can use set_index() and unstack():
(data.set_index(['name','month','buy_sell'])['value']
.unstack(['month','buy_sell']))
Output:
month        1           2     3
buy_sell  sell   buy  sell   buy
name
a         10.0  20.0   NaN   NaN
b         20.0  80.0  30.0  40.0
c         50.0   NaN   NaN  60.0
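The same reshape can probably also be written with DataFrame.pivot, which accepts a list for columns in pandas 1.1 and later. This is a minimal sketch rather than part of the original answer; the eight rows of data are an assumption, reconstructed from the output shown above.

```python
import pandas as pd

# Assumed input: eight rows reconstructed from the output table above.
data = pd.DataFrame({
    "name":     ["a", "a", "b", "b", "b", "b", "c", "c"],
    "month":    [1, 1, 2, 3, 1, 1, 1, 3],
    "buy_sell": ["sell", "buy", "sell", "buy", "buy", "sell", "sell", "buy"],
    "value":    [10, 20, 30, 40, 80, 20, 50, 60],
})

# pivot() needs each (name, month, buy_sell) combination to be unique,
# which holds for this data; duplicates would raise a ValueError.
wide = data.pivot(index="name", columns=["month", "buy_sell"], values="value")
print(wide)
```

As with the multi-level unstack(), only the (month, buy_sell) combinations that actually occur in the data become columns.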
If you insist on also keeping the columns that are entirely NaN, you can unstack one level at a time:
(data.set_index(['name','month','buy_sell'])['value']
.unstack('month').unstack('buy_sell')
)
Output:
month        1           2           3
buy_sell   buy  sell   buy  sell   buy  sell
name
a         20.0  10.0   NaN   NaN   NaN   NaN
b         80.0  20.0   NaN  30.0  40.0   NaN
c          NaN  50.0   NaN   NaN  60.0   NaN
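For the alternative mentioned in the question (single-level columns named sell_1, buy_1, and so on), one option is to join the two column levels after unstacking. This is a minimal sketch, not part of the original answer; the input rows are an assumption reconstructed from the outputs above, and the variable names wide and flat are only illustrative.

```python
import pandas as pd

# Assumed input: eight rows reconstructed from the output tables above.
data = pd.DataFrame({
    "name":     ["a", "a", "b", "b", "b", "b", "c", "c"],
    "month":    [1, 1, 2, 3, 1, 1, 1, 3],
    "buy_sell": ["sell", "buy", "sell", "buy", "buy", "sell", "sell", "buy"],
    "value":    [10, 20, 30, 40, 80, 20, 50, 60],
})

wide = (data.set_index(["name", "month", "buy_sell"])["value"]
            .unstack(["month", "buy_sell"]))

# Each column label is a (month, buy_sell) tuple; join it into one string.
flat = wide.copy()
flat.columns = [f"{side}_{month}" for month, side in wide.columns]
print(flat)
```

The same list comprehension should work on the one-level-at-a-time result, in which case the all-NaN columns such as buy_2 are kept as well.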