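pandas: give multiple DataFrames multi-index columns and full outer join them

Problem description

Some would say this should be two separate questions, but they are related, so I am asking both here.

1. Making multi-index columns

I have three dataframes: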

import pandas as pd

data_large = pd.DataFrame({"name":["a","b","c"],"sell":[10,60,50],"buy":[20,30,40]})
data_mini = pd.DataFrame({"name":["b","c","d"],"sell":[60,20,10],"buy":[30,50,40]})
data_topix = pd.DataFrame({"name":["a","b","c"],"sell":[10,80,0],"buy":[70,30,40]})

But first, I want to make their columns multi-indexed, like this:

[image: desired multi-index column layout]

This is what I tried, but it does not work as expected: name ends up under the Nikkei225Large column level.

iterables = [['Nikkei225Large'],['name','buy','sell']]
index_large = pd.MultiIndex.from_product(iterables,names=['product','sell_buy'])
data_large.columns = index_large


2. Joining multiple pandas DataFrames that have multi-index columns, e.g. using reduce

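Next, I want to do a full outer join of the three dataframes on the name column. The expected output is:

[image: expected output]

For now, I am just using reduce to join them as shown below, but I want to use multi-index columns.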

from functools import reduce
dfs = {0: data_large, 1: data_mini, 2: data_topix}

def agg_df(dfList):
    # Outer-join on the 'name' column; 'on' cannot be combined with left_index/right_index
    df_agged = reduce(lambda left, right: pd.merge(left, right, on='name', how='outer'), dfList)
    return df_agged

df_final = agg_df(dfs.values())

Any help would be appreciated!

Solution

IIUC, you can use pd.concat with the keys parameter:

df_out = pd.concat([dfi.set_index('name') for dfi in [data_large,data_mini,data_topix]],keys=['Nikkei225Large','Nikkei225Mini','Topix'],axis=1)\
           .rename_axis(index=['Name'],columns=['product','buy_sell'])

Output:

product  Nikkei225Large       Nikkei225Mini       Topix      
buy_sell           sell   buy          sell   buy  sell   buy
Name                                                         
a                  10.0  20.0           NaN   NaN  10.0  70.0
b                  60.0  30.0          60.0  30.0  80.0  30.0
c                  50.0  40.0          20.0  50.0   0.0  40.0
d                   NaN   NaN          10.0  40.0   NaN   NaN
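
If you would rather keep the reduce-based approach from the question, one possible sketch is to give each frame its product level first and then outer-join on the name index. The helper add_product_level is a hypothetical name introduced here for illustration, and the data_topix values are an assumption read back from the output table above.

```python
from functools import reduce

import pandas as pd

data_large = pd.DataFrame({"name": ["a", "b", "c"], "sell": [10, 60, 50], "buy": [20, 30, 40]})
data_mini = pd.DataFrame({"name": ["b", "c", "d"], "sell": [60, 20, 10], "buy": [30, 50, 40]})
# Assumed values, read back from the Topix columns of the output table.
data_topix = pd.DataFrame({"name": ["a", "b", "c"], "sell": [10, 80, 0], "buy": [70, 30, 40]})


def add_product_level(df, product):
    """Hypothetical helper: index by name, then put `product` above the remaining columns."""
    out = df.set_index("name")
    out.columns = pd.MultiIndex.from_product(
        [[product], out.columns], names=["product", "buy_sell"]
    )
    return out


frames = [
    add_product_level(data_large, "Nikkei225Large"),
    add_product_level(data_mini, "Nikkei225Mini"),
    add_product_level(data_topix, "Topix"),
]

# Full outer join on the shared "name" index, one frame at a time.
df_joined = reduce(lambda left, right: left.join(right, how="outer"), frames)

# Select one product block, or one field across all products.
topix_only = df_joined["Topix"]
all_sell = df_joined.xs("sell", axis=1, level="buy_sell")
```

Moving name into the index before building the MultiIndex is what keeps it from landing under the product level, which was the problem with the attempt in the question. The pd.concat version above is shorter; this variant is only worth it if the frames need per-product processing before the join.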
