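Problem description

I want to concatenate the multidimensional output of a NumPy computation back onto the input DataFrame. The output matches the shape of the input (same rows, and one column per selected input column). But it fails with: NotImplementedError: Can only union MultiIndex with MultiIndex or Index of tuples, try mi.to_flat_index().union(other) instead.

I do not want to flatten the indices first. Is there another way to get this to work?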

import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({
    ('metrik_0', Timestamp('2020-01-01 00:00:00')): {(1, 1): 2.5393693602911447, (1, 5): 4.316896324314225, (1, 6): 4.271001191238499, (1, 9): 2.8712588011247377, (1, 11): 4.0458495954752545},
    ('metrik_0', Timestamp('2020-01-01 01:00:00')): {(1, 1): 4.02779063729038, (1, 5): 3.3849606155101224, (1, 6): 4.284114856052976, (1, 9): 3.980919941298365, (1, 11): 5.042488191587525},
    ('metrik_0', Timestamp('2020-01-01 02:00:00')): {(1, 1): 2.374592085569529, (1, 5): 3.3405503781564487, (1, 6): 3.4049690284720366, (1, 9): 3.892686173978996, (1, 11): 2.1876998087043127},
})

def compute_return_columns_to_df(df,colums_to_process,axis=0):
    method = 'compute_result'
    renamed_base_levels = map(lambda x: f'{x}_{method}',colums_to_process.get_level_values(0).unique())
    renamed_columns = colums_to_process.set_levels(renamed_base_levels,level=0)

    #####
    # perform calculation in numpy here
    # for the sake of simplicity (and as the actual computation is irrelevant - it is omitted in this minimal example)
    result = df[colums_to_process].values
    #####
    
    result = pd.DataFrame(result,columns=renamed_columns)
    display(result)    
    return pd.concat([df,result],axis=1) # fails with: NotImplementedError: Can only union MultiIndex with MultiIndex or Index of tuples,try mi.to_flat_index().union(other) instead.

# I do not want to flatten the indices first - so is there another way to get it to work?

compute_return_columns_to_df(df[df.columns[0:3]].head(),df.columns[0:2])

Solution

The code fails because of these two lines:

result = df[colums_to_process].values
result = pd.DataFrame(result,columns=renamed_columns)

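Note that result has:

column names renamed to metrik_0_compute_result (fine so far),
but a row index that is the default single-level index of consecutive integers.

Then, when you concatenate df and result, pandas tries to align the two source DataFrames on the row index, but the indices are incompatible (df has a MultiIndex, whereas result has an "ordinary" index).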
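The mismatch can be seen directly by comparing the two row indices. The following is a minimal sketch on a made-up three-row frame (the data and the names rows and cols are illustrative assumptions, not the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: MultiIndex on rows and columns.
rows = pd.MultiIndex.from_tuples([(1, 1), (1, 5), (1, 6)])
cols = pd.MultiIndex.from_product(
    [["metrik_0"], pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00"])]
)
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=rows, columns=cols)

# Going through .values drops the row labels, so the new frame gets a RangeIndex.
result = pd.DataFrame(df.values, columns=cols)

print(type(df.index).__name__)        # MultiIndex
print(type(result.index).__name__)    # RangeIndex
print(df.index.equals(result.index))  # False
```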

Change this part of the code to:

result = df[colums_to_process]
result.columns = renamed_columns

This way result keeps the original index, and concat no longer raises an exception.
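If the NumPy computation has to stay, one option that should work is to pass the original row index explicitly when wrapping the array back into a DataFrame. This is a sketch under assumptions: the sample frame is made up, the "* 2" stands in for the real computation (which is assumed to preserve shape and row order), and the unused axis parameter is dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame.
rows = pd.MultiIndex.from_tuples([(1, 1), (1, 5), (1, 6)])
cols = pd.MultiIndex.from_product(
    [["metrik_0"], pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00"])]
)
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=rows, columns=cols)

def compute_return_columns_to_df(df, colums_to_process):
    method = "compute_result"
    # Rename level 0 of the selected columns: metrik_0 -> metrik_0_compute_result.
    renamed_columns = pd.MultiIndex.from_tuples(
        [(f"{top}_{method}", ts) for top, ts in colums_to_process]
    )
    # Placeholder NumPy computation (assumption: the real one keeps the shape).
    values = df[colums_to_process].to_numpy() * 2
    # Reattach the original row labels so concat can align on them.
    result = pd.DataFrame(values, index=df.index, columns=renamed_columns)
    return pd.concat([df, result], axis=1)

out = compute_return_columns_to_df(df, df.columns[0:2])
print(out.shape)  # (3, 4)
```

Because both frames now carry the same MultiIndex on the rows, no flattening or resetting of the index is needed.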

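One more remark: your function has an axis parameter that is never used. Consider removing it.

Another possible approach

Since result has a default (single-level) index, you can leave the earlier part of your code unchanged and reset the index of df before concatenating: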

return pd.concat([df.reset_index(drop=True),result],axis=1)

This way both DataFrames have the same index, so they can be concatenated as well.
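Keep in mind that reset_index(drop=True) discards the original row labels. The sketch below, on made-up data with illustrative names, shows the effect and one way to put the labels back afterwards, which is only valid if the row order has not changed:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame.
rows = pd.MultiIndex.from_tuples([(1, 1), (1, 5), (1, 6)])
stamps = pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00"])
cols = pd.MultiIndex.from_product([["metrik_0"], stamps])
renamed = pd.MultiIndex.from_product([["metrik_0_compute_result"], stamps])
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=rows, columns=cols)

# As in the question: wrapping the raw array yields a default RangeIndex.
result = pd.DataFrame(df.to_numpy(), columns=renamed)

combined = pd.concat([df.reset_index(drop=True), result], axis=1)
print(combined.index.equals(pd.RangeIndex(3)))  # True: the (1, 1), (1, 5), ... labels are gone

# Optional: restore the original row labels (assumes unchanged row order).
combined.index = df.index
```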

multi-index numpy pandas python

Pandas: concatenating multi-index columns with the same row index
