Calculating means for multiple groups

Problem description

I have a table:

Sex     Value1   Value2    City
M       2        1         Berlin
W       3        5         Paris
W       1        3         Paris
M       2        5         Berlin
M       4        2         Paris

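I want to compute the mean of Value1 and Value2 for different groups. My original dataset has 10 grouping variables (each with at most 5 distinct values, e.g. 5 cities); for this example I have reduced them to Sex and City (2 distinct values each). The result should look like this: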

       Avgoverall   AvgM    AvgW    AvgBerlin    AvgParis
Value1 2,4          2,6     2       2            2,66   
Value2 3,2          2,6     4       3            3,3

I am familiar with groupby and tried:

df.groupby('City').mean()

The problem here is that Sex is also included in the calculation. Does anyone know how to solve this? Thanks in advance!

Solution

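You can group by each of the two columns to get two DataFrames, then use concat to combine them with the overall mean of the numeric columns (non-numeric columns are excluded):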

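import pandas as pd

# numeric_only=True skips non-numeric columns such as Sex and City
# (needed in pandas 2.0+, where this is no longer the default behaviour)
df1 = df.groupby('City').mean(numeric_only=True).T
df2 = df.groupby('Sex').mean(numeric_only=True).T

df3 = pd.concat([df.mean(numeric_only=True).rename('Overall'), df2, df1], axis=1).add_prefix('Avg')
print(df3)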
        AvgOverall      AvgM  AvgW  AvgBerlin  AvgParis
Value1         2.4  2.666667   2.0        2.0  2.666667
Value2         3.2  2.666667   4.0        3.0  3.333333
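
Since the original dataset has 10 grouping variables, the same idea can probably be written as a loop rather than one line per column. The following is only a minimal sketch under some assumptions: the names group_cols, value_cols, parts and result are made up for illustration, and the sample frame is rebuilt from the table in the question. Selecting the value columns explicitly also keeps the other grouping columns out of the calculation.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Sex': ['M', 'W', 'W', 'M', 'M'],
    'Value1': [2, 3, 1, 2, 4],
    'Value2': [1, 5, 3, 5, 2],
    'City': ['Berlin', 'Paris', 'Paris', 'Berlin', 'Paris'],
})

group_cols = ['Sex', 'City']        # hypothetical: list all grouping columns here
value_cols = ['Value1', 'Value2']   # hypothetical: the columns to average

# Start with the overall mean, then add one transposed block of
# group means per grouping column.
parts = [df[value_cols].mean().rename('Overall')]
for col in group_cols:
    parts.append(df.groupby(col)[value_cols].mean().T)

# Put the blocks side by side and prefix every column name with 'Avg'.
result = pd.concat(parts, axis=1).add_prefix('Avg')
print(result.round(2))
```

One caveat with this sketch: if two grouping columns share a label (for example an 'Other' category in both), the resulting column names would collide, so in that case it may be safer to include the grouping column's name in the prefix.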

pandas pandas-groupby python