Annotating a scatter plot built from a MultiIndex DataFrame

Question

I have built a scatter plot using data from a DataFrame with a MultiIndex. The index levels are country and year.

fig,ax=plt.subplots(1,1)
rel_pib=welfare["rel_pib_pc"].loc[:,1960:2010].groupby("country").mean()
rel_lambda=welfare["Lambda"].loc[:,1960:2010].groupby("country").mean()
ax.scatter(rel_pib,rel_lambda)
ax.set_ylim(0,2)
ax.set_ylabel('Bienestar(Lambda)')
ax.set_xlabel('PIBPc')
ax.plot([0,1],'red',linewidth=1)

I would like to annotate each point with the country name (and, if possible, the Lambda value). I have the following code:

for i,txt in enumerate(welfare.index):
    plt.annotate(txt,(welfare["rel_pib_pc"].loc[:,1960:2010].groupby("country").mean()[i],welfare["Lambda"].loc[:,1960:2010].groupby("country").mean()[i]))

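I am not sure how to specify that I want the country names, because all the Lambda and pib_pc values for a given country are collapsed into a single value by the .mean() function.

I tried several combinations involving .mean(), but none of them worked.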

Solution

I used the following test data:

               rel_pib_pc  Lambda
country  year                    
Country1 2007         260    1.12
         2008         265    1.13
         2009         268    1.10
Country2 2007         230    1.05
         2008         235    1.07
         2009         236    1.04
Country3 2007         200    1.02
         2008         203    1.07
         2009         208    1.05
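
For anyone who wants to reproduce this, here is a minimal sketch of how the table above could be built. The construction via `pd.MultiIndex.from_product` is my own assumption, since the answer only shows the printed frame; the numbers are copied from the table. The means use `groupby(level="country")`, which should be equivalent to the `groupby("country")` call in the question.

```python
import pandas as pd

# Rebuild the sample table: a (country, year) MultiIndex with two value columns.
index = pd.MultiIndex.from_product(
    [["Country1", "Country2", "Country3"], [2007, 2008, 2009]],
    names=["country", "year"],
)
welfare = pd.DataFrame(
    {
        "rel_pib_pc": [260, 265, 268, 230, 235, 236, 200, 203, 208],
        "Lambda": [1.12, 1.13, 1.10, 1.05, 1.07, 1.04, 1.02, 1.07, 1.05],
    },
    index=index,
)

# One mean per country; the result is indexed by country name only.
rel_pib = welfare["rel_pib_pc"].groupby(level="country").mean()
rel_lambda = welfare["Lambda"].groupby(level="country").mean()
```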

Then I generated the scatter plot with the following code:

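# rel_pib and rel_lambda are the per-country means computed in the question
fig, ax = plt.subplots(1, 1)
ax.scatter(rel_pib, rel_lambda)
ax.set_ylabel('Bienestar(Lambda)')
ax.set_xlabel('PIBPc')
ax.set_xlim(190, 280)  # keep the labels inside the axes
annot_dy = 0.005  # vertical offset of each label, in data units
for txt in rel_lambda.index:
    # txt is the country name; look up both coordinates by that label
    ax.annotate(txt, (rel_pib.loc[txt], rel_lambda.loc[txt] + annot_dy), ha='center')
plt.show()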

and got the following result:

[Figure: the scatter plot with each point labeled by its country name]

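The trick to generating the annotations correctly is to:

- Iterate over the index of one of the already-computed Series objects, so that txt holds the country name.
- Take the values from the already-computed Series objects (do not compute them again).
- Look up both coordinates by the current index value.

To place each annotation just above its point, use:
- ha (horizontal alignment) set to 'center',
- a small upward shift of y (experiment with other values of annot_dy if needed).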
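
To illustrate the first point, here is a small sketch of the difference between iterating over the original frame's index and over the index of an aggregated Series. It uses a subset of the test data; the names `raw_labels` and `point_labels` are made up for this example.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Country1", "Country2"], [2007, 2008]], names=["country", "year"]
)
welfare = pd.DataFrame(
    {"rel_pib_pc": [260, 265, 230, 235], "Lambda": [1.12, 1.13, 1.05, 1.07]},
    index=index,
)
rel_lambda = welfare["Lambda"].groupby(level="country").mean()

# The original frame yields one (country, year) tuple per row ...
raw_labels = list(welfare.index)
# ... while the aggregated Series yields one country name per plotted point.
point_labels = list(rel_lambda.index)

print(len(raw_labels), "row labels, first one:", raw_labels[0])
print(len(point_labels), "point labels:", point_labels)
```

So looping over `welfare.index`, as in the question, produces more labels than there are points, and each label is a tuple rather than a country name.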

I also added ax.set_xlim(190, 280) to keep the annotations inside the plot area. You may not need it.
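
As a possible variation (not part of the code above), the label offset can be given in points instead of data units via `xytext` and `textcoords="offset points"`, and `ax.margins` can pad the axes instead of hard-coded limits. The sketch below also appends the mean Lambda value to each label, which the question asked about. The helper `annotate_points` and its `offset_pts` parameter are hypothetical names introduced here.

```python
import matplotlib.pyplot as plt
import pandas as pd


def annotate_points(ax, x, y, offset_pts=6):
    """Label each point with its index value and y value, shifted up by offset_pts points."""
    labels = []
    for name in y.index:
        labels.append(
            ax.annotate(
                f"{name} ({y.loc[name]:.2f})",   # e.g. country name plus mean Lambda
                (x.loc[name], y.loc[name]),      # point being labeled, in data units
                xytext=(0, offset_pts),          # shift relative to the point
                textcoords="offset points",      # interpret the shift in points
                ha="center",
            )
        )
    return labels


# Test data from the answer, aggregated to one row per country.
index = pd.MultiIndex.from_product(
    [["Country1", "Country2", "Country3"], [2007, 2008, 2009]],
    names=["country", "year"],
)
welfare = pd.DataFrame(
    {
        "rel_pib_pc": [260, 265, 268, 230, 235, 236, 200, 203, 208],
        "Lambda": [1.12, 1.13, 1.10, 1.05, 1.07, 1.04, 1.02, 1.07, 1.05],
    },
    index=index,
)
means = welfare.groupby(level="country").mean()

fig, ax = plt.subplots(1, 1)
ax.scatter(means["rel_pib_pc"], means["Lambda"])
ax.margins(x=0.15, y=0.15)  # pad the axes so labels near the edge stay inside
labels = annotate_points(ax, means["rel_pib_pc"], means["Lambda"])
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. An offset in points keeps the gap between marker and label the same regardless of the y-axis scale, whereas annot_dy has to be retuned whenever the data range changes.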
