Redundant legend entries: Matplotlib

Problem description

My scatter plot has redundant legend entries. Here is an image of my plot.

[image: scatter plot with redundant legend entries]

Regarding this problem, I have already checked the following existing question on StackOverflow: too many legend with array column data in matplotlib

However, it did not help. I think I have run into a completely different problem. Please tell me how to fix it.

Here is my code:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# N_Clus (number of clusters) and WorkingDF2 (the DataFrame) are defined earlier.
colors = cm.rainbow(np.linspace(0,1,N_Clus))
cluster_labels_2 = list(range(1,N_Clus+1))
print("cluster_labels: ",cluster_labels_2)
# Create a figure
plt.figure(figsize=(15,8))
s=0
for color,label in zip(colors,np.asarray(cluster_labels_2).flatten()):
    subset = WorkingDF2[WorkingDF2.Cluster == label]
    for i in subset.index:
        x=np.asarray(subset["Standardized COVID-19 Index"][i]).flatten()
        y=np.asarray(subset["Standardized CSS Index"][i]).flatten()
        plt.text(x,y,str(subset['Neighbourhood'][i]),rotation=25)
        s += 1
        # Every single point gets a label here, hence the redundant legend entries.
        plt.scatter(x,y,c=np.array([color]),label='cluster'+str(label),alpha=0.5)
plt.legend(loc='lower right',fontsize=15)
plt.xlabel('Standardized COVID-19 Index',fontsize=18)
plt.ylabel('Standardized CSS Index',fontsize=18)
plt.title("[Hierarchical Clustering: {} Cluster] \n"
          " Mapping of Non-Outlier Neighbourhoods \n"
          " onto Standardized css-COVID19 Indices Space \n".format(N_Clus),fontsize=18)
print('# of Neighbours: ',s)

Solution

The problem is in this line:

plt.scatter(x,y,c=np.array([color]),label='cluster'+str(label),alpha=0.5)

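Here you attach the label 'cluster' + str(label) to every single point, even when a point with that label has already been plotted, so plt.legend() creates many identical legend entries. I would keep track of the labels that have already been used and, whenever the label is not new, set the label of the current point to None so that plt.legend() ignores it.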
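To see the effect in isolation, here is a minimal sketch that is independent of your data. The helper count_legend_entries and the toy points are made up for illustration, and the Agg backend is chosen only so that the snippet runs without a display. It plots one point per label and counts the resulting legend entries:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt


def count_legend_entries(labels):
    """Plot one point per label and return the number of legend entries."""
    fig, ax = plt.subplots()
    for i, label in enumerate(labels):
        ax.scatter([i], [i], label=label)
    legend = ax.legend()
    n_entries = len(legend.get_texts())
    plt.close(fig)
    return n_entries


# The same label on every point: one legend entry per point.
duplicated = count_legend_entries(["cluster1", "cluster1", "cluster1"])
# Label only the first point and pass None for the rest: a single entry.
deduplicated = count_legend_entries(["cluster1", None, None])
print(duplicated, deduplicated)  # 3 1
```

Matplotlib does not merge identical labels on its own, which is why the label has to be suppressed for every point after the first one in a cluster.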

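Note that your choice of names can be a bit confusing: matplotlib uses "label" for the name under which a curve appears in the legend, whereas you use it for the cluster number. Could we call it cluster_number instead?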

Here is the implementation:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

colors = cm.rainbow(np.linspace(0,1,N_Clus))
cluster_labels_2 = list(range(1,N_Clus+1))
print("cluster_labels: ",cluster_labels_2)

# Create a figure.
plt.figure(figsize=(15,8))
s=0
clusters_already_in_the_legend = []
for color,cluster_number in zip(colors,np.asarray(cluster_labels_2).flatten()):
    subset = WorkingDF2[WorkingDF2.Cluster == cluster_number]
    for i in subset.index:
        x = np.asarray(subset["Standardized COVID-19 Index"][i]).flatten()
        y = np.asarray(subset["Standardized CSS Index"][i]).flatten()
        plt.text(x,y,str(subset['Neighbourhood'][i]),rotation=25)
        s += 1

        # Keep track of the clusters already labelled so that each one
        # appears in the legend only once.
        if cluster_number not in clusters_already_in_the_legend:
            clusters_already_in_the_legend.append(cluster_number)
            label = f"Cluster {cluster_number}"
        else:
            label = None  # plt.legend() ignores artists without a label
        plt.scatter(x,y,c=np.array([color]),label=label,alpha=0.5)

plt.legend(loc='lower right',fontsize=15)
plt.xlabel('Standardized COVID-19 Index',fontsize=18)
plt.ylabel('Standardized CSS Index',fontsize=18)
plt.title("[Hierarchical Clustering: {} Cluster] \n"
          " Mapping of Non-Outlier Neighbourhoods \n"
          " onto Standardized CSS-COVID19 Indices Space \n".format(N_Clus),fontsize=18)
print('# of Neighbours: ',s)
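
As a possible alternative to tracking labels by hand, you could leave the label on every point and remove the duplicates just before the legend is drawn. The following is only a sketch under assumptions: the points, the colour mapping and the Agg backend are invented for illustration and are not taken from your data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

# Hypothetical toy data: (x, y, cluster_number).
points = [(0.1, 0.2, 1), (0.3, 0.1, 1), (0.8, 0.9, 2), (0.7, 0.6, 2)]
cluster_colors = {1: "tab:blue", 2: "tab:orange"}

fig, ax = plt.subplots()
for x, y, cluster_number in points:
    # Every point is labelled, exactly as in the code from the question.
    ax.scatter([x], [y], color=cluster_colors[cluster_number],
               label=f"Cluster {cluster_number}", alpha=0.5)

# Collect all (handle, label) pairs and keep a single handle per label.
# A dict keeps its keys in first-seen order, so the cluster order is preserved.
handles, labels = ax.get_legend_handles_labels()
unique = dict(zip(labels, handles))
legend = ax.legend(list(unique.values()), list(unique.keys()), loc="lower right")

legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)  # ['Cluster 1', 'Cluster 2']
```

This keeps the plotting loop unchanged and moves the de-duplication to a single place, at the cost of keeping an arbitrary one of the duplicate handles for each label.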