Ordered mapping from custom string labels to colors in matplotlib

Problem description

I have 2D data with string labels in a DataFrame:

df = pd.DataFrame(data,columns = ['dim1','dim2','label'])

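The labels are strings with a natural order, e.g. 'small', 'small-medium', 'medium', 'medium-big', 'big' (simplified for the purposes of this question).

I want to plot the data on a scatter plot so that the colors reflect that order (so I will use a perceptually uniform sequential colormap).

This is what I have so far; it simply plots the data points and colors them by label, without regard to the order: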

groups = df.groupby('label')

fig = plt.figure(figsize=[20,20])
ax = fig.add_subplot(111)

for name,group in groups:
    ax.plot(group.dim1,group.dim2,label=name,marker='o',linestyle='',markersize=12)
ax.legend(fontsize=20)

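How can I adjust the code so that it does what I want?

Solution

Simply specify the order in which the groups are plotted; the legend entries then follow that order, and each label gets a color taken from the matching position in the colormap.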

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


data = {'dim1':  range(1,7),'dim2': range(11,17),'label': [ 'small','small-medium','medium','medium-big','big','small']
        }

df = pd.DataFrame(data,columns = ['dim1','dim2','label'])

groups = df.groupby('label')

fig = plt.figure(figsize=[20,20])
ax = fig.add_subplot(111)

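# Full list of labels in their natural order, smallest to largest
labels = ['small','small-medium','medium','medium-big','big']
labels.reverse()  # plot (and list in the legend) from 'big' down to 'small'
colors = plt.get_cmap('inferno').colors  # list of RGB triples of the colormap
step = len(colors) // len(labels)  # naive even spacing through the colormap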


for i,label in enumerate(labels):
    for name,group in groups:
        if label == name:
            ax.plot(group.dim1,group.dim2,label=name,marker='o',linestyle='',markersize=12,color=colors[i*step])

ax.legend(fontsize=20)

plt.show()

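I used a naive way of taking evenly spaced elements from the color list (a fixed step of len(colors) // len(labels)); for more details, see "Select N evenly spaced out elements in array, including first and last".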
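
With a fixed step, the last color of the colormap is never reached. As a rough sketch of the "including first and last" variant, the helper below spreads the picks over the whole list and pairs each ordered label with one of them. The function names `evenly_spaced_indices` and `map_labels_to_colors` and the stand-in `palette` are assumptions made for illustration; they are not part of matplotlib or of the code above.

```python
def evenly_spaced_indices(length, n):
    """Return n indices spread evenly over range(length), including 0 and length - 1."""
    if n < 1 or n > length:
        raise ValueError("n must be between 1 and length")
    if n == 1:
        return [0]
    # Integer arithmetic that rounds each ideal position to the nearest index.
    return [(i * (length - 1) + (n - 1) // 2) // (n - 1) for i in range(n)]


def map_labels_to_colors(ordered_labels, colors):
    """Pair each label, in the given order, with an evenly spaced entry of colors."""
    indices = evenly_spaced_indices(len(colors), len(ordered_labels))
    return {label: colors[i] for label, i in zip(ordered_labels, indices)}


labels = ['small', 'small-medium', 'medium', 'medium-big', 'big']
# Hypothetical stand-in for a 256-entry color list; plain integers keep the
# example free of dependencies and make the chosen positions visible.
palette = list(range(256))
label_to_color = map_labels_to_colors(labels, palette)
print(label_to_color)
```

In the plotting loop, one would presumably pass the real color list (for example plt.get_cmap('inferno').colors) instead of `palette` and use color=label_to_color[name] for each group.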
