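Problem description
I want to add the HDIs (highest density intervals) that I calculated (the hdi_both, hdi_one and lower_upper columns in df below) to a bar plot.
However, I can't figure out how to add error bars/CIs so that each error bar has its own custom upper and lower bound, independent of the y value (here, the corresponding value in proportion_correct).
For example, for Exp. 1 with guesses_correct equal to both, the HDI has a lower bound of 0.000000 and an upper bound of 0.130435, while proportion_correct is 0.000000.
All the options I have seen specify the upper and lower bounds relative to the value on the y axis, which is not what I want.
Any help would be greatly appreciated.
Thanks,
Ayala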
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# NOTE: as posted, these lists have unequal lengths, so pd.DataFrame raises ValueError;
# each column needs one value per row.
df = pd.DataFrame({
    'exp': ['Exp. 1','Exp. 1','Exp. 2','Exp. 3','Exp. 4','Exp. 5','Collapsed','Collapsed'],
    'proportion_correct': [0.0,0.304347826,0.058823529000000006,0.31372549,0.047619048,0.333333333,0.12244898,0.428571429,0.367346939,0.082901554,0.35751295299999997],
    'guesses_correct': ['both','one','both','one'],
    'hdi_both': [0.0,0.130434783,0.0,0.078431373,0.1,0.08,0.081632653,0.005181347,0.051813472],
    'hdi_one': [0.130434783,0.47826087,0.156862745,0.41176470600000004,0.5,0.16,0.4,0.163265306,0.408163265,0.21761658,0.341968912],
    'lower_upper': ['lower','upper','lower','upper']
})
print(df.head())
Out[4]:
exp proportion_correct guesses_correct hdi_both hdi_one lower_upper
0 Exp. 1 0.000000 both 0.000000 0.130435 lower
1 Exp. 1 0.304348 one 0.130435 0.478261 upper
2 Exp. 2 0.058824 both 0.000000 0.156863 lower
3 Exp. 2 0.313725 one 0.078431 0.411765 upper
4 Exp. 3 0.047619 both 0.000000 0.100000 lower
# Make bar plot
sns.barplot(x='exp',y='proportion_correct',hue='guesses_correct',data=df)
plt.ylim([0,0.5])
plt.xlabel('Experiment')
plt.ylabel('Proportion Correct')
plt.legend(title='Correct guesses',loc='upper right')
plt.axhline(y=0.277777,color='dimgray',linestyle='--')
plt.annotate(' chance\n one',(5.5,0.27))
plt.axhline(y=0.02777,linestyle='--')
plt.annotate(' chance\n both',(5.5,0.02))
# Show the plot
plt.show()
This is the bar plot I want to add the HDIs to.
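Solution
Although you calculated the lower and upper bounds of the error bars as absolute values, error bars are usually specified as lower and upper errors around a given y value. The "relative" lengths are easy to obtain, though: subtract the y value from each bound you calculated.
You can then plot them with plt.errorbar(). Note that this function requires all error values to be non-negative, so take the absolute value of the differences.
Since you split the bars with hue=, you have to loop over the levels of hue and account for the offset of the bars (by default -0.2 and +0.2 for two hue levels):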
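# Make bar plot
import numpy as np

x_col = 'exp'
y_col = 'proportion_correct'
hue_col = 'guesses_correct'
low_col = 'hdi_both'
high_col = 'hdi_one'
sns.barplot(x=x_col, y=y_col, hue=hue_col, data=df)
# One pass per hue level; the offsets match seaborn's default bar positions for two levels
for (h, g), pos in zip(df.groupby(hue_col), [-0.2, 0.2]):
    # Distances from each bar's height to its bounds, shaped (2, N): row 0 below, row 1 above
    err = g[[low_col, high_col]].subtract(g[y_col], axis=0).abs().T.values
    x = np.arange(len(g[x_col].unique())) + pos
    plt.errorbar(x=x, y=g[y_col], yerr=err, fmt='none', capsize=5, ecolor='k')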
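To see what the subtraction produces, here is a small check that uses only the five rows printed in the question. The array names y, low and high are just illustrative; the point is that yerr ends up with shape (2, N), lower distances in the first row and upper distances in the second, which is the layout plt.errorbar() accepts for asymmetric bars.

```python
import numpy as np

# The five rows printed in the question (proportion_correct, hdi_both, hdi_one).
y = np.array([0.000000, 0.304348, 0.058824, 0.313725, 0.047619])
low = np.array([0.000000, 0.130435, 0.000000, 0.078431, 0.000000])
high = np.array([0.130435, 0.478261, 0.156863, 0.411765, 0.100000])

# plt.errorbar() wants distances from y, not absolute bounds:
# row 0 is the distance below y, row 1 the distance above y.
yerr = np.vstack([y - low, high - y])

# Every distance must be non-negative, which holds when low <= y <= high.
assert (yerr >= 0).all()

print(yerr.shape)       # (2, 5)
print(yerr[:, 0])       # first bar: 0 below, 0.130435 above
```

For the first bar the height is 0 and the interval runs from 0 to 0.130435, so the bar gets no lower whisker and an upper whisker of 0.130435.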
I ended up drawing the error bars as vertical lines. Here is my code, in case it helps someone.
# NOTE: as posted, these lists have unequal lengths, so pd.DataFrame raises ValueError;
# each column needs one value per row.
df = pd.DataFrame({
    'exp': ['Exp. 1','Exp. 1','Exp. 2','Exp. 3','Exp. 4','Exp. 5','Collapsed','Collapsed'],
    'proportion_correct': [0.0,0.304347826,0.058823529000000006,0.31372549,0.047619048,0.333333333,0.12244898,0.428571429,0.367346939,0.082901554,0.35751295299999997],
    'guesses_correct': ['both','one','both','one'],
    'hdi_low': [0.0,0.130434783,0.0,0.156862745,0.1,0.16,0.163265306,0.005181347,0.21761658],
    'hdi_high': [0.130434783,0.47826087,0.078431373,0.41176470600000004,0.5,0.08,0.4,0.081632653,0.408163265,0.051813472,0.341968912]
})
df.head()
Out[4]:
exp proportion_correct guesses_correct hdi_low hdi_high
0 Exp. 1 0.000000 both 0.000000 0.130435
1 Exp. 1 0.304348 one 0.130435 0.478261
2 Exp. 2 0.058824 both 0.000000 0.078431
3 Exp. 2 0.313725 one 0.156863 0.411765
4 Exp. 3 0.047619 both 0.000000 0.100000
The axvlines and axhlines functions used below are taken from How to draw vertical lines on a given plot in matplotlib. For brevity, I won't reproduce them here.
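Since the two helpers are not reproduced here, the following is a minimal sketch of functions with the same names and the lims= keyword that the plotting code relies on. It is an assumption about their behavior, not the code from the linked post: it simply wraps matplotlib's Axes.vlines and Axes.hlines, and the ax parameter is added for illustration.

```python
import matplotlib.pyplot as plt


def axvlines(xs, lims=None, ax=None, **plot_kwargs):
    """Draw vertical segments at xs spanning lims=(ymin, ymax)."""
    if ax is None:
        ax = plt.gca()
    # Without explicit limits, span the current y range of the axes.
    ymin, ymax = lims if lims is not None else ax.get_ylim()
    return ax.vlines(xs, ymin, ymax, **plot_kwargs)


def axhlines(ys, lims=None, ax=None, **plot_kwargs):
    """Draw horizontal segments at ys spanning lims=(xmin, xmax)."""
    if ax is None:
        ax = plt.gca()
    # Without explicit limits, span the current x range of the axes.
    xmin, xmax = lims if lims is not None else ax.get_xlim()
    return ax.hlines(ys, xmin, xmax, **plot_kwargs)


fig, ax = plt.subplots()
# One error bar for the first table row: bar centre x = -0.2, HDI from 0.0 to 0.130435.
bar = axvlines(-0.2, lims=(0.0, 0.130435), color='black')
# Its upper cap: a short horizontal segment from x = -0.3 to x = -0.1.
cap = axhlines(0.130435, lims=(-0.3, -0.1), color='black')
```

Each call returns the LineCollection that matplotlib created, so the segments can be restyled or removed afterwards.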
# Make bar plot
x_col = 'exp'
y_col = 'proportion_correct'
hue_col = 'guesses_correct'
low_col = 'hdi_low'
high_col = 'hdi_high'
plot = sns.barplot(x=x_col, y=y_col, hue=hue_col, data=df)
plt.ylim([0, 0.55])
# Tick labels default to the tick values
plt.yticks([0, 0.2, 0.3, 0.5])
plt.xlabel('Experiment')
plt.ylabel('Proportion Correct')
plt.legend(title='Correct guesses', loc='upper right')
plt.axhline(y=0.277777, color='dimgray', linestyle='--')
plt.annotate(' chance\n one', (5.65, 0.27))
plt.axhline(y=0.02777, linestyle='--')
plt.annotate(' chance\n both', (5.65, 0.02))
# (lower, upper) HDI bounds for each bar, in row order
lims_x = list(zip(df[low_col], df[high_col]))
# Bar centres: experiment i has its two hue bars at i - 0.2 and i + 0.2
xss = [-0.2, 0.2, 0.8, 1.2, 1.8, 2.2, 2.8, 3.2, 3.8, 4.2, 4.8, 5.2]
# Heights of the caps: lower and upper bound of every bar, flattened
yss = [i for sub in lims_x for i in sub]
# Horizontal extent of each cap (bar centre +/- 0.1), listed twice: lower cap, then upper cap
lims_y = [(round(x - 0.1, 1), round(x + 0.1, 1)) for x in xss for _ in range(2)]
# Vertical line of each error bar
for xs, lim in zip(xss, lims_x):
    plot = axvlines(xs, lims=lim, color='black')
# Horizontal caps at both ends of each error bar
for yx, lim in zip(yss, lims_y):
    plot = axhlines(yx, lims=lim, color='black')
plt.show()