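Problem description
My Tukey test significance labels (LABELS), and the box colors derived from them, are not drawn on each sample's box in the boxplot. Why? It seems the labels are all plotted at the same x position (s1) but at different y values?
Reproducible dataset here: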
library(multcompView)
df <- data.frame('Sample'=c("s1","s1","s2","s3","s4","s5","s5"),'value'=c(-0.1098,-0.1435,-0.1046,-0.1308,-0.1523,-0.1219,-0.1114,-0.1328,-0.1589,-0.1567,-0.1395,-0.1181,-0.1448,-0.124,-0.1929,-0.1996,-0.1981,-0.1917))
anova_df <- aov(df$value ~ df$Sample )
tukey_df <- TukeyHSD(anova_df,'df$Sample',conf.level=0.95)
# I need to group together the treatments that are not different from each other.
TUKEY <- tukey_df
generate_label_df <- function(TUKEY,variable){
# Extract labels and factor levels from Tukey post-hoc
Tukey.levels <- TUKEY[[variable]][,4]
Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])
#I need to put the labels in the same order as in the boxplot :
Tukey.labels$Sample=rownames(Tukey.labels)
Tukey.labels=Tukey.labels[order(Tukey.labels$Sample),]
return(Tukey.labels)
}
# Apply the function on my dataset
LABELS <- generate_label_df(TUKEY,"df$Sample")
# A panel of colors to draw each group with the same color :
my_colors <- c(
rgb(143,199,74,maxColorValue = 255),rgb(242,104,34,maxColorValue = 255),rgb(111,145,202,maxColorValue = 255))
# Draw the basic boxplot
a <- boxplot(df$value ~ df$Sample,ylim=c(min(df$value),1.1*max(df$value)),col=my_colors[as.numeric(LABELS[,1])],ylab="Value",main="")
# I want to write the letter over each box. Over is how high I want to write it.
over <- 0.1*max(a$stats[nrow(a$stats),] )
#Add the labels
text(c(1:nlevels(df$Sample)),a$stats[nrow(a$stats),]+over,LABELS[,1],col=my_colors[as.numeric(LABELS[,1])] )
Solution
First, LABELS$Letters is a character vector. If you convert it to a factor first, as.numeric(LABELS[,1]) will work.
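To see why the conversion matters, here is a minimal sketch using made-up letters (the vector letters_chr is illustrative and is not the actual Tukey output from the question):

```r
# Illustrative significance letters; not the real output of multcompLetters()
letters_chr <- c("a", "ab", "b")

# Letters cannot be parsed as numbers, so every entry becomes NA
codes_chr <- suppressWarnings(as.numeric(letters_chr))

# A factor stores integer codes for its levels, which can index a color vector
codes_fct <- as.numeric(factor(letters_chr))

print(codes_chr)
print(codes_fct)
```

Indexing my_colors with NA yields NA colors, which is likely why the boxes come out unfilled in the original plot.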
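Second, your y limits need some work for negative values: every value here is negative, so 1.1*max(df$value) falls below the largest value and the upper limit cuts into the data. You may find the function extendrange helpful; it is used in many plotting functions.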
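A small sketch of the difference, using three values copied from the question's data (the variable names upper_scaled and lims are just for illustration):

```r
# Three of the (all negative) values from the question
value <- c(-0.1098, -0.1435, -0.1996)

# Scaling a negative maximum by 1.1 moves the limit down, below the data maximum
upper_scaled <- 1.1 * max(value)

# extendrange() from grDevices widens the range on both sides
# (by default by 5% of the range width)
lims <- extendrange(value)

print(upper_scaled < max(value))  # TRUE: this limit would clip the data
print(lims[2] > max(value))       # TRUE: this limit leaves headroom
```

If the letters still sit too close to the top edge, the f argument of extendrange can be increased to leave more room.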
The line c(1:nlevels(df$Sample)) would also work if df$Sample were a factor, which it is not.
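The following sketch, with illustrative sample names rather than the question's full data, shows what that expression evaluates to for a character vector versus a factor:

```r
# Illustrative sample names stored as plain character strings
sample_chr <- c("s1", "s1", "s2", "s3")

# A character vector has no levels, so nlevels() returns 0 and 1:0 counts down
x_chr <- 1:nlevels(sample_chr)

# After conversion to a factor there is one level per distinct sample
x_fct <- 1:nlevels(factor(sample_chr))

print(x_chr)  # 1 0
print(x_fct)  # 1 2 3
```

With x positions of 1 and 0 recycled across the labels, only the labels at x = 1 land inside the plot region, which would explain why they stack up over s1. Since R 4.0.0, data.frame() no longer converts strings to factors by default, so df$Sample stays a character vector unless it is wrapped in factor().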
Also, if you are drawing text at a specific position, you can use text(...,pos = ) or text(...,adj = ) to shift the text relative to that position.
LABELS$Letters <- factor(LABELS$Letters)
a <- boxplot(df$value ~ df$Sample,ylim = extendrange(df$value),col=my_colors[as.numeric(LABELS[,1])],ylab="Value",main="")
text(seq_along(a$names),apply(a$stats,2,max),LABELS[,1],pos = 3)
If you do not mind changing your workflow and using the tidyverse libraries, you can achieve your goal as follows:
library(tidyverse)
# join df and LABELS into one data table
inner_join(df,LABELS,by = "Sample") %>%
# calculate max value for each Sample group (it will be used to place the labels)
group_by(Sample) %>%
mutate(placement = max(value)) %>%
ungroup() %>%
# make a plot
ggplot(aes(Sample,value,fill = Letters))+
geom_boxplot()+
geom_text(aes(y = placement,label = Letters,col = Letters),nudge_y = 0.01,size = 6)+
theme_minimal()+
theme(legend.position = "none")