用ggplot2在酒吧上添加症状表?

问题描述

我想在ggplot2,R中在与每种颜色相对应的每个条形上添加我的症状名称。我希望每个条形上的名称与颜色对齐,并针对每个症状词的大小输入一个参数必须通过,但不知道如何。

我尝试过但没有成功。这就是我尝试过的样子:

enter image description here

可以在此链接中找到虚假数据:

https://github.com/gabrielburcea/stackoverflow_fake_data/blob/master/labels_symptoms_ontop_of_bar_data.csv

这是我尝试过的代码

plot_adjusted_rates <- ggplot2::ggplot(fake_data,ggplot2::aes(country,value)) +
  ggplot2::coord_flip() +
  ggplot2::geom_bar(ggplot2::aes(fill = symptoms),width = 0.4,position = position_dodge(width = 0.5),stat = "identity") +
  geom_text(aes(label = symptoms),nudge_x = c(0.22,-0.22)) + 
  jcolors::scale_color_jcolors(palette = "pal12" )+ 
  ggplot2::labs(title = title,subtitle = "\nNote: Adjusted rates for symptoms in responders tested  in %",x = "Symptoms in /Countries",y = "Percentage",caption = "Source: Your.md Data") +
  ggplot2::theme(axis.title.y = ggplot2::element_text(margin = ggplot2::margin(t = 0,r = 21,b = 0,l = 0)),plot.title = ggplot2::element_text(size = 12,face = "bold"),plot.subtitle = ggplot2::element_text(size = 10),legend.position = "bottom",legend.Box = "horizontal") +
  ggplot2::theme_bw()


plot_adjusted_rates

解决方法

这是一个可以解决您问题的通用代码块:

# load librariy
library(dplyr)
library(ggplot2)

# load data
data_url = 'https://raw.githubusercontent.com/gabrielburcea/stackoverflow_fake_data/master/labels_symptoms_ontop_of_bar_data.csv'
fake_data = read.csv(data_url)

# plot
plot = ggplot(fake_data,aes(x = country,y = value,fill = symptoms)) +
  geom_bar(stat = "identity",show.legend = FALSE,width = 0.4,position = position_dodge(width = 0.5)) +
  coord_flip() + 
  geom_text(aes(label = symptoms),size = 3,hjust = -0.05,position = position_dodge2(width = 0.5))

输出:

enter image description here

但是,如您所见,geom_text()中的标签是重叠的。发生这种情况是因为同一个国家/地区和症状在您的假数据中具有多个价值。国家(A)显示一个特定的“问题”,因为相同的症状具有非常不同的值(24.05和44.05)。这让我想到了是否应该清理数据……看一下:

dplyr::filter(fake_data,symptoms == 'loss_appetite')

输出:

   country      symptoms    value
1        A loss_appetite 24.05464
2        A loss_appetite 24.05464
3        A loss_appetite 24.05464  <- 24.05
4        A loss_appetite 44.05464  <- 44.05
5        B loss_appetite 31.25430
6        B loss_appetite 31.25430
7        B loss_appetite 31.25430
8        B loss_appetite 31.25430
9        C loss_appetite 32.44539
10       C loss_appetite 32.44539
11       C loss_appetite 32.44539
12       C loss_appetite 32.44539
13       D loss_appetite 36.52090
14       D loss_appetite 36.52090
15       D loss_appetite 36.52090
16       D loss_appetite 36.52090
17       E loss_appetite 20.65789

为避免重叠的文本问题以及指出的数据“错误”,可以使用groupy()summarise(),以便仅对每个组和位置的最大值进行排序每个栏顶部的文本。

因此,此代码应根据需要绘制数据:

# create label data
fake_text = fake_data %>%
  group_by(country,symptoms) %>%
  summarize(max = max(value))

# plot
ggplot(fake_data,position = position_dodge2(width = 0.5))

输出:

enter image description here