Problem description
library(dplyr)
library(ggplot2)
df <- data.frame(
  Language = factor(c(1, 1, 2, 2), levels = 1:2, labels = c("GER", "ENG")),
  Agegrp = factor(c(1, 3, 4, 4), levels = 1:4, labels = c("10-19", "20-29", "30-39", "40+"))
)
df %>%
  ggplot(aes(x = Agegrp, fill = Language)) +
  geom_bar(position = 'dodge') +
  labs(title = "Age-structure between German and English", y = "Number of persons")
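Using the sample data above, I can create a dodged bar chart of counts. But:
- how do I calculate the percentage of each age group within each language (using dplyr), and
- how do I draw the same plot with percentages (the y-axis should show percent)?
In this example the percentages are easy to read off, because both languages have the same number of cases (10), but that will not necessarily be true of the real data. Thanks for your help!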
Solution
To calculate the percentage of each Agegrp within each Language, you can try:
library(dplyr)
library(ggplot2)
df %>%
  count(Agegrp, Language) %>%
  group_by(Language) %>%
  mutate(n = prop.table(n)) %>%
  ungroup() %>%
  ggplot(aes(x = Agegrp, y = n, fill = Language)) +
  geom_col(position = 'dodge') +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Age-structure between German and English", y = "Percentage of persons")
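If only the table of percentages is needed (the first part of the question), the same steps can be run without the plotting layers. The following is a minimal sketch, not a definitive implementation: it assumes the sample data from the question with Agegrp levels 1:4, and the names pct and percent are chosen here purely for illustration.

```r
library(dplyr)

# Sample data from the question (Agegrp levels assumed to be 1:4)
df <- data.frame(
  Language = factor(c(1, 1, 2, 2), levels = 1:2, labels = c("GER", "ENG")),
  Agegrp = factor(c(1, 3, 4, 4), levels = 1:4,
                  labels = c("10-19", "20-29", "30-39", "40+"))
)

pct <- df %>%
  count(Language, Agegrp) %>%        # one row per observed Language/Agegrp pair
  group_by(Language) %>%             # shares are computed within each language
  mutate(percent = n / sum(n)) %>%   # proportion of that language's cases
  ungroup()

print(pct)
```

Because the proportions are computed within each Language group, the percent values of each language sum to 1 regardless of how many cases each language has.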
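If you want to add the percentages on top of the bars, you can use this code. The logic for computing the percentages is the same as Ronak's (credit to Ronak).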
df %>%
  count(Language, Agegrp) %>%
  group_by(Language) %>%
  mutate(percent = prop.table(n)) %>%
  ggplot(aes(x = Agegrp, y = percent, fill = Language, label = scales::percent(percent))) +
  geom_col(position = 'dodge') +
  geom_text(position = position_dodge(width = .9),  # move to center of bars
            vjust = -0.5,                           # nudge above top of bar
            size = 3) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Age-structure between German and English", y = "Percentage of persons")