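Problem description
I'm finding it very hard to add labels to bar charts with ggplot2. I'm working with the Titanic dataset, and to add labels I've had to build extra data frames. The whole process is arduous and is driving me crazy.
This is what the basic code and chart look like: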
titanic %>% ggplot(aes(x=Sex,fill=Survived))+
geom_bar() +
scale_fill_discrete(name=NULL,labels=c("Dead","Survived")) +
labs(y="Number of Passengers",title="Titanic Survival Rates by Sex")
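As you can see, there are no labels on the bars. Because the aesthetic mapping has no "y" variable, a geom_text(aes(label= xxx)) layer doesn't work. Likewise, without a "y" variable, geom_bar(stat="identity") doesn't work either. Here is what I did to get around the problem: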
# Create a data frame from a two-way table including Survived and Sex
>table(titanic$Survived,titanic$Sex)
    female male
  0     81  468
  1    233  109
rates_by_sex<-data.frame(Sex=c("Female","Male"),Dead=c(81,468),Survived=c(233,109))
# Convert data frame to long format
>rates_by_sex_long <- melt(rates_by_sex,id="Sex")
     Sex variable value
1 Female     Dead    81
2   Male     Dead   468
3 Female Survived   233
4   Male Survived   109
Now ggplot2 can use geom_text() together with aes(label=value):
rates_by_sex_long %>% ggplot(aes(x=Sex,y=value,fill=variable)) +
geom_bar(stat="identity") +
geom_text(aes(label=value),position = position_stack(vjust=0.5),colour = "white",size = 5) +
scale_fill_discrete(name=NULL) +
labs(y="Number of Passengers",title="Titanic Survival Rates by Sex")
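The hand-typed table above could probably be derived from the raw data instead. What follows is only a minimal sketch, assuming dplyr is installed; `titanic_demo` is a made-up stand-in for the real `titanic` data frame, and its values are purely illustrative.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the titanic data (illustrative values only)
titanic_demo <- data.frame(
  Sex      = c("female", "male", "male", "female", "male"),
  Survived = c(1, 0, 0, 1, 1)
)

# count() tallies each Sex/Survived combination directly in long format,
# so no manual data.frame() or melt() step is needed
rates_by_sex_long <- titanic_demo %>%
  mutate(Survived = factor(Survived, levels = c(0, 1),
                           labels = c("Dead", "Survived"))) %>%
  count(Sex, Survived, name = "value")

p <- ggplot(rates_by_sex_long, aes(x = Sex, y = value, fill = Survived)) +
  geom_col() +  # geom_col() is equivalent to geom_bar(stat = "identity")
  geom_text(aes(label = value),
            position = position_stack(vjust = 0.5),
            colour = "white", size = 5)
```

Printing `p` draws the chart. The same pipeline should carry over to any pair of categorical columns by changing the names passed to `count()`.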
# Manually create a data frame with the rate of survival.
table(titanic$Survived) # Gives raw counts of each category
100*round(prop.table(table(titanic$Survived)),4) # Survival rate in percentages
titanic_survival_rate<-data.frame(Survived=c("Yes","No"),Number=c(342,549),Percent=c(38.38,61.62))
titanic_survival_rate %>% ggplot(aes(x=Survived,y=Number)) +
geom_bar(stat="identity",fill="steelblue",colour = "black") +
geom_text(aes(label=paste0(Percent,"%")),nudge_y=25,colour = "black",size = 4) +
labs(y="Number of Passengers",title="Titanic Survival Rate")
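The percentages typed in by hand above can likely be computed in the same pipeline as the counts. Again, this is only a sketch under assumptions: it needs dplyr, and `titanic_demo` is a hypothetical stand-in with invented values, not the real dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in; only the Survived column matters here
titanic_demo <- data.frame(Survived = c(0, 1, 1, 0, 0))

survival_rate <- titanic_demo %>%
  count(Survived, name = "Number") %>%
  mutate(
    # share of all passengers, in percent, rounded to two decimals
    Percent  = round(100 * Number / sum(Number), 2),
    Survived = factor(Survived, levels = c(1, 0), labels = c("Yes", "No"))
  )

p <- ggplot(survival_rate, aes(x = Survived, y = Number)) +
  geom_col(fill = "steelblue", colour = "black") +
  # vjust = -0.5 places the label just above the top of each bar
  geom_text(aes(label = paste0(Percent, "%")), vjust = -0.5, size = 4)
```

Because the percentages come from the data itself, nothing has to be retyped when the underlying table changes.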
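Doing it this way is extremely inefficient. I have far too many charts to make, and building a separate data frame for each one would be impractical, if not impossible. I don't even know what I would do when it comes to faceting.
Question: how do I get labels (counts and percentages) on a bar chart of a categorical variable? I know it can be done with a bit of extra code (i.e., by adding something to geom_text()), but I can't quite figure it out.
Feel free to use this reproducible code: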
df<-data.frame(survived=c(1,1,0,0),sex=c("M","F","M","M")) # both columns need the same length
df$survived<-as.factor(df$survived)
df %>% ggplot(aes(x=sex,fill=survived))+geom_bar()+geom_text(aes(label=???))
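One direction that may work without any extra data frame is to let geom_text() compute the counts itself. The sketch below assumes ggplot2 3.3.0 or later (for `after_stat()`); the toy data only mirror the example above and are illustrative.

```r
library(ggplot2)

# Toy data in the shape of the reproducible example (illustrative values)
df <- data.frame(
  survived = factor(c(1, 1, 0, 0)),
  sex      = c("M", "F", "M", "M")
)

p <- ggplot(df, aes(x = sex, fill = survived)) +
  geom_bar() +
  # stat = "count" makes geom_text() tally rows the same way geom_bar() does;
  # after_stat(count) uses that computed tally as the label text
  geom_text(
    aes(label = after_stat(count)),
    stat     = "count",
    position = position_stack(vjust = 0.5),  # centre labels in each segment
    colour   = "white"
  )
```

For percentages, a label such as `after_stat(scales::percent(count / sum(count)))` is a plausible variant. As far as I can tell it gives each segment's share of all rows rather than its share within a bar, so treat that as an assumption to check against your own data.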