Problem description
I get a result like this from a SQL query (a test query with no particular meaning):
week cash ccard fcard mobile total
9 3.45 0.00 0.00 0.00 3.45
10 13.02 17.18 4.32 21.24 55.76
11 47.61 24.52 12.32 32.18 116.63
12 21.32 61.96 17.32 1.40 102.00
13 181.80 1.70 275.20 3.50 462.20
14 390.14 191.80 10.08 100.40 692.42
15 102.40 207.80 101.40 0.00 411.60
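The result of this query goes into a data frame, and I would like to plot it as a stacked density plot, with "week" on the x-axis and the fractions "cash/total", "ccard/total", and so on, on the y-axis. How can I do that? I have googled, but none of the examples I have found so far seem to apply to SQL output.
Thanks in advance...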
Solution
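In general, ggplot2 prefers data in "long" format, whereas yours is currently in "wide" format. In SQL terms this is a PIVOT, though I find tidyr::pivot_*, or data.table::melt and data.table::dcast, easier to use than doing it in SQL.
By that, I mean: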
library(dplyr)
library(tidyr) # just for pivot_longer
dat <- pivot_longer(dat, cash:mobile) %>%
  mutate(pct = value / total)
dat
# # A tibble: 28 x 5
# week total name value pct
# <int> <dbl> <chr> <dbl> <dbl>
# 1 9 3.45 cash 3.45 1
# 2 9 3.45 ccard 0 0
# 3 9 3.45 fcard 0 0
# 4 9 3.45 mobile 0 0
# 5 10 55.8 cash 13.0 0.234
# 6 10 55.8 ccard 17.2 0.308
# 7 10 55.8 fcard 4.32 0.0775
# 8 10 55.8 mobile 21.2 0.381
# 9 11 117. cash 47.6 0.408
# 10 11 117. ccard 24.5 0.210
# # ... with 18 more rows
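For comparison, here is a minimal sketch of the same reshape using data.table::melt, which the answer mentions as an alternative. The data are typed in by hand from the table in the question, and the object name `long` is only an illustration, not part of the original answer.

```r
library(data.table)

# Sample data copied from the query result shown in the question
dat <- data.table(
  week   = 9:15,
  cash   = c(3.45, 13.02, 47.61, 21.32, 181.80, 390.14, 102.40),
  ccard  = c(0.00, 17.18, 24.52, 61.96, 1.70, 191.80, 207.80),
  fcard  = c(0.00, 4.32, 12.32, 17.32, 275.20, 10.08, 101.40),
  mobile = c(0.00, 21.24, 32.18, 1.40, 3.50, 100.40, 0.00),
  total  = c(3.45, 55.76, 116.63, 102.00, 462.20, 692.42, 411.60)
)

# Wide -> long: one row per (week, payment type); week and total are kept as-is
long <- melt(dat, id.vars = c("week", "total"),
             variable.name = "name", value.name = "value")

# Fraction of each week's total, added by reference
long[, pct := value / total]

# Order rows by week, then payment type
setorder(long, week, name)
long
```

The result has the same columns as the tibble above (week, total, name, value, pct), so it should work with the same plotting code.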
With that, you can do
library(ggplot2)
# library(scales) # percent
ggplot(dat, aes(week, pct, fill = name)) +
  geom_density(position = "fill", stat = "identity") +
  scale_y_continuous(labels = scales::percent)
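(I will also add that the "density" nature of this plot is a little deceptive: it suggests there is data between the weekly points. Since the x-axis is really discrete, with a low n, I'd suggest a bar plot, as @RyanJohn suggested.)
Here is a bar plot, if you prefer.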
library(tidyverse)
library(scales)
# Same values as the original dput() output; the readr column-spec
# attribute is omitted because it does not affect the plot.
df1 <- tibble(
  week   = c(9, 10, 11, 12, 13, 14, 15),
  cash   = c(3.45, 13.02, 47.61, 21.32, 181.8, 390.14, 102.4),
  ccard  = c(0, 17.18, 24.52, 61.96, 1.7, 191.8, 207.8),
  fcard  = c(0, 4.32, 12.32, 17.32, 275.2, 10.08, 101.4),
  mobile = c(0, 21.24, 32.18, 1.4, 3.5, 100.4, 0),
  total  = c(3.45, 55.76, 116.63, 102, 462.2, 692.42, 411.6)
)
df1 %>%
  pivot_longer(cols = c(-week, -total), names_to = "type", values_to = "amount") %>%
  mutate(pct = amount / total) %>%
  # geom_col() needs a y aesthetic; pct is the share computed above
  ggplot(aes(week, pct, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "% spend by payment type")
Created on 2020-08-12 by the reprex package (v0.3.0)
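A possible variation on the bar plot, sketched here as an assumption and not taken from either answer: geom_col(position = "fill") lets ggplot2 normalise each bar to 1, so the pct column is not strictly needed. The names `long` and `p` are illustrative, and week is converted to a factor to make the discrete x-axis explicit.

```r
library(tidyr)
library(ggplot2)

# Sample data copied from the question
df1 <- data.frame(
  week   = 9:15,
  cash   = c(3.45, 13.02, 47.61, 21.32, 181.8, 390.14, 102.4),
  ccard  = c(0, 17.18, 24.52, 61.96, 1.7, 191.8, 207.8),
  fcard  = c(0, 4.32, 12.32, 17.32, 275.2, 10.08, 101.4),
  mobile = c(0, 21.24, 32.18, 1.4, 3.5, 100.4, 0),
  total  = c(3.45, 55.76, 116.63, 102, 462.2, 692.42, 411.6)
)

# Wide -> long, keeping the raw amounts
long <- pivot_longer(df1, cols = c(-week, -total),
                     names_to = "type", values_to = "amount")

# position = "fill" stacks the bars and rescales each stack to sum to 1
p <- ggplot(long, aes(factor(week), amount, fill = type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "week", y = "share of total", title = "% spend by payment type")
p
```

This relies on the four payment columns adding up to total for each week, which holds for the sample data; if total included other components, dividing by total explicitly as in the answers above would be the safer choice.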