Problem description
year <- c(2000,2000,2001,2001)
gender <- c("F","M","F","M")
grade <- c("A","B","C","A")
df <- data.frame(year,gender,grade)
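I want to produce a summary table while keeping hand-written code to a minimum and automating the process as much as possible. In my project I need to summarize 170 variables. I tried tidyverse group_by but did not get the result I wanted. I will use xtable to move the table into a LaTeX file. (I tried add.to.row but could not add "Gender" to the first row.)
This is the result I want.
Please help me build this table. I need the variable names to appear in the table.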
Solution
You can use pivot_longer and summarise to generate the summary values.
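library(tidyverse)

df %>%
  pivot_longer(-year) %>%                      # one row per (year, variable, value)
  group_by(year, name, value) %>%
  summarise(n = n()) %>%                       # count; drops the last grouping level (value)
  mutate(prop = round(n / sum(n), 3) * 100)    # percent within each year/variable group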
# A tibble: 10 x 5
# Groups:   year, name [4]
    year name   value     n  prop
   <dbl> <chr>  <chr> <int> <dbl>
 1  2000 gender F         1  33.3
 2  2000 gender M         2  66.7
 3  2000 grade  A         1  33.3
 4  2000 grade  B         1  33.3
 5  2000 grade  C         1  33.3
 6  2001 gender F         2  66.7
 7  2001 gender M         1  33.3
 8  2001 grade  A         1  33.3
 9  2001 grade  B         1  33.3
10  2001 grade  C         1  33.3
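With 170 variables, some columns may not be character, and pivot_longer cannot stack columns of different types into one value column. The following is a minimal sketch of one way to handle that, not part of the original answer: it assumes every non-year column can sensibly be treated as a category, and the name summary_long is only illustrative. It also spells out the grouping used for the percentages instead of relying on summarise dropping the last grouping level.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question.
df <- data.frame(
  year   = c(2000, 2000, 2001, 2001),
  gender = c("F", "M", "F", "M"),
  grade  = c("A", "B", "C", "A")
)

summary_long <- df %>%
  # Assumption: every non-year column is categorical, so coerce to character
  # to let all of them share a single `value` column.
  mutate(across(-year, as.character)) %>%
  pivot_longer(-year) %>%
  # count() is shorthand for group_by() + summarise(n = n()) + ungroup().
  count(year, name, value) %>%
  # Percentages are computed within each year/variable combination.
  group_by(year, name) %>%
  mutate(prop = round(n / sum(n), 3) * 100) %>%
  ungroup()

summary_long
```

Nothing in the pipeline names gender or grade, so adding more columns to df should not require changes to the code.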
You can also get closer to the desired layout by pasting the values into a formatted string and then using pivot_wider:
df %>%
  pivot_longer(-year) %>%
  group_by(year, name, value) %>%
  summarise(n = n()) %>%
  mutate(prop = round(n / sum(n), 3) * 100,
         summary_str = glue::glue("{n}({prop}%)")) %>%
  pivot_wider(id_cols = c(name, value), names_from = "year", values_from = "summary_str")
  name   value `2000`   `2001`
  <chr>  <chr> <glue>   <glue>
1 gender F     1(33.3%) 2(66.7%)
2 gender M     2(66.7%) 1(33.3%)
3 grade  A     1(33.3%) 1(33.3%)
4 grade  B     1(33.3%) 1(33.3%)
5 grade  C     1(33.3%) 1(33.3%)
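Since the question mentions xtable, here is a hedged sketch of how the wide table might be handed to it. It is not from the original answer: it uses paste0 instead of glue so the columns are plain character, and the values_fill placeholder "0 (0%)" for levels that never occur in a given year is an assumption about how empty cells should look. The names wide and tex are illustrative.

```r
library(dplyr)
library(tidyr)
library(xtable)

# Same toy data as in the question.
df <- data.frame(
  year   = c(2000, 2000, 2001, 2001),
  gender = c("F", "M", "F", "M"),
  grade  = c("A", "B", "C", "A")
)

wide <- df %>%
  pivot_longer(-year) %>%
  count(year, name, value) %>%
  group_by(year, name) %>%
  # Build the "n (percent%)" cell text as an ordinary character string.
  mutate(summary_str = paste0(n, " (", round(n / sum(n), 3) * 100, "%)")) %>%
  ungroup() %>%
  pivot_wider(
    id_cols     = c(name, value),
    names_from  = year,
    values_from = summary_str,
    values_fill = "0 (0%)"   # assumed placeholder for combinations with no rows
  )

# xtable works on plain data frames; print.xtable escapes "%" by default.
tex <- xtable(as.data.frame(wide))
print(tex, include.rownames = FALSE)
```

This does not by itself place "Gender" on a single row above its levels; that part of the layout would still need add.to.row or a different package.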
As I mentioned in a comment, you can do this with the tables package. Here is an example:
year <- c(2000,2000,2001,2001)
gender <- c("F","M","F","M")
grade <- c("A","B","C","A")
# Our table treats the columns as factors,so save them that way
# I'll change the names to the way we'd like them to appear.
df <- data.frame(Year = factor(year),Gender = factor(gender),Grade = factor(grade))
library(tables)
# write a small function to format the percent values the way you want.
fmtPercent <- function(x,digits = 1) paste0("(",format(x,digits = digits),"\\%)")
# Calculate the table object.
tab <- tabular(Gender + Grade ~ Year * Heading()*(1 + Percent("col")*Format(fmtPercent())),data = df)
# Print it as text.
tab
#>
#> Year
#> 2000 2001
#> Gender F 1 (33\\%) 2 (67\\%)
#> M 2 (67\\%) 1 (33\\%)
#> Grade A 1 (33\\%) 1 (33\\%)
#> B 1 (33\\%) 1 (33\\%)
#> C 1 (33\\%) 1 (33\\%)
Created on 2021-07-31 by the reprex package (v2.0.0)
I added the escape before the percent sign so that it prints correctly in LaTeX. In the PDF output of an R Markdown document, it looks like this:
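To get the LaTeX source itself rather than the text printout, something along these lines may work. This is a sketch under the assumption that the installed version of tables provides a toLatex method for tabular objects (recent versions do, as far as I know; older ones relied on latex() via Hmisc). In an R Markdown document the call would typically sit in a chunk with results = "asis".

```r
library(tables)

# Same data as above, stored as factors with display-ready names.
df <- data.frame(
  Year   = factor(c(2000, 2000, 2001, 2001)),
  Gender = factor(c("F", "M", "F", "M")),
  Grade  = factor(c("A", "B", "C", "A"))
)

# Format percentages as "(NN\%)" so the percent sign is escaped for LaTeX.
fmtPercent <- function(x, digits = 1) {
  paste0("(", format(x, digits = digits), "\\%)")
}

tab <- tabular(
  Gender + Grade ~ Year * Heading() * (1 + Percent("col") * Format(fmtPercent())),
  data = df
)

# Assumption: toLatex() has a method for "tabular" objects in this version of tables.
toLatex(tab)
```

Because the row variables appear in the formula as Gender + Grade, adding more variables means extending that sum, which can be built programmatically if there are many of them.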