R to LaTeX summary table of categorical variables by year

Problem description

year <- c(2000,2000,2001,2001)
gender <- c("F","M","F","M")
grade <- c("A","B","C","A")
df <- data.frame(year,gender,grade)

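I want to produce a summary table with as little manual code as possible, automating the process as far as I can. In my project I have to summarise 170 variables. I tried tidyverse group_by but did not get the result I wanted. I will use xtable to move the table into a LaTeX file. (I tried add.to.row but could not get "Gender" to appear in the first row.)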

This is the result I want:

[image: the desired summary table]

Please help me build this table. I need the variable names to appear in the table.

Solution

You can use pivot_longer, group_by and summarise to generate the summary values.

library(tidyverse)

df %>% 
  pivot_longer(-year) %>% 
  group_by(year,name,value) %>% 
  summarise(n = n()) %>% 
  mutate(prop = round(n / sum(n),3) * 100)

# A tibble: 10 x 5
# Groups:   year,name [4]
    year name   value     n  prop
   <dbl> <chr>  <chr> <int> <dbl>
 1  2000 gender F         1  33.3
 2  2000 gender M         2  66.7
 3  2000 grade  A         1  33.3
 4  2000 grade  B         1  33.3
 5  2000 grade  C         1  33.3
 6  2001 gender F         2  66.7
 7  2001 gender M         1  33.3
 8  2001 grade  A         1  33.3
 9  2001 grade  B         1  33.3
10  2001 grade  C         1  33.3

You can also get closer to the desired layout by concatenating the values into a formatted string and then using pivot_wider:

df %>% 
  pivot_longer(-year) %>% 
  group_by(year,name,value) %>% 
  summarise(n = n()) %>% 
  mutate(prop = round(n / sum(n),3) * 100,
         summary_str = glue::glue("{n}({prop}%)")) %>% 
  pivot_wider(id_cols = c(name,value),names_from = "year",values_from = "summary_str") 

  name   value `2000`   `2001`  
  <chr>  <chr> <glue>   <glue>  
1 gender F     1(33.3%) 2(66.7%)
2 gender M     2(66.7%) 1(33.3%)
3 grade  A     1(33.3%) 1(33.3%)
4 grade  B     1(33.3%) 1(33.3%)
5 grade  C     1(33.3%) 1(33.3%)
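Since the question asks for xtable output, the wide table could be handed to xtable roughly as follows. This is a minimal sketch rather than part of the answer above: it uses paste0 instead of glue so the columns are plain character vectors, the names wide and latex_lines are illustrative, and blanking repeated variable names is only one assumed way to show each name once. It relies on print.xtable's default sanitizing to escape the percent signs.

```r
library(tidyverse)
library(xtable)

# Example data from the question.
df <- data.frame(
  year   = c(2000, 2000, 2001, 2001),
  gender = c("F", "M", "F", "M"),
  grade  = c("A", "B", "C", "A")
)

wide <- df %>%
  pivot_longer(-year) %>%
  group_by(year, name, value) %>%
  summarise(n = n(), .groups = "drop_last") %>%   # keep grouping by year and name
  mutate(prop = round(n / sum(n), 3) * 100,
         summary_str = paste0(n, " (", prop, "%)")) %>%
  ungroup() %>%
  pivot_wider(id_cols = c(name, value),
              names_from = "year",
              values_from = "summary_str")

# Show each variable name only on its first row (an illustrative choice).
wide$name[duplicated(wide$name)] <- ""

# print.xtable escapes "%" by default; combinations that never occur stay NA
# and are printed as empty cells.
latex_lines <- capture.output(
  print(xtable(as.data.frame(wide)), include.rownames = FALSE)
)
cat(latex_lines, sep = "\n")
```

Because pivot_longer(-year) reshapes every column except year, the same pipeline should carry over to a data frame with many more categorical variables without further manual code.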

I mentioned in a comment that you can do this with the tables package. Here is an example:

year <- c(2000,2000,2001,2001)
gender <- c("F","M","F","M")
grade <- c("A","B","C","A")

# Our table treats the columns as factors,so save them that way
# I'll change the names to the way we'd like them to appear.

df <- data.frame(Year = factor(year),Gender = factor(gender),Grade = factor(grade))

library(tables)
# write a small function to format the percent values the way you want.
fmtPercent <- function(x,digits = 1) paste0("(",format(x,digits = digits),"\\%)")

# Calculate the table object.
tab <- tabular(Gender + Grade ~ Year * Heading()*(1 + Percent("col")*Format(fmtPercent())),data = df)

# Print it as text.
tab
#>                                    
#>           Year                     
#>           2000         2001        
#>  Gender F 1    (33\\%) 2    (67\\%)
#>         M 2    (67\\%) 1    (33\\%)
#>  Grade  A 1    (33\\%) 1    (33\\%)
#>         B 1    (33\\%) 1    (33\\%)
#>         C 1    (33\\%) 1    (33\\%)

Created on 2021-07-31 by the reprex package (v2.0.0)

I added the escape before the percent sign so that it prints correctly in LaTeX. In the PDF output of an R Markdown document, the table looks like this:

[image: the table as rendered in the PDF]
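To get the LaTeX source itself rather than the text printout, something along these lines may work. It is a sketch under the assumption that the installed version of tables provides a toLatex method for tabular objects that accepts a file argument; out_file and tex_lines are illustrative names.

```r
library(tables)

# Same example data, stored as factors with display-ready names.
df <- data.frame(
  Year   = factor(c(2000, 2000, 2001, 2001)),
  Gender = factor(c("F", "M", "F", "M")),
  Grade  = factor(c("A", "B", "C", "A"))
)

fmtPercent <- function(x, digits = 1) paste0("(", format(x, digits = digits), "\\%)")

# In R source, "\\%" is a single backslash followed by %, which LaTeX reads
# as a literal percent sign instead of the start of a comment.
cat(fmtPercent(33.333), "\n")   # prints (33\%)

tab <- tabular(Gender + Grade ~ Year * Heading() * (1 + Percent("col") * Format(fmtPercent())),
               data = df)

# Assumption: toLatex() writes the LaTeX code for the table to `file`.
out_file <- tempfile(fileext = ".tex")
toLatex(tab, file = out_file)
tex_lines <- readLines(out_file)
cat(tex_lines, sep = "\n")
```

The resulting file could then be pulled into a LaTeX document with \input{}.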