将 GTSummary 中的行分组

问题描述

我正在尝试对一些行/变量(分类的和连续的)进行分组,以帮助提高大型数据集中的表格可读性。

这是虚拟数据集:

library(gtsummary)
library(tidyverse)
library(gt)
set.seed(11012021)

# Create Dataset
PIR <- 
  tibble(
    siteidn = sample(c("1324","1329","1333","1334"),5000,replace = TRUE,prob = c(0.2,0.45,0.15,0.2)) %>% factor(),countryname = sample(c("NZ","Australia"),prob = c(0.3,0.7)) %>% factor(),hospt = sample(c("Metropolitan","Rural"),prob = c(0.65,0.35)) %>% factor(),age = rnorm(5000,mean = 60,sd = 20),apache2 = rnorm(5000,mean = 18.5,sd=10),apache3 = rnorm(5000,mean = 55,sd=20),mechvent = sample(c("Yes","No"),prob = c(0.4,0.6)) %>% factor(),sex = sample(c("Female","Male"),replace = TRUE) %>% factor(),patient = TRUE
  ) %>%
  mutate(patient_id = row_number())%>% 
  group_by(
    siteidn) %>% mutate(
      count_site = row_number() == 1L) %>%
  ungroup()%>% 
  group_by(
    patient_id) %>% mutate(
      count_pt = row_number() == 1L) %>%
  ungroup()

然后我使用以下代码生成我的表:

t1 <- PIR %>% 
  select(patientn = count_pt,siten = count_site,age,sex,apache2,apache3,mechvent,countryname) %>% 
  tbl_summary(
    by = countryname,missing = "no",statistic = list(
      patientn ~ "{n}",siten ~ "{n}",age ~ "{mean} ({sd})",apache2 ~ "{mean} ({sd})",mechvent ~ "{n} ({p}%)",sex ~ "{n} ({p}%)",apache3 ~ "{mean} ({sd})"),label = list(
      siten = "Number of ICUs",patientn = "Number of Patients",age = "Age",apache2 = "APACHE II score",mechvent = "Mechanical Ventilation",sex = "Sex",apache3 = "APACHE III score")) %>% 
  modify_header(stat_by = "**{level}**") %>%
  add_overall(col_label = "**Overall**")
  
t2 <- PIR %>% 
  select(patientn = count_pt,hospt) %>% 
  tbl_summary(
    by = hospt,apache3 = "APACHE III score")) %>%  
  modify_header(stat_by = "**{level}**")

tbl <-
  tbl_merge(
    tbls = list(t1,t2),tab_spanner = c("**Country**","**Hospital Type**")
  ) %>%
  modify_spanning_header(stat_0_1 ~ NA) %>%
  modify_footnote(everything() ~ NA)

这会产生下表:

Table 1

为了便于阅读,我想将某些行组合在一起。理想情况下,我希望表格看起来像这样:

Table 1 Ideal

我尝试使用 gt 包,代码如下:

tbl <-
  tbl_merge(
    tbls = list(t1,"**Hospital Type**")
  ) %>%
  modify_spanning_header(stat_0_1 ~ NA) %>%
  modify_footnote(everything() ~ NA) %>% 
as_gt() %>%  
  gt::tab_row_group(
    group = "Severity of Illness scores",rows = 7:8) %>%  
  gt::tab_row_group(
    group = "Patient Demographics",rows = 3:6) %>%  
  gt::tab_row_group(
    group = "Numbers",rows = 1:2)

这将生成所需的表:

Table 1 Sections

我的做法有几个问题。

  1. 当我尝试使用行名称(变量)时,会出现一条错误消息(不能对不存在的列进行子集化...)。有没有办法通过使用变量名来做到这一点?对于较大的表,我在使用行号分配行名称方法时遇到了一些麻烦。当有单个变量失去其位置时尤其如此,因为它被移到末尾以解释分组行。

  2. 有没有办法在管道进入 tbl_summary 之前做到这一点?虽然我喜欢这个表格的输出,但我使用 Word 作为我的统计报告输出文档,并且希望能够在需要时(或由我的合作者)在 Word 中格式化表格。我通常使用 gtsummary::as_flextable 进行表格输出

再次感谢,

解决方法

暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!

如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@)