ANOVA p-value column in the R table1 package

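Problem description

I am trying to run an ANOVA on a dataset so that I can compare the means of different groups in a table built with the table1 package. In this example, at the bottom of the page, the author performs a t-test to compare 2 means (male vs. female) using the function I have pasted into my code.

I want to do the same thing, but with more than two groups, as in my example dataset below. I want a column for each age group plus an ANOVA p-value column.

I have not found a solution, so if anyone can help I would be very grateful!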


library(tidyverse)
library(table1)

# Function to compute t-test
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a standard 2-sample t-test
    p <- t.test(y ~ g)$p.value
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "&lt;", format.pval(p, digits = 3, eps = 0.001)))
}

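# Fake dataset
# tibble() needs every column to have the same length (10 here), and the factor
# levels must list all four age groups; the values added to age_group and
# gender to reach 10 rows are illustrative only.
age_group <- factor(c("10-20", "20-30", "30-40", "40-50", "10-20", "30-40", "20-30", "40-50", "10-20", "30-40"),
                    levels = c("10-20", "20-30", "30-40", "40-50"))
protein <- c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2)
fat <- c(76, 45, 74, 34, 55, 100, 94, 81, 23, 45)
gender <- c("female", "male", "female", "female", "male", "female", "male", "male", "female", "male")
mydata <- tibble(gender, age_group, protein, fat)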
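
For context, here is a minimal base-R sketch of why the t-test version cannot be reused unchanged with four age groups. It assumes the function receives roughly one vector per stratum (mimicked here with split()); the data values are illustrative and the names x, res and p_anova are made up for this example.

```r
# Illustrative values only (hypothetical, not real measurements)
protein <- c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2)
age_group <- factor(c("10-20", "20-30", "30-40", "40-50", "10-20",
                      "30-40", "20-30", "40-50", "10-20", "30-40"))

# Assumption: the p-value function is handed one vector per stratum,
# which split() reproduces here
x <- split(protein, age_group)

# Rebuild the flat data vector y and the grouping factor g, as pvalue() does
y <- unlist(x)
g <- factor(rep(seq_along(x), times = sapply(x, length)))
print(nlevels(g))

# The formula interface of t.test() needs exactly two groups, so with four
# strata it raises an error; tryCatch() captures the message instead of stopping
res <- tryCatch(t.test(y ~ g)$p.value, error = function(e) conditionMessage(e))
print(res)

# A one-way ANOVA accepts any number of groups
p_anova <- summary(aov(y ~ g))[[1]][["Pr(>F)"]][1]
print(p_anova)
```

With only two strata the same code would return a numeric p-value from t.test(), which is why the original function works for the male vs. female comparison.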

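Solution

Edit: I solved the problem, and it turned out to be quite simple. In case anyone is looking for the same functionality, here is the new version of the function: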

pvalueANOVA <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))

  if (is.numeric(y)) {
    # For numeric variables, perform a one-way ANOVA across all groups
    ano <- aov(y ~ g)
    # Take the p-value of the group effect (first row of the "Pr(>F)" column)
    p <- summary(ano)[[1]][["Pr(>F)"]][1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "&lt;", format.pval(p, digits = 3, eps = 0.001)))
}
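
For anyone wondering how to plug this in, below is a rough sketch rather than a definitive recipe. It assumes a table1 version that supports the extra.col argument (the mechanism the package vignette uses for its t-test column), restates the function compactly so the snippet runs on its own, and uses illustrative data. The names cell, p_ref and tbl are made up for the example, and the table1 call only runs if the package is installed.

```r
# Compact restatement of the ANOVA p-value helper so this snippet is self-contained
pvalueANOVA <- function(x, ...) {
  y <- unlist(x)
  g <- factor(rep(seq_along(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    p <- summary(aov(y ~ g))[[1]][["Pr(>F)"]][1]
  } else {
    p <- chisq.test(table(y, g))$p.value
  }
  c("", sub("<", "&lt;", format.pval(p, digits = 3, eps = 0.001)))
}

# Illustrative data (hypothetical values)
mydata <- data.frame(
  age_group = factor(c("10-20", "20-30", "30-40", "40-50", "10-20",
                       "30-40", "20-30", "40-50", "10-20", "30-40")),
  protein = c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2),
  fat = c(76, 45, 74, 34, 55, 100, 94, 81, 23, 45)
)

# Call the helper directly on one variable split by stratum
cell <- pvalueANOVA(split(mydata$protein, mydata$age_group))
print(cell)

# Cross-check against the same one-way ANOVA fitted through lm()
p_ref <- anova(lm(protein ~ age_group, data = mydata))[["Pr(>F)"]][1]
print(p_ref)

# Build the table only when table1 is available. overall = FALSE keeps the
# "Overall" column out of the strata, so the test compares the age groups only.
if (requireNamespace("table1", quietly = TRUE)) {
  tbl <- table1::table1(~ protein + fat | age_group, data = mydata,
                        overall = FALSE,
                        extra.col = list(`P-value` = pvalueANOVA))
}
```

In an interactive session, typing tbl should render the table with one column per age group followed by the P-value column. With groups this small, the ANOVA and chi-squared results are only there to show the mechanics.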