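Problem description
I obtained this data from a cohort of Alzheimer's disease patients. I want to create a summary table (or contingency table) that shows all the information in this table. This is what I would like to see for this cohort: the number of males and females, the mean age at onset, the mean age at last visit, the mean age at death, and the number of samples (IID) with apoe4any. What would be the way to create such a table in R?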
dat <- structure(list(IID = structure(1:10,.Names = c("1","2","3","4","5","6","7","8","9","10"),.Label = c("NACC000875","NACC003779","NACC006805","NACC008215","NACC010067","NACC010592","NACC011413","NACC015383","NACC017476","NACC017538"),class = "factor"),cohort = structure(c(`1` = 1L,`2` = 1L,`3` = 1L,`4` = 1L,`5` = 1L,`6` = 1L,`7` = 1L,`8` = 1L,`9` = 1L,`10` = 1L
),.Label = "ADC8_AA",sex = structure(c(`1` = 2L,`2` = 2L,`3` = 2L,`4` = 2L,`5` = 2L,`8` = 2L,`9` = 2L,`10` = 2L),.Label = c("1","2"),status = structure(c(`1` = 1L,`7` = 2L,`10` = 2L
),Race = structure(c(`1` = 1L,`10` = 1L),.Label = "2",Ethnicity = structure(c(`1` = 1L,.Label = "0",age_onset = structure(c(NA,NA,1L,4L,2L,3L),.Label = c(" 63"," 67"," 71"," 79","888"),age_last_visit = structure(c(`1` = 6L,`2` = 4L,`3` = 3L,`7` = 8L,`8` = 7L,`10` = 5L),.Label = c("70","71","74","77","78","82","86","89"),age_death = structure(c(NA,3L,NA),.Label = c(" 72"," 88"," 90",apoe4any = structure(c(`1` = 1L,.Label = c("0","1"),class = "factor")),row.names = c("1",class = "data.frame")
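Solution
R uses the factor class for categorical data. If you convert the age columns (currently factors) to numeric, summary(dat) will give you most of the information you need.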
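# Age columns that are stored as factors but hold numbers
convert_to_numeric <- c("age_onset","age_last_visit","age_death")
# Convert via character so the labels, not the internal level codes, become the values
dat[convert_to_numeric] <- lapply(dat[convert_to_numeric],function(x) as.numeric(as.character(x)))
summary(dat)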
#  IID            cohort       sex   status   Race   Ethnicity   age_onset    age_last_visit
#  NACC000875:1   ADC8_AA:10   1:2   1:6      2:10   0:10        Min.   :63   Min.   :70.00
#  NACC003779:1                2:8   2:4                         1st Qu.:66   1st Qu.:70.25
#  NACC006805:1                                                  Median :69   Median :75.50
#  NACC008215:1                                                  Mean   :70   Mean   :76.70
#  NACC010067:1                                                  3rd Qu.:73   3rd Qu.:81.00
#  NACC010592:1                                                  Max.   :79   Max.   :89.00
#  (Other)   :4                                                  NA's   :6
#  age_death       apoe4any
#  Min.   :72.00   0:3
#  1st Qu.:80.00   1:7
#  Median :88.00
#  Mean   :83.33
#  3rd Qu.:89.00
#  Max.   :90.00
#  NA's   :7
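See this common FAQ on converting factors to numeric for why the conversion goes through as.character(): calling as.numeric() directly on a factor returns its internal integer level codes rather than the numbers shown in its labels.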
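A minimal sketch of that pitfall, using a made-up factor (the vector `f` and its values are hypothetical and not taken from `dat`):

```r
# Toy factor whose labels are numbers stored as text (values are made up)
f <- factor(c("70", "82", "77", "70"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(f)

# Going through as.character() recovers the numbers the labels spell out
values <- as.numeric(as.character(f))

# Equivalent alternative: convert the levels once, then index them by the factor
values_alt <- as.numeric(levels(f))[f]

print(codes)   # level codes
print(values)  # original numbers
```

The levels are sorted as "70", "77", "82", so the codes only tell you each value's position among the levels, which is why a mean computed from them would be meaningless.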
If you only want to summarize the columns you mentioned, you can also subset the data:
summary(dat[c("sex",convert_to_numeric,"apoe4any")])
#  sex   age_onset    age_last_visit   age_death       apoe4any
#  1:2   Min.   :63   Min.   :70.00    Min.   :72.00   0:3
#  2:8   1st Qu.:66   1st Qu.:70.25    1st Qu.:80.00   1:7
#        Median :69   Median :75.50    Median :88.00
#        Mean   :70   Mean   :76.70    Mean   :83.33
#        3rd Qu.:73   3rd Qu.:81.00    3rd Qu.:89.00
#        Max.   :79   Max.   :89.00    Max.   :90.00
#        NA's   :6                     NA's   :7
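If you would rather have only the specific numbers from the question (counts per sex, mean ages, and the number of apoe4any carriers) than the full five-number summaries, one possible approach is to compute them directly with base R. The sketch below runs on a small hypothetical data frame, `toy`, which reuses the column names from `dat` but contains made-up values; it assumes the age columns are already numeric and that apoe4any == "1" marks a carrier.

```r
# Hypothetical stand-in for dat: same column names, made-up values
toy <- data.frame(
  IID = c("ID01", "ID02", "ID03", "ID04", "ID05"),
  sex = factor(c("1", "2", "2", "2", "1")),
  age_onset = c(63, NA, 71, 67, NA),
  age_last_visit = c(70, 82, 77, 74, 89),
  age_death = c(NA, 88, NA, 72, 90),
  apoe4any = factor(c("0", "1", "1", "0", "1"))
)

age_cols <- c("age_onset", "age_last_visit", "age_death")

# Counts per sex code (the question does not say which code is male or female)
sex_counts <- table(toy$sex)

# Mean of each age column, ignoring missing values
age_means <- sapply(toy[age_cols], mean, na.rm = TRUE)

# Number of samples with apoe4any equal to "1" (assumed to mean carrier)
n_apoe4 <- sum(toy$apoe4any == "1", na.rm = TRUE)

# Assemble everything into a one-column summary table
summary_table <- data.frame(
  value = c(sex_counts[["1"]], sex_counts[["2"]], unname(age_means), n_apoe4),
  row.names = c("n_sex_1", "n_sex_2", paste0("mean_", age_cols), "n_apoe4any_1")
)
print(summary_table)
```

Replacing `toy` with `dat` (after the numeric conversion above) should give the same layout for the real cohort. Note that na.rm = TRUE is needed because age_onset and age_death contain missing values.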