Problem description
I have a table like this:
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
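In principle, two rows always belong together as a pair. For each sample, I need to subtract the post-phase dist_mean from the pre-phase dist_mean. The simple approach would be to subtract row 2 from row 1, and so on. But the row order could get shuffled at any point, and then the whole calculation would be wrong. That is why I want the calculation to be conditional on treatment and individual matching across the two phases.
Note: treatment varies. It is not always control.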
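To make the matching requirement concrete, here is a minimal base R sketch that is not taken from any of the answers. It splits the rows by phase and joins them on treatment and individual with merge, so the result does not depend on row order. The names shuffled, keys, pre, post, and paired are illustrative, and the sketch assumes every treatment/individual pair has exactly one pre row and one post row.

```r
df <- read.table(text = "
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
", header = TRUE)

# Reverse the rows to show that the original row order does not matter
shuffled <- df[nrow(df):1, ]

keys <- c("treatment", "individual")
pre  <- shuffled[shuffled$phase == "pre",  c(keys, "dist_mean")]
post <- shuffled[shuffled$phase == "post", c(keys, "dist_mean")]

# Join pre and post rows on the keys; the suffixes label the two dist_mean columns
paired <- merge(pre, post, by = keys, suffixes = c("_pre", "_post"))
paired$diff_dist_mean <- paired$dist_mean_pre - paired$dist_mean_post
paired
```

Because the join is keyed on treatment and individual, a pair that is missing one of its phases simply drops out of paired instead of being subtracted from the wrong row.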
Solution
A data.table option
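library(data.table)
# Sorting puts "post" before "pre" within each pair, so diff() returns pre - post
setDT(df)[
  order(treatment, individual, phase)
][, setNames(lapply(.SD, diff), paste0("diff_", names(.SD))),
  by = .(treatment, individual), .SDcols = c("dist_mean", "track")
]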
gives
treatment individual diff_dist_mean diff_track
1: control 1 2.38 -54.66
2: control 2 -0.18 23.47
3: control 3 -0.80 -262.73
A base R option using reshape
transform(
  reshape(
    df, direction = "wide", idvar = c("treatment", "individual"), timevar = "phase"
  ),
  diff_dist_mean = dist_mean.pre - dist_mean.post,
  diff_track = track.pre - track.post
)
gives
treatment individual dist_mean.pre track.pre dist_mean.post track.post
1 control 1 13.33 569.99 10.95 624.65
3 control 2 9.93 363.35 10.11 339.88
5 control 3 12.00 676.42 12.80 939.15
diff_dist_mean diff_track
1 2.38 -54.66
3 -0.18 23.47
5 -0.80 -262.73
Using aggregate:
# Assumes the pre row precedes the post row within each pair; diff(rev(x)) then gives pre - post
aggregate(dist_mean ~ treatment + individual, df1, function(x) diff(rev(x)))
# treatment individual dist_mean
#1 control 1 2.38
#2 control 2 -0.18
#3 control 3 -0.80
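The aggregate call above depends on the pre row coming before the post row within each pair, which is the ordering the question worries about. A hedged variant, assuming phase only takes the values pre and post, sorts the data first so that post precedes pre alphabetically and then applies diff directly. The name df1_sorted is illustrative, and the row scrambling is only there to demonstrate the point.

```r
df1 <- read.table(text = "
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
", header = TRUE)

# Scramble the rows so that post comes before pre in the raw data
df1 <- df1[c(2, 1, 4, 3, 6, 5), ]

# Sort by the keys and then by phase: "post" sorts before "pre"
df1_sorted <- df1[order(df1$treatment, df1$individual, df1$phase), ]

# diff() on (post, pre) returns pre - post for each column on the left-hand side
res <- aggregate(cbind(dist_mean, track) ~ treatment + individual, df1_sorted, diff)
res
```

Sorting explicitly keeps the one-line aggregate style while removing the dependence on how the rows happen to arrive.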
Data
df1 <- read.table(text = "
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
",header = TRUE)
df <- read.table(text = " treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15",header = T)
library(tidyverse)
df %>%
  # One row per treatment/individual pair, with pre and post as columns
  pivot_wider(id_cols = c(treatment, individual), names_from = phase, values_from = dist_mean) %>%
  mutate(d = post - pre)
#> # A tibble: 3 x 5
#> treatment individual pre post d
#> <chr> <int> <dbl> <dbl> <dbl>
#> 1 control 1 13.3 11.0 -2.38
#> 2 control 2 9.93 10.1 0.180
#> 3 control 3 12 12.8 0.8
Created on 2021-03-09 by the reprex package (v1.0.0)
data.table
df <- read.table(text = " treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15",header = T)
library(data.table)
setDT(df)
res <- dcast(data = df,formula = treatment + individual ~ phase,value.var = "dist_mean")[,d := post - pre]
head(res)
#> treatment individual post pre d
#> 1: control 1 10.95 13.33 -2.38
#> 2: control 2 10.11 9.93 0.18
#> 3: control 3 12.80 12.00 0.80
Using data.table, reshape from long to wide, then take the difference of the post and pre columns:
library(data.table)
setDT(df1)
# Cast to wide format; the value columns are named <variable>_<phase>
dcast(df1, treatment + individual ~ phase, value.var = c("dist_mean", "track")
)[, .(treatment, individual,
      diff_dist_mean = dist_mean_post - dist_mean_pre,
      diff_track = track_post - track_pre)]
# treatment individual diff_dist_mean diff_track
# 1: control 1 -2.38 54.66
# 2: control 2 0.18 -23.47
# 3: control 3 0.80 262.73
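All of the reshape-based answers assume that each treatment/individual pair has exactly one pre row and one post row. The following is a minimal base R sketch for checking that assumption before computing differences. The names key and complete are illustrative and not part of any answer.

```r
df <- read.table(text = "
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
", header = TRUE)

# One grouping level per treatment/individual combination that actually occurs
key <- interaction(df$treatment, df$individual, drop = TRUE)

# A pair is complete when it has exactly two rows, one "pre" and one "post"
complete <- vapply(
  split(as.character(df$phase), key),
  function(p) length(p) == 2 && setequal(p, c("pre", "post")),
  logical(1)
)

all(complete)               # TRUE when every pair is complete
names(complete)[!complete]  # the pairs that fail the check, if any
```

Running this first makes it easier to spot a missing or duplicated phase before it turns into an NA or an unexpected aggregate in the wide table.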