Assigning the output of a repeated loop to one data frame

Problem

I have a data frame called train. It looks like this:

> str(train)
Classes ‘data.table’ and 'data.frame':  4096 obs. of  2 variables:
 $ XY   : chr  "-0.253056407416539,0.0284760501437887" "-0.248966337417195,0.0327728517259305" "-0.244876267417851,0.0376197657997918" "-0.240786197418507,0.0430736699343487" ...
 $ Group: chr  "fa05,1" "fa05,1" ...
 - attr(*,".internal.selfref")=<externalptr> 

> head(train)
                                      XY  Group
1: -0.253056407416539,0.0284760501437887 fa05,1
2: -0.248966337417195,0.0327728517259305 fa05,1
3: -0.244876267417851,0.0376197657997918 fa05,1
4: -0.240786197418507,0.0430736699343487 fa05,1
5: -0.236696127419163,0.0492435986076443 fa05,1
6: -0.232606057419819,0.0562149950068869 fa05,1

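I wrote code that resamples the XY column within each group, splits its values on "," into two separate columns, converts them to numeric, and then takes the mean of the X and Y columns for each group. It works perfectly, and the output looks like this: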

    Group.1         X         Y
1    fa05,0 0.3174567 1.1083954
2    fa05,1 0.2857464 1.0411072
3    fa10,0 0.2987560 1.1765904
4    fa10,1 0.2563579 1.1286934
5    fa20,0 0.3204026 1.0703147
6    fa20,1 0.2597907 1.1629019
7  flatfa,0 0.3191444 1.0399517
8  flatfa,1 0.2532680 1.1957248
9  flatsa,0 0.3252190 1.0506540
10 flatsa,1 0.3124151 0.8458343
11   sa05,0 0.2792419 1.1065144
12   sa05,1 0.2186174 1.2720533
13   sa10,0 0.3071584 1.3031327
14   sa10,1 0.3134321 1.0493272
15   sa20,0 0.3134320 1.1239246
16   sa20,1 0.2919554 1.2797494

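Now I am trying to do this in a loop so that it is repeated 10 times and the results are all stored in the same data frame. I came up with this: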

boot_means <- data.frame(Group.1 = rep(c(""),each=16*10),X = rep(c(as.numeric("")),each=16*10),Y = rep(c(as.numeric("")),each=16*10))

for (i in 1:10){
  train_resample <- setDT(train)[,.(XY=sample(XY,replace=T)),by = Group]
  train_sep <- train_resample %>% separate(XY,c("X","Y"),",") 
  train_sep$X <- as.numeric(train_sep$X)
  train_sep$Y <- as.numeric(train_sep$Y)
  resample_means <- aggregate(train_sep[,2:3],list(train_sep$Group),mean)
  print(resample_means)
  boot_means[i] <- resample_means
}

print(resample_means) works: there I get the expected output. But when I look at boot_means, the loop has assigned the Group variable to every column.

> head(boot_means)
  Group.1      X      Y
1  fa05,0 fa05,0
2  fa05,1 fa05,1
3  fa10,0 fa10,0
4  fa10,1 fa10,1
5  fa20,0 fa20,0
6  fa20,1 fa20,1

That is not what I want! Can you help me?

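Solution

boot_means[i] <- resample_means targets column i of boot_means, and only the first column of resample_means (Group.1) is used, so the group labels end up in every column. Make boot_means a list instead and store each data frame in it.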
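The single-bracket assignment from the question can be reproduced on a toy example. This is only a sketch: `target` and `value` are hypothetical stand-ins for boot_means and resample_means, and the values are made up.

```r
# Hypothetical stand-ins for boot_means (target) and resample_means (value).
target <- data.frame(a = rep("", 4), b = rep(NA_real_, 4),
                     stringsAsFactors = FALSE)
value  <- data.frame(g = c("u", "v"), x = c(1.5, 2.5),
                     stringsAsFactors = FALSE)

# A single index on a data frame selects a column, so this replaces column 2.
# R warns that 2 variables were provided to replace 1, keeps only the first
# one (g), and recycles it to fill all 4 rows.
target[2] <- value

str(target)
```

Column b now holds the labels from g rather than the numbers from x, which matches the symptom seen in boot_means.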

library(data.table)
library(tidyr)  # provides separate() and re-exports %>%

# One list slot per bootstrap replicate
boot_means <- vector('list', 10)

for (i in 1:10) {
  # Resample XY with replacement within each group
  train_resample <- setDT(train)[, .(XY = sample(XY, replace = TRUE)), by = Group]
  train_sep <- train_resample %>% separate(XY, c("X", "Y"), ",")
  train_sep$X <- as.numeric(train_sep$X)
  train_sep$Y <- as.numeric(train_sep$Y)
  resample_means <- aggregate(train_sep[, 2:3], list(train_sep$Group), mean)
  boot_means[[i]] <- resample_means
}

# If you want everything in one data frame:
combined_data <- rbindlist(boot_means)
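
As a possible variation, the same list-then-bind idea can be written with data.table alone. The sketch below runs on made-up toy data (`toy`, `n_boot`, and the `replicate` column name are assumptions for illustration, not part of the original code). It splits XY once up front and resamples row indices within each group, which gives the same kind of bootstrap means while keeping each X paired with its Y.

```r
library(data.table)

set.seed(1)  # only so the sketch is reproducible

# Hypothetical toy data in the same shape as `train`.
toy <- data.table(
  XY    = paste(round(rnorm(40), 3), round(rnorm(40), 3), sep = ","),
  Group = rep(c("fa05,0", "fa05,1"), each = 20)
)

# Split "x,y" into two numeric columns once, before any resampling.
toy[, c("X", "Y") := lapply(tstrsplit(XY, ",", fixed = TRUE), as.numeric)]

n_boot <- 10
boot_list <- lapply(seq_len(n_boot), function(i) {
  # Within each group, draw row indices with replacement and average.
  toy[, {
    idx <- sample(.N, replace = TRUE)
    list(X = mean(X[idx]), Y = mean(Y[idx]))
  }, by = Group]
})

# idcol adds a column recording which replicate each row came from.
boot_means <- rbindlist(boot_list, idcol = "replicate")
head(boot_means)
```

Keeping the replicate number in its own column makes it easy to summarise later, for example the spread of X per group across replicates.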