Subsetting columns of a df in R using a for loop

Problem description

I want to split a dataset into several smaller dfs, using column names in a for loop.

Create some data (there may well be an easier way to do this...)

Line1 <- c(1:7)
Line2 <- c(3:9)
df <- rbind(Line1,Line2)
ColumnNames <- c(paste0("Var",1:7))
colnames(df) <- ColumnNames

As output, I would like to have three data frames: Df_Sub1, Df_Sub2 and Df_Sub3.

Sub1 <- c("Var1","Var2","Var3")
Sub2 <- c("Var3","Var4")
Sub3 <- c("Var5","Var6","Var7")

Subs <- c("Sub1","Sub2","Sub3")

For loop to create the three subsets

for (Sub in Subs) { 
  name <- as.name(paste0("df_",Sub))
  df_ <- df[,colnames(df) %in% get(Sub)]
}
 

How do I append the name of each of the three Subs after df_ (or do something else to make this work)?

Solution

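We can use mget to return the values of the objects in a list, loop over that list to select the corresponding columns of 'df', and then use list2env to create the resulting objects in the global environment:

list2env(lapply(setNames(mget(Subs),paste0("Df_",Subs)),function(x) df[,x]),.GlobalEnv)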
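To see what each piece of the one-liner contributes, the steps can be unpacked as in the sketch below. It rebuilds the example data so that it runs on its own. The intermediate names `cols`, `subsets` and `env` are hypothetical and not part of the original answer, and a fresh environment is used instead of `.GlobalEnv` only to keep the demonstration free of side effects.

```r
# Rebuild the example data from the question (a 2 x 7 matrix).
df <- rbind(Line1 = 1:7, Line2 = 3:9)
colnames(df) <- paste0("Var", 1:7)

Sub1 <- c("Var1", "Var2", "Var3")
Sub2 <- c("Var3", "Var4")
Sub3 <- c("Var5", "Var6", "Var7")
Subs <- c("Sub1", "Sub2", "Sub3")

# Step 1: mget() looks up the objects named in Subs and returns them as a named list.
cols <- mget(Subs)

# Step 2: rename the list elements to the desired object names.
cols <- setNames(cols, paste0("Df_", Subs))

# Step 3: select the matching columns of df for each element
# (drop = FALSE keeps the result two-dimensional even for a single column).
subsets <- lapply(cols, function(x) df[, x, drop = FALSE])

# Step 4: list2env() creates one object per list element in the target environment.
env <- new.env()
list2env(subsets, envir = env)

# List the objects that were created.
ls(env)
```

Replacing `env` with `.GlobalEnv` in step 4 would make `Df_Sub1`, `Df_Sub2` and `Df_Sub3` available directly in the workspace, as in the one-liner.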

A loop-based option involves storing the vectors in a list:

#Data
Line1 <- c(1:7)
Line2 <- c(3:9)
df <- rbind(Line1,Line2)
ColumnNames <- c(paste0("Var",1:7))
colnames(df)=ColumnNames
#Data 2
Sub1 <- c("Var1","Var2","Var3")
Sub2 <- c("Var3","Var4")
Sub3 <- c("Var5","Var6","Var7")
Subs <- c("Sub1","Sub2","Sub3")
#Store in a list
List <- list(Sub1=Sub1,Sub2=Sub2,Sub3=Sub3)
#List for data
List2 <- list()
#Loop
for(i in Subs)
{
  List2[[i]] <- df[,List[[i]],drop=FALSE]
}
#Format names
names(List2) <- paste0('df_',names(List2))
#Set to envir
list2env(List2,envir = .GlobalEnv)

Output:

df_Sub1
      Var1 Var2 Var3
Line1    1    2    3
Line2    3    4    5

df_Sub2
      Var3 Var4
Line1    3    4
Line2    5    6

df_Sub3
      Var5 Var6 Var7
Line1    5    6    7
Line2    7    8    9
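
The loop from the question can also be made to work almost as written with base R's `assign()`, which binds a value to a name given as a string. This is an alternative to the two approaches above rather than part of the original answers; the sketch below rebuilds the example data so that it runs on its own.

```r
# Rebuild the example data from the question (a 2 x 7 matrix).
df <- rbind(Line1 = 1:7, Line2 = 3:9)
colnames(df) <- paste0("Var", 1:7)

Sub1 <- c("Var1", "Var2", "Var3")
Sub2 <- c("Var3", "Var4")
Sub3 <- c("Var5", "Var6", "Var7")
Subs <- c("Sub1", "Sub2", "Sub3")

for (Sub in Subs) {
  # get() fetches the vector of column names stored under the name in Sub;
  # assign() then binds the selected columns to an object called "df_<Sub>".
  assign(paste0("df_", Sub), df[, get(Sub), drop = FALSE])
}

# The subsets now exist as separate objects in the current environment.
df_Sub2
```

Creating objects by name this way is convenient for a handful of subsets. When there are many, keeping them in a named list such as `List2` is usually easier to iterate over.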