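Problem description
I want to create a loop that stores the output of t-tests for several variables in a data frame. However, when I store the variable names in a character vector, they cannot be used in the t-test because they are kept as quoted strings. For example, in the loop R takes the first variable as "variable_1", which produces an error, because the t-test needs the unquoted variable, e.g. t.test(variable_1 ~ Gender). Does anyone know how to remove the quotes from the variable names in the vector?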
variable <- c("variable_1","variable_2","variable_3")
df <- data.frame(t_value=as.numeric(),df=as.numeric(),p_value= as.numeric(),mean_f= as.numeric(),mean_m= as.numeric())
attach(data)
for(v in variable){
output <- t.test(v ~ Gender)
values <- output[c(1,2,3,5)]
row <- round(unlist(values,use.names = FALSE),3)
df <- rbind(df,row)
}
Solution
Here is a more modern approach using non-standard evaluation and `purrr`. I have put the logic of your loop into a function that is called for every entry of `variable`. Inside the function, the value of `v` (which is a string) is converted to a symbol. This is your variable name. The variable is then evaluated in the context of the data.frame supplied through the `data` argument.
Using @Chuck P's example data, the approach is applied in the same way.
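A minimal sketch of the symbol-based idea described above. The answer's original code did not survive, so this is an illustration under stated assumptions, not the author's code: the helper name `t_test_column`, its arguments, and the output column names are hypothetical, and base `lapply()` is used to stay dependency-free (`purrr::map()` should be a drop-in replacement for it).

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer):
# runs a Welch t-test of one column against a two-level grouping column.
t_test_column <- function(v, group, data) {
  x <- eval(as.symbol(v), data)      # string -> symbol, evaluated inside `data`
  g <- eval(as.symbol(group), data)  # same for the grouping column
  out <- t.test(x ~ g)
  data.frame(
    variable = v,
    t_value  = unname(out$statistic),
    df       = unname(out$parameter),
    p_value  = out$p.value,
    mean_g1  = unname(out$estimate[1]),
    mean_g2  = unname(out$estimate[2])
  )
}

variable <- c("mpg", "hp")

# Call the helper once per variable name, then bind the one-row frames together.
rows <- lapply(variable, t_test_column, group = "am", data = mtcars)
result <- do.call(rbind, rows)
result
```

Because the column name stays a string until it is turned into a symbol inside the helper, the vector of names can be extended without touching the test logic.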
,
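A few changes will make it work via `get`. As others have pointed out, `attach` is a terrible idea in this case, so I have left it out and used `mtcars` as the example.
I made a few other changes to make things work as well as possible. You would be better served by searching the site for the many existing answers on running t-tests over multiple variables, or by the answers from @starja or @r2evans.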
variable <- c("mpg","hp")
df <- data.frame(t_value=as.numeric(),df=as.numeric(),p_value= as.numeric(),mean_f= as.numeric(),mean_m= as.numeric())
for(v in variable){
output <- t.test(get(v) ~ am,data = mtcars)
values <- output[c(1,2,3,5)]
row <- round(unlist(values,use.names = FALSE),3)
df_row <- data.frame(t_value=row[[1]],df=row[[2]],p_value= row[[3]],mean_f= row[[4]],mean_m= row[[5]])
df <- rbind(df,df_row)
}
df
#> t_value df p_value mean_f mean_m
#> 1 -3.767 18.332 0.001 17.147 24.392
#> 2 1.266 18.715 0.221 160.263 126.846
,
If you need to compare one variable against all (or some) of the other variables in a frame, it should look like this:
vars <- c("cyl","disp","hp","gear")
do.call(
rbind.data.frame,lapply(setNames(nm = vars),function(nm) {
out <- t.test(mtcars[["mpg"]],mtcars[[nm]])
c(out[c(1,2,3)],out[[5]])
})
)
# statistic parameter p.value mean.of.x mean.of.y
# cyl 12.51163 36.40239 9.507708e-15 20.09062 6.1875
# disp -9.60236 31.14661 7.978234e-11 20.09062 230.7219
# hp -10.40489 31.47905 1.030354e-11 20.09062 146.6875
# gear 15.28179 31.92893 3.077106e-16 20.09062 3.6875
If you need to compare various pairs (not just one variable against the rest), then perhaps something like
vars <- c("mpg","cyl","disp","hp","gear")
eg <- expand.grid(vars,vars,stringsAsFactors = FALSE)
eg <- eg[ eg[,1] != eg[,2],]
head(eg)
# Var1 Var2
# 2 cyl mpg
# 3 disp mpg
# 4 hp mpg
# 5 gear mpg
# 6 mpg cyl
# 8 disp cyl
ret <- do.call(
  rbind.data.frame,
  Map(function(x,y) {
    # x and y are the two columns of mtcars named in one row of eg
    out <- t.test(x,y)
    c(out[c(1,2,3)],out[[5]])
  },mtcars[eg[,1]],mtcars[eg[,2]])
)
ret <- cbind(eg,ret)
head(ret)
# Var1 Var2 statistic parameter p.value mean.of.x mean.of.y
# 2 cyl mpg -12.51163 36.40239 9.507708e-15 6.18750 20.09062
# 3 disp mpg 9.60236 31.14661 7.978234e-11 230.72188 20.09062
# 4 hp mpg 10.40489 31.47905 1.030354e-11 146.68750 20.09062
# 5 gear mpg -15.28179 31.92893 3.077106e-16 3.68750 20.09062
# 6 mpg cyl 12.51163 36.40239 9.507708e-15 20.09062 6.18750
# 8 disp cyl 10.24721 31.01287 1.774454e-11 230.72188 6.18750
---
Note:
1. Iteratively building a frame row by row works fine logically and in small doses, but in the long run it performs very poorly: it makes a complete copy of the whole frame with each added row, which is memory-inefficient (and slow).
2. The use of `attach` is discouraged, as I said in my comment. `get` should be avoided as well, though perhaps to a lesser degree than `attach`.
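Both points in the note can be sketched together. This is one possible pattern rather than the only one: results are collected in a list and bound once at the end instead of growing a frame inside the loop, and the formula is built with `reformulate()` so that neither `attach` nor `get` is needed. The output column names are illustrative assumptions.

```r
vars <- c("mpg", "hp")

# Build one single-row data frame per variable; nothing is grown inside the loop.
rows <- lapply(vars, function(v) {
  f <- reformulate("am", response = v)  # builds e.g. mpg ~ am from strings
  out <- t.test(f, data = mtcars)
  data.frame(
    variable = v,
    t_value  = unname(out$statistic),
    df       = unname(out$parameter),
    p_value  = out$p.value,
    mean_0   = unname(out$estimate[1]),
    mean_1   = unname(out$estimate[2])
  )
})

# Bind all rows in a single step.
res <- do.call(rbind, rows)
res
```

Binding once at the end copies the data a single time, whereas calling `rbind` inside the loop copies the whole accumulated frame on every iteration.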