How do I load a custom function into a foreach loop in R?

Problem description

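I am trying to run a gls model with a specific spatial correlation structure. The structure comes from modifying the nlme package by building new functions in the global environment, following this post (the answer to that post creates new functions that allow the correlation structure to be implemented). Unfortunately, I can't get this spatial correlation structure to work when I run it through a foreach loop: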

# set up example data
library(foreach)  # provides foreach() and %dopar%
data("mtcars")
mtcars$lon = runif(nrow(mtcars)) # include lon and lat for the new correlation structure
mtcars$lat = runif(nrow(mtcars))
mtcars$marker = c(rep(1, nrow(mtcars)/2), rep(2, nrow(mtcars)/2)) # values for iterations

# set up cluster
parallel::detectCores()
cl <- parallel::makeCluster(6, setup_strategy = "sequential")
doParallel::registerDoParallel(cl)

# run model
list_models <- foreach(i = 1:2, .packages = c('nlme'), .combine = cbind, .export = ls(.GlobalEnv)) %dopar% {

  .GlobalEnv$i <- i

  model_trial <- gls(disp ~ wt, correlation = corHaversine(form = ~lon + lat, mimic = "corSpher"), data = mtcars)
}

parallel::stopCluster(cl)

When I run it, I get the error message:

Error in { : 
  task 1 failed - "do not know how to calculate correlation matrix of “corHaversine” object"
In addition: Warning message:
In e$fun(obj, substitute(ex), parent.frame(), e$data) :
  already exporting variable(s): corHaversine, mtcars, path_df1

In a normal loop, the model runs fine with the correlation structure added:

correlation = corHaversine(form = ~lon + lat, mimic = "corSpher")

Any help would be greatly appreciated!

Solution

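I'm not sure why your foreach approach doesn't work, and I'm also not sure what you are actually calculating. In any case, you could try this alternative with parallel::parLapply(), which seems to work: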

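First, I cleared the workspace using rm(list=ls()). Then I ran the entire first code block of this answer, where they create the "corStruct" class and the corHaversine method, so that these are in the workspace along with the data below, ready for clusterExport().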

library(parallel)
cl <- makeCluster(detectCores() - 1)
clusterEvalQ(cl,library(nlme))
clusterExport(cl,ls())
r <- parLapply(cl=cl,X=1:2,fun=function(i) {
  gls(disp ~ wt,correlation=corHaversine(form= ~ lon + lat,mimic="corSpher"),data=mtcars)
})
stopCluster(cl)  ## stop cluster
r  ## result
# [[1]]
# Generalized least squares fit by REML
# Model: disp ~ wt 
# Data: mtcars 
# Log-restricted-likelihood: -166.6083
# 
# Coefficients:
#   (Intercept)          wt 
# -122.4464    110.9652 
# 
# Correlation Structure: corHaversine
# Formula: ~lon + lat 
# Parameter estimate(s):
#   range 
# 10.24478 
# Degrees of freedom: 32 total; 30 residual
# Residual standard error: 58.19052 
# 
# [[2]]
# Generalized least squares fit by REML
# Model: disp ~ wt 
# Data: mtcars 
# Log-restricted-likelihood: -166.6083
# 
# Coefficients:
#   (Intercept)          wt 
# -122.4464    110.9652 
# 
# Correlation Structure: corHaversine
# Formula: ~lon + lat 
# Parameter estimate(s):
#   range 
# 10.24478 
# Degrees of freedom: 32 total; 30 residual
# Residual standard error: 58.19052 
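
If you want to confirm that clusterExport() really put a method on the workers before fitting anything, a quick check like the one below may help. It is only a sketch: format.toy_shape is a hypothetical stand-in for a custom S3 method such as the ones defined for corHaversine, and the two-worker cluster is an arbitrary choice.

```r
library(parallel)

# Hypothetical stand-in for a custom S3 method defined in the workspace
format.toy_shape <- function(x, ...) "toy_shape"

cl <- makeCluster(2)

# Fresh workers start with an empty global environment
before <- unlist(clusterEvalQ(cl, exists("format.toy_shape", envir = globalenv())))

# Copy the method into each worker's global environment
clusterExport(cl, "format.toy_shape", envir = environment())

after <- unlist(clusterEvalQ(cl, exists("format.toy_shape", envir = globalenv())))

stopCluster(cl)

before  # one logical per worker, expected FALSE
after   # one logical per worker, expected TRUE
```

With the real functions you would pass ls() (or the specific method names) to clusterExport() instead of the toy name.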

Data:

set.seed(42)  ## for sake of reproducibility
mtcars <- within(mtcars,{
  lon <- runif(nrow(mtcars))
  lat <- runif(nrow(mtcars))
  marker <- c(rep(1,nrow(mtcars)/2),rep(2,nrow(mtcars)/2))
})
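
If you would rather keep foreach, one thing to try is the same clusterExport() step before the loop. My guess, which I have not verified against your setup, is that foreach's .export places objects in a separate evaluation environment rather than in each worker's global environment, so S3 dispatch from inside nlme never sees the custom methods. The sketch below uses hypothetical toy functions (new_shape and format.toy_shape) in place of corHaversine and its methods, so it runs without the code from the linked answer.

```r
library(parallel)
library(foreach)
library(doParallel)

# Hypothetical stand-ins for the custom constructor and one of its S3 methods
new_shape <- function(r) structure(list(r = r), class = "toy_shape")
format.toy_shape <- function(x, ...) paste0("toy_shape(r = ", x$r, ")")

cl <- makeCluster(2)
registerDoParallel(cl)

# Copy constructor and method into each worker's global environment,
# which is the same step the parLapply() version relies on
clusterExport(cl, c("new_shape", "format.toy_shape"), envir = environment())

res <- foreach(i = 1:2, .combine = c) %dopar% {
  format(new_shape(i))  # dispatches to format.toy_shape on the worker
}

stopCluster(cl)
res
```

For the gls case you would keep .packages = 'nlme' in the foreach() call and export the corHaversine functions in place of the toy ones.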