How to set a seed for a simulation in R so that it is reproducible on Windows OS

Problem description

I ran a simulation in R using the following function:

## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = detectCores())
set.seed(1)
bootstrap1 <- function(n,phi){
  ts <- arima.sim(n,model = list(ar=phi,order = c(1,1,0)),sd = 1)
  #ts <- numeric(n)
  #ts[1] <- rnorm(1)
  #for(i in 2:length(ts))
  #  ts[i] <- 2 * ts[i - 1] + rnorm(1)
  ########################################################
  ## create a vector of block sizes
  t <- length(ts)    # the length of the time series
  lb <- seq(n-2)+1   # vector of block sizes to be 1 < l < n (i.e to be between 1 and n exclusively)
  ########################################################
  ## This section creates a matrix to store the RMSE for each block size
  BOOTSTRAP <- matrix(nrow = 1,ncol = length(lb))
  colnames(BOOTSTRAP) <-lb
  #BOOTSTRAP <- list(length(lb))
  ########################################################
  ## This section uses foreach to run the body in braces once per block size
  BOOTSTRAP <- foreach(b = 1:length(lb),.combine = 'cbind') %dopar%{
    l <- lb[b]# block size at each instance 
    m <- ceiling(t / l)                                 # number of blocks
    blk <- split(ts,rep(1:m,each=l,length.out = t))  # divides the series into blocks
    ######################################################
    res<-sample(blk,replace=T,1000)        # resamples the blocks
    res.unlist <- unlist(res,use.names = FALSE)   # unlist the bootstrap series
    train <- head(res.unlist,round(length(res.unlist) - 10)) # Train set
    test <- tail(res.unlist,length(res.unlist) - length(train)) # Test set
    nfuture <- forecast::forecast(train,model = forecast::auto.arima(train),lambda=0,biasadj=TRUE,h = length(test))$mean        # forecasts over the test-set horizon
    RMSE <- Metrics::rmse(test,nfuture)      # RETURN RMSE
    BOOTSTRAP[b] <- RMSE
  }
  BOOTSTRAPS <- matrix(BOOTSTRAP,nrow = 1,ncol = length(lb))
  colnames(BOOTSTRAPS) <- lb
  BOOTSTRAPS
  return(list("BOOTSTRAPS" = BOOTSTRAPS))
}

I used a for loop to print its result three times.

for (i in 1:3)  { set.seed(1)
  print(bootstrap1(10,0.5))
}

I got the following results:

##            2        3        4         5         6        7         8        9
##[1,] 1.207381 1.447382 1.282099 0.9311434 0.8481634 1.006494 0.9829584 1.205194
##            2        3       4        5         6        7        8        9
##[1,] 1.404846 1.262756 1.50738 1.188452 0.8981125 1.001651 1.349721 1.579556
##            2        3        4        5         6       7         8        9
##[1,] 1.265196 1.080703 1.074807 1.430653 0.9166268 1.12537 0.9492137 1.201763

If I run this several times, I get different results each time.

I want to set the seed so that the three iterations differ from one another, yet re-running the code with the same seed gives the same three results in R.

Solution

We can specify the kind argument in set.seed. If we do this inside the loop, every iteration returns the same values

for (i in 1:3)  {
    set.seed(1,kind = "L'Ecuyer-CMRG")
   print(bootstrap1(10,0.5))
 }
#$BOOTSTRAPS
#            2        3        4        5        6        7        8        9
#[1,] 4.189426 6.428085 3.672116 3.893026 2.685741 3.821201 3.286509 4.062811

#$BOOTSTRAPS
#            2        3        4        5        6        7        8        9
#[1,] 4.189426 6.428085 3.672116 3.893026 2.685741 3.821201 3.286509 4.062811
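The effect of reseeding inside the loop can be seen without any parallel code. The following is a minimal serial sketch, assuming that rnorm() stands in for the expensive bootstrap1() call; the variable names are illustrative only.

```r
# Reseeding inside the loop restarts the generator on every iteration,
# so each iteration produces exactly the same draws.
old_kind <- RNGkind()                      # remember the current RNG settings

inside <- vector("list", 3)
for (i in 1:3) {
  set.seed(1, kind = "L'Ecuyer-CMRG")
  inside[[i]] <- rnorm(3)                  # stand-in for bootstrap1(10, 0.5)
}

same_12 <- identical(inside[[1]], inside[[2]])
same_23 <- identical(inside[[2]], inside[[3]])
print(c(same_12, same_23))

do.call(RNGkind, as.list(old_kind))        # restore the previous RNG settings
```

Note that set.seed(..., kind = ...) changes the generator for the rest of the session unless it is restored, which is why the sketch saves and restores RNGkind().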

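If you want each iteration of the for loop to return different values, and subsequent runs to reproduce the same results, call set.seed once, outside the loop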

1) First run

set.seed(1,kind = "L'Ecuyer-CMRG")
for (i in 1:3)  {    
    print(bootstrap1(10,0.5))
  }
#$BOOTSTRAPS
#            2        3        4        5        6        7        8        9
#[1,] 4.189426 6.428085 3.672116 3.893026 2.685741 3.821201 3.286509 4.062811

#$BOOTSTRAPS
#            2        3        4       5        6        7        8        9
#[1,] 1.476428 1.806258 2.071091 2.09906 2.014298 1.032776 2.573738 1.831142

#$BOOTSTRAPS
#            2        3        4        5       6        7        8        9
#[1,] 2.248546 1.838302 2.345557 1.696614 2.06357 1.502569 1.912556 1.906049

2) Second run

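set.seed(1,kind = "L'Ecuyer-CMRG")
for (i in 1:3)  {    
    print(bootstrap1(10,0.5))
  }
# ... (earlier output lines are missing here)
#[1,] 2.248546 1.838302 2.345557 1.696614 2.06357 1.502569 1.912556 1.906049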
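The pattern of seeding once before the loop can be checked in a small serial sketch. It assumes rnorm() as a stand-in for bootstrap1(), and run_three() is a hypothetical helper that is not part of the question's code.

```r
# Seed once, then draw three times: iterations differ from each other,
# but a second run with the same seed reproduces all three.
old_kind <- RNGkind()                      # remember the current RNG settings

run_three <- function(seed) {              # hypothetical helper for illustration
  set.seed(seed, kind = "L'Ecuyer-CMRG")
  lapply(1:3, function(i) rnorm(3))        # stand-in for bootstrap1(10, 0.5)
}

first_run  <- run_three(1)
second_run <- run_three(1)

runs_match      <- identical(first_run, second_run)
iterations_vary <- !identical(first_run[[1]], first_run[[2]])
print(c(runs_match, iterations_vary))

do.call(RNGkind, as.list(old_kind))        # restore the previous RNG settings
```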

According to ?set.seed

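"L'Ecuyer-CMRG": A 'combined multiple-recursive generator' from L'Ecuyer (1999), each element of which is a feedback multiplicative generator with three integer elements: thus the seed is a (signed) integer vector of length 6. The period is around 2^191. The 6 elements of the seed are internally regarded as 32-bit unsigned integers. Neither the first three nor the last three should be all zero, and they are limited to less than 4294967087 and 4294944443 respectively. This is not particularly interesting of itself, but provides the basis for the multiple streams used in package parallel.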
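To make the "multiple streams" remark concrete, here is a minimal sketch that derives a second stream from a seed with parallel::nextRNGStream(). The helper draw_from() is hypothetical and only for illustration; it overwrites .Random.seed in the global environment, which real code would normally avoid.

```r
library(parallel)

old_kind <- RNGkind()                      # remember the current RNG settings
RNGkind("L'Ecuyer-CMRG")
set.seed(1)

stream1 <- .Random.seed                    # RNG kind code followed by the 6 seed elements
stream2 <- nextRNGStream(stream1)          # seed of the next stream

draw_from <- function(seed) {              # hypothetical helper for illustration
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(3)
}

x1 <- draw_from(stream1)
x2 <- draw_from(stream2)
x1_again <- draw_from(stream1)

stream_reproducible <- identical(x1, x1_again)
streams_differ      <- !identical(x1, x2)
print(c(stream_reproducible, streams_differ))

do.call(RNGkind, as.list(old_kind))        # restore the previous RNG settings
```

For workers created with makeCluster(), which is the cluster type available on Windows, parallel::clusterSetRNGStream(cl, iseed) assigns one such stream to each worker, so it may be worth trying if seeding only the main session does not give reproducible results.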
