Problem description
NumPy's documentation on Parallel Random Number Generation shows how to use SeedSequence to spawn child seeds (see below).
from numpy.random import SeedSequence, default_rng
ss = SeedSequence(12345)
# Spawn off 10 child SeedSequences to pass to child processes.
child_seeds = ss.spawn(10)
streams = [default_rng(s) for s in child_seeds]
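Child SeedSequence objects can also spawn to make grandchildren, and so on. Each SeedSequence has its position in the tree of spawned SeedSequence objects mixed in with the user-provided seed to generate independent (with very high probability) streams.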
grandchildren = child_seeds[0].spawn(4)
grand_streams = [default_rng(s) for s in grandchildren]
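To make the "position in the tree" idea concrete, here is a minimal sketch that inspects the spawn_key attribute of each SeedSequence. It reuses the seed 12345 and the variable names from the example above; the particular indices printed are chosen only for illustration.

```python
from numpy.random import SeedSequence

ss = SeedSequence(12345)
child_seeds = ss.spawn(10)
grandchildren = child_seeds[0].spawn(4)

# spawn_key records the path from the root to each node in the tree.
print(ss.spawn_key)                # ()
print(child_seeds[3].spawn_key)    # (3,)
print(grandchildren[2].spawn_key)  # (0, 2)

# Every node keeps the root entropy; only the spawn_key differs.
print(grandchildren[2].entropy)    # 12345
```

Because the spawn_key is mixed in with the entropy, two nodes at different positions produce different streams even though they share the same user-provided seed.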
My question
To create the next generation of seeds, should I use:
great_grandchildren = grandchildren[0].spawn(4)
great_grand_streams = [default_rng(s) for s in great_grandchildren]
or should I always refer back to child_seeds[0]:
great_grandchildren = child_seeds[0].spawn(4)
great_grand_streams = [default_rng(s) for s in great_grandchildren]
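The context of my question is a function built around a concurrent.futures.ProcessPoolExecutor object that uses one seed per process inside a while loop (possibly an "endless" one). I would like to know whether the following is the correct way to spawn seeds from SeedSequence, assuming I have already consumed the grandchildren and grand_streams mentioned in the NumPy example. For example: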
from numpy.random import SeedSequence,default_rng
ss = SeedSequence(12345)
# Spawn off 10 child SeedSequences to pass to child processes.
child_seeds = ss.spawn(10)
streams = [default_rng(s) for s in child_seeds]
run_func1(streams)  # child_seeds is consumed
grandchildren = child_seeds[0].spawn(4)
grand_streams = [default_rng(s) for s in grandchildren]
while True:
    run_concurrent_futures_ProcesspoolExecutor_func(grand_streams)
    if condition_not_met:
        grandchildren = grandchildren[0].spawn(4)  # Do I use grandchildren[0] or child_seeds[0] to ensure randomness?
        grand_streams = [default_rng(s) for s in grandchildren]
    else:
        break
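Solution
It doesn't matter. You are building a tree, and its shape is irrelevant; the only difference is whether the tree ends up three levels deep or two. spawn is intended to create independent RNGs for parallel processes. However, your loop is not parallel: it is sequential, because you check the condition on every iteration. So either approach works.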
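As a rough check of why both options are safe, the sketch below compares the spawn keys they produce. It relies on SeedSequence tracking how many children it has already handed out (its n_children_spawned attribute), so repeated calls to spawn on the same parent do not repeat earlier children. The names parent, first, second and deeper are illustrative only.

```python
from numpy.random import SeedSequence

parent = SeedSequence(12345).spawn(10)[0]

# Option A: keep spawning from the same parent on every iteration.
first = parent.spawn(4)
second = parent.spawn(4)
keys_a = [s.spawn_key for s in first + second]
print(parent.n_children_spawned)  # 8

# Option B: spawn from the first seed of the previous generation.
deeper = first[0].spawn(4)
keys_b = [s.spawn_key for s in deeper]

# No position in the tree is visited twice by either option.
all_keys = keys_a + keys_b
assert len(set(all_keys)) == len(all_keys)
```

Option A widens the tree under one node, option B deepens it; in both cases each new seed sits at a position that has not been used before.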
Note that you can keep spawning new sequences from any sequence, including the root, so the code could be changed to:
from numpy.random import SeedSequence,default_rng
ss = SeedSequence(12345)
# Spawn off 10 child SeedSequences to pass to child processes.
child_seeds = ss.spawn(10)
streams = [default_rng(s) for s in child_seeds]
run_func1( streams ) # child_seeds is consumed
while condition_not_met:
    child_seeds = ss.spawn(4)
    streams = [default_rng(s) for s in child_seeds]
    run_concurrent_futures_ProcessPoolExecutor_func(streams)
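In practice, however, you should also think about which function ought to decide how many streams are needed. One option is to hand each function a single SeedSequence and let it spawn as many children as it needs: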
from numpy.random import SeedSequence
ss = SeedSequence(12345)
run_func1(ss.spawn(1)[0]) # creates as many child seeds as it needs
while condition_not_met:
    # creates as many child seeds as it needs
    run_concurrent_futures_ProcessPoolExecutor_func(ss.spawn(1)[0])
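One way the receiving function might look is sketched below. This is only an assumption about its shape: worker, run_batch, n_workers and use_threads are hypothetical names that do not appear in the question, and the worker just draws one number so the example stays short.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from numpy.random import SeedSequence, default_rng


def worker(seed_seq):
    """Hypothetical task: draw one number from this worker's own stream."""
    rng = default_rng(seed_seq)
    return float(rng.random())


def run_batch(batch_seed, n_workers=4, use_threads=False):
    """Spawn one child seed per worker from batch_seed, then run the workers."""
    child_seeds = batch_seed.spawn(n_workers)
    # Threads are handy for a quick local check; processes match the question.
    pool_cls = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    with pool_cls(max_workers=n_workers) as pool:
        return list(pool.map(worker, child_seeds))
```

Inside the loop this would be called as run_batch(ss.spawn(1)[0]), so the caller never needs to know how many workers the batch uses. When using the process pool in a script, the calling code should sit under an if __name__ == "__main__": guard.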