Problem description
By way of preamble: I'm running a model-based reinforcement learning task using Monte Carlo tree search. Basically, I have an agent foraging in a discrete environment; the agent can see some of the space around it (for simplicity, I assume its observations are fully informative, so observation equals state). The agent has an internal transition model of the world represented by an MLP (I'm using tf.keras). For each step in the tree, I use the model to predict the next state given an action, and then have the agent compute the reward it would receive from the predicted state change. From there it's the familiar MCTS algorithm: selection, expansion, rollout, and backpropagation.
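For illustration, the per-step loop described above (predict the next state with the learned model, score it, then run the usual MCTS phases) might look roughly like this toy sketch. It is not the poster's code: a hand-written transition function on a 1-D grid stands in for the MLP, and every name (`Node`, `model_predict`, `mcts`, etc.) is invented for the sketch.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0

ACTIONS = (-1, +1)   # move left / right on a 1-D grid
GOAL = 3

def model_predict(state, action):
    # stand-in for the learned MLP transition model: here the
    # dynamics are simply known (step left or right)
    return state + action

def reward(state):
    return 1.0 if state == GOAL else 0.0

def select(node):
    # descend via UCB1 until we reach a node with untried actions
    while len(node.children) == len(ACTIONS):
        node = max(node.children.values(),
                   key=lambda c: c.value / c.visits
                   + math.sqrt(2 * math.log(node.visits) / c.visits))
    return node

def expand(node):
    untried = [a for a in ACTIONS if a not in node.children]
    action = random.choice(untried)
    child = Node(model_predict(node.state, action), parent=node)
    node.children[action] = child
    return child

def rollout(state, depth=5):
    # simulate random actions through the model, summing predicted reward
    total = 0.0
    for _ in range(depth):
        state = model_predict(state, random.choice(ACTIONS))
        total += reward(state)
    return total

def backprop(node, value):
    while node is not None:
        node.visits += 1
        node.value += value
        node = node.parent

def mcts(root_state, iterations=500):
    root = Node(root_state)
    root.visits = 1
    for _ in range(iterations):
        leaf = expand(select(root))
        backprop(leaf, rollout(leaf.state))
    # recommend the most-visited action at the root
    return max(root.children, key=lambda a: root.children[a].visits)
```

One `mcts(...)` call here corresponds to a single planning step; a "trial" in the question's sense would wrap many such calls in an environment loop, which is the unit being farmed out to worker processes.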
To save time, I'd like to run multiple trials simultaneously. I first tried vanilla multiprocessing, but it relies on pickle, which can't serialize a lot of things (including parts of my code). So I'm using pathos.multiprocessing instead, since it uses dill, which supposedly gets around this. However, when I run the code, instead of the "can't pickle" error that comes with vanilla multiprocessing, I get the following (sorry the traceback is so long; I'd cut parts out, but I'm not sure what is relevant, so maybe just scroll to the bottom):
Traceback (most recent call last):
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/multiprocess/pool.py", line 424, in _handle_tasks
    put(task)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/multiprocess/connection.py", line 209, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", line 446, in dump
    StockPickler.dump(self, obj)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 634, in save_reduce
    save(state)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", line 933, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", line 1119, in save_instancemethod0
    pickler.save_reduce(MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/pickle.py", line 610, in save_reduce
    save(args)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", line 1408, in save_function
    if not _locate_function(obj): #, pickler._session):
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", line 856, in _locate_function
    found = _import_module(obj.__module__ + '.' + obj.__name__, safe=True)
  File "/Users/~/anaconda3/envs/discrete_foraging/lib/python3.6/site-packages/dill/_dill.py", in _import_module
    return getattr(__import__(module, None, None, [obj]), obj)
ValueError: Empty module name
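As background on why vanilla multiprocessing fails before even getting this far: the stdlib pickle serializes functions by their importable qualified name, so anything without one (a lambda, a nested function, many objects that hold such functions as attributes) is rejected outright; dill works around this by serializing function bytecode instead. A minimal, self-contained illustration (names are made up for the demo):

```python
import pickle

# a lambda has no importable qualified name (__qualname__ is '<lambda>'),
# so the stdlib pickler cannot serialize it by reference
double = lambda x: x * 2

try:
    pickle.dumps(double)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    pickled_ok = False

print(pickled_ok)  # the dumps call fails, so this is False
```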
I suspect this is related to whatever keeps it from pickling under vanilla multiprocessing, but I'm not sure. Here is the relevant part of the code I'm trying to run:
def trial_runner(args):
    # the function I'm trying to parallelize
    ...

if __name__ == '__main__':
    # generate the environment
    env = MultiAgentEnv()
    # acquire some sample data
    # create a list of argument tuples, one per trial
    input_data = [
        (env, env.world, env.world.agent,
         env.world.agent.input_elev_memory, env.world.agent.input_food_memory,
         env.world.agent.input_energy_memory, env.world.agent.input_action_memory,
         env.world.agent.output_elev_memory, env.world.agent.output_food_memory,
         env.world.agent.output_history)
        for _ in range(ep.num_trials)
    ]
    # run the pool
    results = ProcessingPool().map(trial_runner, input_data)
trial_runner is the function I'm trying to parallelize; it runs the algorithm described in the preamble.
Any help would be greatly appreciated.
Solution
No working solution to this problem has been found yet.