PicklingError when getting results from Ray

Problem description

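I am slowly converting my very serial text analysis engine to use Modin and Ray. It feels like I am almost there, but I seem to have hit a stumbling block. My code looks like this: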

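import ray
from sklearn.feature_extraction.text import TfidfVectorizer

# ngrams, r_strings, files and match_name are defined elsewhere in the program.
vectorizer = TfidfVectorizer(
    analyzer=ngrams, encoding="ascii", stop_words="english", strip_accents="ascii"
)
tf_idf_matrix = vectorizer.fit_transform(r_strings["name"])

# Store the large shared objects once and pass references to the tasks.
r_vectorizer = ray.put(vectorizer)
r_tf_idf_matrix = ray.put(tf_idf_matrix)
n = 2
match_results = []
for fn in files["c.file"]:
    match_results.append(
        match_name.remote(fn, r_vectorizer, r_tf_idf_matrix, r_strings, n)
    )
match_returns = ray.get(match_results)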

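I followed the guidance in the "anti-patterns" section in the Ray documentation about what to avoid, and my code closely resembles the "better" pattern shown there. Running it fails with: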

Traceback (most recent call last):
  File "alt.py", line 213, in <module>
    match_returns = ray.get(match_results)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/worker.py", line 1501, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(PicklingError): ray::match_name() (pid=23393, ip=192.168.1.173)
  File "python/ray/_raylet.pyx", line 564, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 565, line 1652, in ray._raylet.CoreWorker.store_task_outputs
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/serialization.py", line 327, in serialize
    return self._serialize_to_msgpack(value)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/serialization.py", line 307, in _serialize_to_msgpack
    self._serialize_to_pickle5(Metadata, python_objects)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/serialization.py", line 267, in _serialize_to_pickle5
    raise e
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/serialization.py", line 264, in _serialize_to_pickle5
    value, protocol=5, buffer_callback=writer.buffer_callback)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "/home/myuser/.local/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle_fast.py", line 580, in dump
    return Pickler.dump(self, obj)
_pickle.PicklingError: args[0] from __newobj__ args has the wrong class

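This is definitely not the result I expected. I am not sure where to go next, and I would appreciate help from anyone with more experience with Ray and Modin.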
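
For context on the last line of the traceback: the frames through store_task_outputs suggest the failure happens while Ray serializes the value returned by match_name, and pickle raises this kind of error when an object's reducer uses copyreg.__newobj__ with a class that differs from the object's own class. The sketch below reproduces that category of failure with the standard library only. The Base and Mismatched classes and the check_picklable helper are hypothetical names made up for illustration; they are not part of Ray, Modin, or the code above, and this does not identify which object in the original program triggers the error.

```python
import copyreg
import pickle


class Base:
    """Stand-in for the class that a reducer claims the object has."""


class Mismatched(Base):
    """Hypothetical class whose reducer names a different class than its own."""

    def __reduce_ex__(self, protocol):
        # copyreg.__newobj__(cls, *args) rebuilds an object via cls.__new__.
        # Naming Base here, while the instance is a Mismatched, is the
        # inconsistency that pickle rejects.
        return (copyreg.__newobj__, (Base,))


def check_picklable(value):
    """Return (True, None) if value pickles locally, else (False, error text)."""
    try:
        pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    except Exception as exc:  # PicklingError, TypeError, AttributeError, ...
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    print(check_picklable([1, 2, 3]))
    print(check_picklable(Mismatched()))
```

One possible way to use this, offered as a suggestion rather than a confirmed fix, is to call a check like this inside match_name on the value just before it is returned, to narrow down which part of the result fails. Ray serializes with its bundled cloudpickle, as the traceback shows, so a result from the standard pickle module may not match Ray's behavior in every case.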
