Serving a TensorFlow model in parallel with Ray

Problem description

I am looking at this StackOverflow thread on serving a saved TF model in parallel with ray.serve: https://stackoverflow.com/a/62459372

I tried something like the following:

import ray
from ray import serve; serve.init()
import tensorflow as tf

class A:
    def __init__(self):
        self.model = tf.constant(1.0) # dummy example

    @serve.accept_batch
    def __call__(self, *, input_data=None):
        print(input_data) # test if method is entered
        # do stuff, serve model

if __name__ == '__main__':
    serve.create_backend("tf",A,# configure resources
        ray_actor_options={"num_cpus": 2},# configure replicas
        config={
            "num_replicas": 2,"max_batch_size": 24,"batch_wait_timeout": 0.1
        }
    )
    serve.create_endpoint("tf",backend="tf")
    handle = serve.get_handle("tf")

    args = [1, 2, 3]

    futures = [handle.remote(input_data=i) for i in args]
    result = ray.get(futures)

However, it fails with the following error: TypeError: __call__() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given. Something is wrong with the arguments being passed into __call__.
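For reference, the same TypeError can be reproduced without Ray at all, which suggests Serve is passing the request as a positional argument while my __call__ only accepts keyword-only ones (a minimal standalone sketch, not my serving code):

class B:
    def __call__(self, *, input_data=None):
        pass

B()(1, input_data=2)
# TypeError: __call__() takes 1 positional argument but 2 positional
# arguments (and 1 keyword-only argument) were given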

This seems like a simple mistake. How should I change the args array so that the __call__ method is actually entered?

Solution

The API was updated for Ray 1.0; see the migration guide: https://gist.github.com/simon-mo/6d23dfed729457313137aef6cfbc7b54. The changes relevant here are that serve.init() is replaced by serve.start(), which returns a client object whose methods replace the old module-level functions, and a batched __call__ now receives a list of request objects instead of keyword arguments.

For the specific code example you posted, it can be updated as follows:

import ray
from ray import serve
import tensorflow as tf

class A:
    def __init__(self):
        self.model = tf.constant(1.0) # dummy example

    @serve.accept_batch
    def __call__(self, requests):
        for req in requests:
            print(req.data) # test if method is entered

        # do stuff, serve model

if __name__ == '__main__':
    client = serve.start()
    client.create_backend("tf",A,# configure resources
        ray_actor_options={"num_cpus": 2},# configure replicas
        config={
            "num_replicas": 2,"max_batch_size": 24,"batch_wait_timeout": 0.1
        }
    )
    client.create_endpoint("tf",backend="tf")
    handle = client.get_handle("tf")

    args = [1, 2, 3]

    futures = [handle.remote(i) for i in args]
    result = ray.get(futures)
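One detail to keep in mind: with @serve.accept_batch, __call__ receives the whole batch at once, and each future from handle.remote(...) resolves to a per-request result only if the method returns a list with one entry per request. A minimal sketch, assuming req.data carries the value passed to handle.remote(i):

    @serve.accept_batch
    def __call__(self, requests):
        # Assumption: req.data is the positional argument from handle.remote(i).
        # Return one result per request so each future resolves individually.
        return [req.data for req in requests]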