Using multiple workers in a background task - FastAPI

Problem description

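I am trying to process a file uploaded by the user. I want the user to get a response as soon as the upload completes, so the connection can be closed while the file is still being processed. To do that I am using BackgroundTasks.add_task, and my code looks something like this: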

class Line(BaseModel):
    line: str

@app.post("/foo")
async def foo(line: Line):
    """Process a line and generate results."""

    ...

    result = ...  # processing line.line
    print(result)
    return result

@app.post("/upload")
async def upload(background_tasks: BackgroundTasks, csv: UploadFile = File(...)):

    # Schedule processing to run after the response has been sent
    background_tasks.add_task(process, csv)
    return {"result": "CSV has been uploaded successfully"}


async def process(csv):
    """Process the CSV and generate data."""

    tasks = [foo(line) for line in csv]
    result = await asyncio.gather(*tasks)

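Unfortunately, the code above only executes the lines one by one. Moreover, the print statement in foo produces no output until every result has been processed: if the CSV has n lines, I only see the print output for all of them after all n lines are done. My program runs on 20 workers, but while this process is running it uses only about 1% of the CPU (foo is not a compute task; it is more of an IO/network-bound task). This makes me think the background process is running on only 1 worker. I did try ProcessPoolExecutor as follows: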

loop = asyncio.get_event_loop()
lines = [line_0,line_1,...,line_n] # Extracted all lines from CSV
with ProcessPoolExecutor() as executor:
    results = [loop.run_in_executor(executor,lambda: foo(line)) for line in lines]
    results = loop.run_until_complete(*results)

However, I get the following error:

processpoolexecutor can't pickle local object
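
For context, this error can be reproduced in isolation. A process pool has to pickle the callable it sends to a worker process, and a lambda created inside a function cannot be pickled. The sketch below is a minimal illustration; the names `module_level`, `make_local` and `can_pickle` are made up for this example and do not come from the question.

```python
import pickle


def module_level(line: str) -> str:
    # Defined at module top level, so pickle can reference it by name.
    return line.upper()


def make_local():
    # A lambda created inside a function is a "local object": pickle has no
    # importable name for it, so it cannot be sent to another process.
    return lambda line: line.upper()


def can_pickle(obj) -> bool:
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exact exception type differs between Python versions.
        return False


print(can_pickle(module_level))  # True when run as a script
print(can_pickle(make_local()))  # False
```

Passing a module-level function and its arguments separately, instead of wrapping the call in a lambda, avoids this particular failure.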

I did get past that error by changing my approach from:

results = [loop.run_in_executor(executor,lambda: foo(line)) for line in lines]

to:

results = [asyncio.ensure_future(foo(line=Line(line=line))) for line in lines]

However, I then get this error:

File "uvloop/loop.pyx", line 2658, in uvloop.loop.Loop.run_in_executor
AttributeError: 'Loop' object has no attribute 'submit'
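
For reference, `loop.run_in_executor` expects the executor as its first argument, followed by a plain callable and that callable's positional arguments, and it returns futures that can be awaited together. The following is a minimal sketch under one assumption: that the per-line work is a blocking function. The `blocking_io` helper is a hypothetical stand-in, not something shown in the question. A thread pool is used because it suits IO-bound work and needs no pickling.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_io(line: str) -> str:
    # Hypothetical stand-in for a blocking network or disk call.
    time.sleep(0.2)
    return f"Line '{line}' processed"


async def process_lines(lines):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=8) as executor:
        # Signature: run_in_executor(executor, func, *args)
        futures = [
            loop.run_in_executor(executor, blocking_io, line) for line in lines
        ]
        # gather returns the results in the same order as the inputs.
        return await asyncio.gather(*futures)


start = time.perf_counter()
results = asyncio.run(process_lines(["a", "b", "c", "d"]))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")  # well under 4 x 0.2s if the calls overlapped
```

If the per-line work is already a coroutine that awaits non-blocking operations, no executor is needed at all.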

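Summary: to process one line, I can hit the "/foo" endpoint. Now I want to process a CSV of 200 lines. So I first accept the file from the user, return a success message, and terminate that connection. I then add the CSV to a background task, which should map each line to the "/foo" endpoint and give me the result for each line. However, every approach I have tried so far seems to use only one thread and processes the lines one at a time. I would like an approach that processes multiple lines together, almost as if I were hitting the "/foo" endpoint many times simultaneously, as one can with a tool like Apache JMeter.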

Solution

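You can process the lines concurrently without going through the endpoint. Below is a simplified example based on your code (it does not use the foo endpoint):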

import asyncio
import sys
import uvicorn
from fastapi import FastAPI,BackgroundTasks,UploadFile,File
from loguru import logger


logger.remove()
logger.add(sys.stdout,colorize=True,format="<green>{time:HH:mm:ss}</green> | {level} | <level>{message}</level>")

app = FastAPI()


async def async_io_bound(line: str):
    await asyncio.sleep(3)  # Pretend this is IO operations
    return f"Line '{line}' processed"


async def process(csv):
    """ Processing CSV and generate data"""
    tasks = [async_io_bound(line) for line in csv]
    logger.info("start processing")
    result = await asyncio.gather(*tasks)
    for i in result:
        logger.info(i)


@app.post("/upload-to-process")
async def upload(background_tasks: BackgroundTasks,csv: UploadFile = File(...)):
    background_tasks.add_task(process,csv.file)
    return {"result": "CSV has been uploaded successfully"}

if __name__ == "__main__":
    # "app3:app" assumes this file is saved as app3.py
    uvicorn.run("app3:app", host="localhost", port=8001)

Sample output (all lines are processed concurrently):

INFO:     ::1:52358 - "POST /upload-to-process HTTP/1.1" 200 OK
13:21:31 | INFO | start processing
13:21:34 | INFO | Line 'b'one,two\n'' processed
13:21:34 | INFO | Line 'b'0,1\n'' processed
13:21:34 | INFO | Line 'b'1,1\n'' processed
13:21:34 | INFO | Line 'b'2,1\n'' processed
13:21:34 | INFO | Line 'b'3,1\n'' processed
13:21:34 | INFO | Line 'b'4,1\n'' processed
13:21:34 | INFO | Line 'b'5,1\n'' processed
13:21:34 | INFO | Line 'b'6,1\n'' processed
13:21:34 | INFO | Line 'b'7,1\n'' processed
13:21:34 | INFO | Line 'b'8,1\n'' processed
13:21:34 | INFO | Line 'b'9,1\n'' processed
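
The timestamps show every line finishing about 3 seconds after the start, rather than 3 seconds per line. The same effect can be checked without FastAPI. The sketch below compares awaiting the coroutines one by one with `asyncio.gather`; the helper names `run_sequentially`, `run_concurrently` and `timed`, and the shortened 0.2 second delay, are assumptions made for this illustration.

```python
import asyncio
import time


async def async_io_bound(line: str, delay: float = 0.2) -> str:
    await asyncio.sleep(delay)  # Pretend this is an IO operation
    return f"Line '{line}' processed"


async def run_sequentially(lines):
    # Each await completes before the next coroutine is started.
    return [await async_io_bound(line) for line in lines]


async def run_concurrently(lines):
    # gather schedules all coroutines on the event loop together.
    return await asyncio.gather(*(async_io_bound(line) for line in lines))


async def timed(coro):
    """Await coro and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


async def main():
    lines = ["0,1", "1,1", "2,1", "3,1", "4,1"]
    seq, seq_time = await timed(run_sequentially(lines))
    con, con_time = await timed(run_concurrently(lines))
    return seq, seq_time, con, con_time


seq, seq_time, con, con_time = asyncio.run(main())
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

This only helps when the per-line work awaits non-blocking operations. A blocking call inside an `async def` function stalls the event loop, and the lines are then processed one after another again.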
