CuPy error when pushing/popping a pycuda context

Problem description

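I am using TensorRT to run inference on CUDA. I want to use CuPy to preprocess some images that I will feed to the TensorRT engine. The preprocessing function, called my_function, works fine as long as TensorRT is not run between different calls to my_function (see the code below). Strictly speaking, the issue is not tied to TensorRT itself, but to the fact that TensorRT inference has to be wrapped in push and pop operations on a pycuda context.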

With the code below, the last call to my_function raises the following error:

  File "/home/ubuntu/myfile.py", line 188, in _pre_process_cuda
    img = ndimage.zoom(img, scaling_factor)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/cupyx/scipy/ndimage/interpolation.py", line 482, in zoom
    kern(input, zoom, output)
  File "cupy/core/_kernel.pyx", line 822, in cupy.core._kernel.ElementwiseKernel.__call__
  File "cupy/cuda/function.pyx", line 196, in cupy.cuda.function.Function.linear_launch
  File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
  File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_INVALID_HANDLE: invalid resource handle

Note: the code below does not include the full TensorRT inference code. In fact, simply pushing and popping a pycuda context is enough to trigger the error.

Code:

import numpy as np
import cv2
import time
from PIL import Image
import requests
from io import BytesIO
from matplotlib import pyplot as plt
import cupy as cp
from cupyx.scipy import ndimage
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit


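def my_function(numpy_frame):
    dtype = 'float32'
    # copy the host image to the GPU as float32
    img = cp.array(numpy_frame, dtype=dtype)
    # print(img)
    img = ndimage.zoom(img, (0.5, 0.5, 3))
    # rescale pixel values from [0, 255] to [-1, 1]
    img = (cp.array(2, dtype=dtype) / cp.array(255, dtype=dtype)) * img - cp.array(1, dtype=dtype)
    # HWC -> CHW; transpose needs one axis index per dimension
    img = img.transpose((2, 0, 1))
    img = img.ravel()
    return img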


# load image
url = "https://www.pexels.com/photo/109919/download/?search_query=&tracking_id=411xe21veam"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
img = np.array(img)

# initialize tensorrt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt_runtime = trt.Runtime(TRT_LOGGER)
cfx = cuda.Device(0).make_context()


my_function(img)  # ok
my_function(img)  # ok

# ----- TENSORRT ---------
cfx.push()
# .... tensorrt inference....
cfx.pop()
# ----- TENSORRT ---------

my_function(img)  # <---- error

I also tried it another way, but unfortunately with the same result:

cfx.push()
my_function(img)  # ok
cfx.pop()

cfx.push()
my_function(img)  # error
cfx.pop()

@admin: if you can think of a better title for this question, feel free to edit it :)

Solution

Multiple contexts are open. For example, it seems that each of the following opens a context:

import pycuda.autoinit
cfx = cuda.Device(0).make_context()
cfx.push()

So if you run the three commands above, a single cfx.pop() is not enough. You need to call cfx.pop() three times to pop all of the contexts.
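To make the counting argument concrete, here is a minimal sketch that runs without a GPU. It is a toy model of the description above, not pycuda's actual implementation: FakeContext and pushed are hypothetical names introduced only for illustration, and the only assumption about a real context object is that it exposes push() and pop() the way cfx does in the question.

```python
from contextlib import contextmanager


class FakeContext:
    """Hypothetical stand-in for a pycuda context; it only records stack entries."""

    stack = []  # shared stand-in for the per-thread context stack

    def __init__(self, name):
        self.name = name

    def push(self):
        FakeContext.stack.append(self.name)

    def pop(self):
        FakeContext.stack.pop()


@contextmanager
def pushed(ctx):
    """Push ctx on entry and always pop it on exit, even if the body raises."""
    ctx.push()
    try:
        yield ctx
    finally:
        ctx.pop()


# Mirror the three steps listed in the answer.
auto = FakeContext("autoinit")
auto.push()   # stands in for `import pycuda.autoinit`
cfx = FakeContext("cfx")
cfx.push()    # stands in for `cuda.Device(0).make_context()`
cfx.push()    # the explicit `cfx.push()`

cfx.pop()     # a single pop leaves two entries behind
depth_after_one_pop = len(FakeContext.stack)

cfx.pop()
auto.pop()    # three pops in total empty the stack
depth_after_three_pops = len(FakeContext.stack)

# With the helper, every push is paired with exactly one pop.
with pushed(cfx):
    depth_inside = len(FakeContext.stack)
depth_after_with = len(FakeContext.stack)
```

If the same pattern were applied to real code, the inference step would sit inside a `with pushed(cfx):` block, so that an exception during inference cannot leave an extra entry on the stack.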

cupy pycuda