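Work group size error when using OpenCL with OpenCV on the Google Coral Dev Board

Problem description

I am trying to use OpenCL acceleration through OpenCV on the Coral Dev Board. Calling cv2.normalize() on a UMat object produces the following error: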

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('minmaxloc',dims=1,globalsize=1024x1x1,localsize=1024x1x1) sync=true
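The error message carries the launch geometry, so it can be pulled apart and compared against the device limits that clinfo reports. The sketch below is only an illustration: `parse_ndrange_error` is a hypothetical helper, not an OpenCV function, and its regular expression assumes the exact message format quoted above. One possible reading, which the post does not confirm, is that the driver accepts fewer work-items per group for this particular kernel than the device-wide maximum of 1024.

```python
import re

# Assumed message layout, copied from the error quoted in this post.
_NDRANGE_RE = re.compile(
    r"(?P<name>CL_[A-Z_]+) \((?P<code>-?\d+)\).*?"
    r"clEnqueueNDRangeKernel\('(?P<kernel>[^']+)',"
    r"dims=(?P<dims>\d+),"
    r"globalsize=(?P<gsize>\d+(?:x\d+)*),"
    r"localsize=(?P<lsize>\d+(?:x\d+)*)\)"
)


def parse_ndrange_error(message):
    """Extract kernel name and launch sizes from an OpenCV OpenCL error string.

    Returns None when the message does not match the assumed layout.
    """
    match = _NDRANGE_RE.search(message)
    if match is None:
        return None

    dims = int(match.group("dims"))
    # Only the first `dims` entries are meaningful; the rest are padding.
    global_size = [int(v) for v in match.group("gsize").split("x")][:dims]
    local_size = [int(v) for v in match.group("lsize").split("x")][:dims]

    work_items = 1
    for extent in local_size:
        work_items *= extent

    return {
        "error": match.group("name"),
        "code": int(match.group("code")),
        "kernel": match.group("kernel"),
        "global_size": global_size,
        "local_size": local_size,
        "work_items_per_group": work_items,
    }
```

Feeding it the message above yields the kernel name `minmaxloc` and a requested group of 1024 work-items, which is the number to compare with whatever limit the driver enforces for that kernel.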

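In addition, any task involving UMat objects runs very slowly, and the CPU seems to be working harder than it should, so I am not sure whether any GPU acceleration is actually taking place.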
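One way to check whether the UMat path is doing anything useful is to time the same operation on a NumPy array and on a UMat. The following is a rough sketch under stated assumptions: `median_runtime` and `compare_umat_vs_ndarray` are hypothetical helper names, GaussianBlur is an arbitrary stand-in operation (it may fail on this device just as normalize does), and the image size is made up. It relies only on `cv2.ocl.haveOpenCL()`, `cv2.ocl.useOpenCL()`, `cv2.UMat` and `UMat.get()`.

```python
import time


def median_runtime(fn, repeats=5):
    """Call fn() `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def compare_umat_vs_ndarray(size=1024, repeats=5):
    """Time one operation on an ndarray and on a UMat.

    cv2 and numpy are imported lazily so that median_runtime stays usable
    on a machine without OpenCV installed.
    """
    import cv2
    import numpy as np

    image = np.random.randint(0, 256, (size, size), dtype=np.uint8)

    report = {
        # Whether OpenCV found an OpenCL runtime, and whether it is enabled.
        "have_opencl": cv2.ocl.haveOpenCL(),
        "use_opencl": cv2.ocl.useOpenCL(),
    }

    # Plain CPU path.
    report["ndarray_s"] = median_runtime(
        lambda: cv2.GaussianBlur(image, (5, 5), 0), repeats
    )

    # UMat path. Calling .get() copies the result back to host memory,
    # which forces any queued device work to finish before timing stops.
    umat = cv2.UMat(image)
    report["umat_s"] = median_runtime(
        lambda: cv2.GaussianBlur(umat, (5, 5), 0).get(), repeats
    )
    return report
```

If `use_opencl` is False, or the UMat timing is no better than the ndarray timing, that would be consistent with the work falling back to the CPU, though it does not prove it.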

I installed OpenCV 4.5.1 for Python 3.7 through pip (python3 -m pip install opencv-contrib-python). Running cv2.getBuildInformation() gives the following information about OpenCL:

OpenCL:               YES (no extra features)
Include path:         /tmp/pip-req-build-qmcu8eer/opencv/3rdparty/include/opencl/1.2
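The full build report is long, so it can help to extract only the OpenCL section. This is a minimal sketch: `opencl_build_section` is a hypothetical helper, and it assumes the section starts at the line beginning with "OpenCL:" and ends at the next blank line, which matches the excerpt above.

```python
def opencl_build_section(build_info):
    """Return the OpenCL section of an OpenCV build report as a dict.

    `build_info` is the text returned by cv2.getBuildInformation().
    Assumes the section begins at the "OpenCL:" line and runs until
    the next blank line.
    """
    section = {}
    capturing = False
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.startswith("OpenCL:"):
            capturing = True
        elif capturing and not stripped:
            break  # a blank line closes the section
        if capturing:
            # Split on the first colon only, so paths stay intact.
            key, _, value = stripped.partition(":")
            section[key.strip()] = value.strip()
    return section


# Usage on the board (requires OpenCV):
#   import cv2
#   print(opencl_build_section(cv2.getBuildInformation()))
```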

Running clinfo gives me this:

  Platform Name                                   Vivante OpenCL Platform
  Number of devices                                 1
  Device Name                                     Vivante OpenCL Device GC7000L.6214.0000
  Device vendor                                   Vivante Corporation
  Device vendor ID                                0x564956
  Device Version                                  OpenCL 1.2 
  Driver Version                                  OpenCL 1.2 V6.4.2.256507
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Max clock frequency                             800MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     (n/a)
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  === CL_PROGRAM_BUILD_LOG ===
  (6:0) : error : Syntax error at 'kernel'
  Preferred work group size multiple              <getWGsizes:1200: create kernel : error -45>
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                4 / 4       
    int                                                  4 / 4       
    long                                                 4 / 4       
    half                                                 0 / 0        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               0 / 0        (n/a)
Half-precision Floating-point support           <printDeviceInfo:68: get  CL_DEVICE_HALF_FP_CONFIG : error -30>
Single-precision Floating-point support         (core)
  Denormals                                     No
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 Yes
  Round to infinity                             No
  IEEE754-2008 fused multiply-add               No
  Support is emulated in software               No
  Correctly-rounded divide and sqrt operations  No
Double-precision Floating-point support         (n/a)
Address bits                                    32,Little-Endian
Global memory size                              268435456 (256MiB)
Error Correction support                        Yes
Max memory allocation                           134217728 (128MiB)
Unified memory for Host and Device              Yes
Minimum alignment for any data type             128 bytes
Alignment of base address                       2048 bits (256 bytes)
Global Memory cache type                        Read/Write
Global Memory cache size                        8192 (8KiB)
Global Memory cache line size                   64 bytes
Image support                                   Yes
  Max number of samplers per kernel             16
  Max size for 1D images from buffer            65536 pixels
  Max 1D or 2D image array size                 8192 images
  Max 2D image size                             8192x8192 pixels
  Max 3D image size                             8192x8192x8192 pixels
  Max number of read image args                 128
  Max number of write image args                8
Local memory type                               Global
Local memory size                               32768 (32KiB)
Max number of constant args                     9
Max constant buffer size                        65536 (64KiB)
Max size of kernel argument                     1024
Queue properties                                
  Out-of-order execution                        Yes
  Profiling                                     Yes
Prefer user sync for interop                    Yes
Profiling timer resolution                      1000ns
Execution capabilities                          
  Run OpenCL kernels                            Yes
  Run native kernels                            No
printf() buffer size                            1048576 (1024KiB)
Built-in kernels                                (n/a)
Device Extensions                               cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics 
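Note that clinfo itself ran into trouble on this driver: the build log shows a syntax error at 'kernel', and two properties came back as errors instead of values. The sketch below collects those failed queries and attaches a symbolic name to each numeric code. `find_clinfo_errors` is a hypothetical helper, the regular expression assumes the `<function:line: action : error N>` layout seen in the output above, and the code table is a small subset of the OpenCL status codes limited to values that appear in this post.

```python
import re

# Subset of OpenCL status codes, limited to those seen in this post.
CL_ERROR_NAMES = {
    -30: "CL_INVALID_VALUE",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
}

# Assumed layout of a failed clinfo query, e.g.
#   <getWGsizes:1200: create kernel : error -45>
_CLINFO_ERROR_RE = re.compile(
    r"<[^:<>]+:\d+:\s*(?P<action>.*?)\s*:\s*error\s+(?P<code>-\d+)>"
)


def find_clinfo_errors(clinfo_text):
    """Return (property, action, code, name) for each failed clinfo query."""
    failures = []
    for line in clinfo_text.splitlines():
        match = _CLINFO_ERROR_RE.search(line)
        if match is None:
            continue
        code = int(match.group("code"))
        failures.append((
            line[:match.start()].strip(),        # property label on that line
            match.group("action"),               # what clinfo was attempting
            code,
            CL_ERROR_NAMES.get(code, "unknown"),  # symbolic name if known
        ))
    return failures
```

Run against the output above, this reports the failed "Preferred work group size multiple" query (error -45, raised while creating a kernel) and the failed half-precision config query (error -30).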

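I did not build OpenCL from source or anything like that. Any OpenCL packages that did not ship with the board image were installed through apt while I was preparing to install OpenCV. I am out of my depth here, so any advice is appreciated!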
