ArrayFire runs only on my CPU, not on the integrated GPU

Problem description

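I can build all of the ArrayFire example programs (except the CUDA ones, since I have an AMD APU). However, only the programs that run on the CPU work correctly; the GPU-based ones have problems.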

Example:

benchmarks> ls
CMakeFiles  blas_cpu  cg_cpu  cmake_install.cmake  fft_cpu  fft_opencl  Makefile  pi_cpu

This is the CPU version:

benchmarks> ./fft_cpu 
ArrayFire v3.8.0 (cpu, 64-bit Linux, build d99887a)
[0] AMD: AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx
Benchmark N-by-N 2D fft
 128 x  128:     3 Gflops
 256 x  256:     4 Gflops
 512 x  512:     4 Gflops
1024 x 1024:     4 Gflops
2048 x 2048:     5 Gflops
4096 x 4096:     5 Gflops

This is a run of the GPU version (in verbose mode):

benchmarks> AF_PRINT_ERRORS=1 AF_JIT_KERNEL_TRACE=stdout AF_TRACE=all ./fft_opencl  
[platform][1626887645][022941] [ ../src/backend/common/DependencyModule.cpp:99 ] Attempting to load: libforge.so
[platform][1626887645][022941] [ ../src/backend/common/DependencyModule.cpp:102 ] Found: libforge.so
[platform][1626887645][022941] [ ../src/backend/opencl/device_manager.cpp:218 ] Found 1 OpenCL platforms
[platform][1626887645][022941] [ ../src/backend/opencl/device_manager.cpp:230 ] Found 1 devices on platform Clover
[platform][1626887645][022941] [ ../src/backend/opencl/device_manager.cpp:235 ] Found device AMD Radeon(TM) Vega 3 Graphics (RAVEN, DRM 3.40.0, 5.12.11-zen1-1-zen, LLVM 12.0.0) on platform Clover
[platform][1626887645][022941] [ ../src/backend/opencl/device_manager.cpp:240 ] Found 1 OpenCL devices
[platform][1626887646][022941] [ ../src/backend/opencl/device_manager.cpp:335 ] Default device: 0
ArrayFire v3.8.0 (OpenCL, build d99887a)
[0] Clover: AMD Radeon(TM) Vega 3 Graphics (RAVEN, LLVM 12.0.0), 3072 MB
Benchmark N-by-N 2D fft
128 x  128: [mem][1626887646][022941] [ ../src/backend/opencl/memory.cpp:200 ] nativeAlloc: 64 KB 0x56337cfc7e50
[jit][1626887646][022941] [ ../src/backend/opencl/compile_module.cpp:254 ] {9348653917523335434  : loaded from /home/pietrom/.arrayfire/KER9348653917523335434_CL_4098_AMD_RADEON(TM)_VEGA_3_GRAPHICS_(RAVEN,_DRM_3.40.0,_5.12.11-ZEN1-1-ZEN,_LLVM_12.0.0)_AF_38.bin for AMD Radeon(TM) Vega 3 Graphics (RAVEN, LLVM 12.0.0) }

With the GPU, execution hangs after the last message printed above.
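Because the hang has to be interrupted by hand, and some runs take the whole system down, it can help to wrap the benchmark in a timeout so that each attempt ends with a clear outcome. The following is a minimal sketch in Python; `run_with_timeout` is a hypothetical helper name and is not part of ArrayFire.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report whether it exited by itself or had to be killed.

    Returns ("exited", returncode) or ("timeout", None).
    Hypothetical helper for reproducing the hang; not an ArrayFire API.
    """
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait(timeout=timeout_s)
        return ("exited", returncode)
    except subprocess.TimeoutExpired:
        proc.kill()  # forcibly terminate the stuck process
        proc.wait()  # reap it so no zombie is left behind
        return ("timeout", None)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python watchdog.py ./fft_opencl
    print(run_with_timeout(sys.argv[1:], 60.0))
```

The 60 second limit is an arbitrary assumption. If the process is blocked inside the kernel driver in a state that cannot be interrupted, the kill may not take effect right away, so this is only a convenience for the runs that hang in user-interruptible waits.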

Here is another run of the GPU version:

benchmarks> AF_PRINT_ERRORS=1 AF_JIT_KERNEL_TRACE=stdout AF_TRACE=all ./fft_opencl
[platform][1627145464][006841] [ ../src/backend/common/DependencyModule.cpp:99 ] Attempting to load: libforge.so
[platform][1627145464][006841] [ ../src/backend/common/DependencyModule.cpp:102 ] Found: libforge.so
[platform][1627145464][006841] [ ../src/backend/opencl/device_manager.cpp:218 ] Found 1 OpenCL platforms
[platform][1627145464][006841] [ ../src/backend/opencl/device_manager.cpp:230 ] Found 1 devices on platform Clover
[platform][1627145464][006841] [ ../src/backend/opencl/device_manager.cpp:235 ] Found device AMD Radeon(TM) Vega 3 Graphics (RAVEN, LLVM 12.0.0) on platform Clover
[platform][1627145464][006841] [ ../src/backend/opencl/device_manager.cpp:240 ] Found 1 OpenCL devices
Invalid MIT-MAGIC-COOKIE-1 keyERROR: GLFW wasn't able to initalize
[platform][1627145464][006841] [ ../src/backend/opencl/device_manager.cpp:335 ] Default device: 0
ArrayFire v3.8.0 (OpenCL,3072 MB
Benchmark N-by-N 2D fft
 128 x  128: [mem][1627145464][006841] [ ../src/backend/opencl/memory.cpp:200 ] nativeAlloc: 64 KB 0x56039668df20
[jit][1627145464][006841] [ ../src/backend/opencl/compile_module.cpp:254 ] {9348653917523335434  : loaded from /home/pietrom/.arrayfire/KER9348653917523335434_CL_4098_AMD_RADEON(TM)_VEGA_3_GRAPHICS_(RAVEN, LLVM 12.0.0) }

In this last run, the error message "Invalid MIT-MAGIC-COOKIE-1 keyERROR: GLFW wasn't able to initalize" was also printed.
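"Invalid MIT-MAGIC-COOKIE-1 key" is the message Xlib prints when the X server rejects the client's authentication cookie, so the GLFW failure is probably an X11 session issue rather than an OpenCL one, and likely unrelated to the hang. A small sketch of the checks one might make first is below; `x11_auth_hints` is a hypothetical name, and it only inspects the environment (it cannot tell whether an existing cookie is stale).

```python
import os


def x11_auth_hints(env, path_exists=os.path.exists):
    """Return a list of human-readable hints about X11 authentication setup.

    Hypothetical diagnostic helper. It checks DISPLAY and the authority file
    that Xlib would use: $XAUTHORITY if set, otherwise $HOME/.Xauthority.
    """
    hints = []
    if not env.get("DISPLAY"):
        hints.append("DISPLAY is unset: there is no X server to connect to")
    xauth = env.get("XAUTHORITY") or os.path.join(env.get("HOME", ""), ".Xauthority")
    if not path_exists(xauth):
        hints.append("X authority file not found: " + xauth)
    return hints


if __name__ == "__main__":
    for hint in x11_auth_hints(os.environ) or ["no obvious problem found"]:
        print(hint)
```

An empty result does not prove the cookie is valid. It can still be wrong when, for example, the program is started from a different user or session than the one that owns the display.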

On some runs the whole system crashes, sometimes with graphical artifacts appearing before the screen goes black.

Is this a common problem? Am I doing something wrong? Is something missing from my system?


Here is the stack trace, taken by interrupting the hung program in gdb:

arrayfire_tests_benchmarks> gdb -q ./fft_opencl 
Reading symbols from ./fft_opencl...
(No debugging symbols found in ./fft_opencl)
(gdb) run
Starting program: /home/pietrom/myProgs/test/arrayfire_tests_benchmarks/fft_opencl 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffe0e43640 (LWP 5316)]
[New Thread 0x7fffdbfff640 (LWP 5317)]
[New Thread 0x7fffdb7fe640 (LWP 5318)]
[New Thread 0x7fffdaffd640 (LWP 5319)]
[New Thread 0x7fffda7fc640 (LWP 5320)]
[New Thread 0x7fffd9ffb640 (LWP 5321)]
[New Thread 0x7fffd97fa640 (LWP 5322)]
[New Thread 0x7fffd8ff9640 (LWP 5323)]
[New Thread 0x7fffc3fff640 (LWP 5324)]
[New Thread 0x7fffc37fe640 (LWP 5325)]
[New Thread 0x7fffc2ffd640 (LWP 5326)]
[New Thread 0x7fffc27fc640 (LWP 5327)]
[New Thread 0x7fffc1ffb640 (LWP 5328)]
[New Thread 0x7fffc17fa640 (LWP 5329)]
[New Thread 0x7fffc0ff9640 (LWP 5330)]
[New Thread 0x7fff9ffff640 (LWP 5331)]
[New Thread 0x7fff9f7fe640 (LWP 5332)]
[New Thread 0x7fff9effd640 (LWP 5333)]
[New Thread 0x7fff9e7fc640 (LWP 5334)]
[New Thread 0x7fff9dffb640 (LWP 5335)]
[New Thread 0x7fff9d7fa640 (LWP 5336)]
[New Thread 0x7fff9cff9640 (LWP 5337)]
[New Thread 0x7fff83fff640 (LWP 5338)]
[New Thread 0x7fff837fe640 (LWP 5339)]
[New Thread 0x7fff82ffd640 (LWP 5340)]
[New Thread 0x7fff827fc640 (LWP 5341)]
[New Thread 0x7fff81ffb640 (LWP 5342)]
[New Thread 0x7fff817fa640 (LWP 5343)]
[New Thread 0x7fff80ff9640 (LWP 5344)]
[New Thread 0x7fff5ffff640 (LWP 5345)]
[New Thread 0x7fff5f7fe640 (LWP 5346)]
[New Thread 0x7fff5effd640 (LWP 5347)]
[New Thread 0x7fff5e7fc640 (LWP 5348)]
[New Thread 0x7fff5dffb640 (LWP 5349)]
[New Thread 0x7fff5d7fa640 (LWP 5350)]
[Thread 0x7fff5f7fe640 (LWP 5346) exited]
[Thread 0x7fff80ff9640 (LWP 5344) exited]
[Thread 0x7fff817fa640 (LWP 5343) exited]
[Thread 0x7fff81ffb640 (LWP 5342) exited]
[Thread 0x7fff827fc640 (LWP 5341) exited]
[Thread 0x7fff82ffd640 (LWP 5340) exited]
[Thread 0x7fff5ffff640 (LWP 5345) exited]
[Thread 0x7fff837fe640 (LWP 5339) exited]
[Thread 0x7fff9d7fa640 (LWP 5336) exited]
[Thread 0x7fff9dffb640 (LWP 5335) exited]
[Thread 0x7fff9e7fc640 (LWP 5334) exited]
[Thread 0x7fff9cff9640 (LWP 5337) exited]
[Thread 0x7fff9effd640 (LWP 5333) exited]
[Thread 0x7fff9f7fe640 (LWP 5332) exited]
[Thread 0x7fff9ffff640 (LWP 5331) exited]
[Thread 0x7fff83fff640 (LWP 5338) exited]
[Thread 0x7fff5d7fa640 (LWP 5350) exited]
[Thread 0x7fff5dffb640 (LWP 5349) exited]
[Thread 0x7fff5e7fc640 (LWP 5348) exited]
[Thread 0x7fff5effd640 (LWP 5347) exited]
Invalid MIT-MAGIC-COOKIE-1 keyERROR: GLFW wasn't able to initalize
ArrayFire v3.8.0 (OpenCL,3072 MB
Benchmark N-by-N 2D fft
^C--Type <RET> for more, q to quit, c to continue without paging--

Thread 1 "fft_opencl" received signal SIGINT, Interrupt.
0x00007ffff519ae6b in ioctl () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007ffff519ae6b in ioctl () from /usr/lib/libc.so.6
#1  0x00007fffea1edb69 in drmIoctl () from /usr/lib/libdrm.so.2
#2  0x00007fffe1561348 in amdgpu_cs_query_fence_status () from /usr/lib/libdrm_amdgpu.so.1
#3  0x00007fffe181836e in ?? () from /usr/lib/gallium-pipe/pipe_radeonsi.so
#4  0x00007fffe17f0fc5 in ?? () from /usr/lib/gallium-pipe/pipe_radeonsi.so
#5  0x00007fffed6c10d9 in ?? () from /usr/lib/libmesaOpenCL.so.1
#6  0x00007fffed6a54e3 in ?? () from /usr/lib/libmesaOpenCL.so.1
#7  0x00007ffff68d506e in cl::CommandQueue::finish (this=<optimized out>) at include/CL/../CL/cl2.hpp:8117
#8  opencl::sync (device=0) at ../src/backend/opencl/platform.cpp:465
#9  0x00007ffff70aadaf in af_sync (device=device@entry=-1) at ../src/api/c/device.cpp:207
#10 0x00007ffff73c2552 in af::sync (device=device@entry=-1) at ../src/api/cpp/device.cpp:101
#11 0x00007ffff74007c8 in af::timeit (fn=0x555555555259 <fn()>) at ../src/api/cpp/timing.cpp:83
#12 0x00005555555553c3 in main ()
(gdb) 
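Frames #7 to #11 show af::timeit calling af::sync, which calls cl::CommandQueue::finish, which ends up waiting on a fence in the amdgpu driver. OpenCL command queues are asynchronous: enqueueing work returns before the work is done, and finish blocks until everything queued so far has completed. So the wait seen here is probably where the problem becomes visible, not necessarily where it originates. The toy model below illustrates that pattern in Python; `ToyQueue` and `timed` are hypothetical names and do not correspond to ArrayFire or OpenCL APIs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class ToyQueue:
    """Toy stand-in for an asynchronous device queue (illustration only)."""

    def __init__(self):
        self._worker = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def enqueue(self, work, *args):
        # Returns immediately; the work runs later on the worker thread.
        self._pending.append(self._worker.submit(work, *args))

    def finish(self):
        # Blocks until every enqueued item has completed.
        for future in self._pending:
            future.result()
        self._pending.clear()

    def close(self):
        self._worker.shutdown(wait=True)


def timed(queue, fn):
    """Call fn, then wait for the queue; return (enqueue_s, total_s)."""
    start = time.perf_counter()
    fn()
    enqueued = time.perf_counter()
    queue.finish()
    done = time.perf_counter()
    return enqueued - start, done - start


if __name__ == "__main__":
    q = ToyQueue()
    enqueue_s, total_s = timed(q, lambda: q.enqueue(time.sleep, 0.2))
    print(f"enqueue returned after {enqueue_s:.3f} s, finish after {total_s:.3f} s")
    q.close()
```

In this model, work that never completes would make finish block forever while enqueue still returns promptly, which matches the shape of the backtrace: the main thread sits in the synchronization call while the device never signals completion.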
