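How can I save frames from a video into memory using ffmpeg GPU encoding?

Problem description

I am trying to extract frames from a video and keep them in memory (RAM). With CPU encoding I have no problems: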

ffmpeg -i input -s 224x224 -pix_fmt bgr24 -vcodec rawvideo -an -sn -f image2pipe -

But when I try to use NVIDIA GPU encoding, I always get noisy images. I have tried different commands, but the result is always the same, on both Windows and Ubuntu.

ffmpeg -hwaccel cuda -i 12.mp4 -s 224x224 -f image2pipe - -vcodec rawvideo

Saving JPGs to disk works without any problems.

ffmpeg -hwaccel cuvid -c:v h264_cuvid -resize 224x224 -i {input_video} \
     -vf thumbnail_cuda=2,hwdownload,format=nv12 {output_dir}/%d.jpg

Here is the Python code I use to test these commands:

import cv2
import subprocess as sp
import numpy

IMG_W = 224
IMG_H = 224
input = '12.mp4'

ffmpeg_cmd = [ 'ffmpeg','-i',input,'-s','224x224','-pix_fmt','bgr24','-vcodec','rawvideo','-an','-sn','-f','image2pipe','-']


#ffmpeg_cmd = ['ffmpeg','-hwaccel','cuda','-i','12.mp4','-s','224x224','-f','image2pipe','-','-vcodec','rawvideo']

pipe = sp.Popen(ffmpeg_cmd,stdout = sp.PIPE,bufsize=10)
images = []
encode_param = [int(cv2.IMWRITE_JPEG_QUALITY),95]
cnt = 0
while True:
    cnt += 1
    raw_image = pipe.stdout.read(IMG_W*IMG_H*3)
    image = numpy.frombuffer(raw_image,dtype='uint8')     # convert read bytes to np (fromstring is deprecated)
    if image.shape[0] == 0:
        del images
        break   
    else:
        image = image.reshape((IMG_H,IMG_W,3))
        

    cv2.imshow('test',image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

pipe.stdout.flush()
cv2.destroyAllWindows()

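Solution

To accelerate H.264 decoding, it is better to select -c:v h264_cuvid, which uses the dedicated video hardware in the GPU.

Testing with the GPU-Z monitoring software, it looks like -hwaccel cuda also uses the dedicated accelerator (the same one as -c:v h264_cuvid), but I am not sure.

Note:

The NVIDIA CUVID video decoding accelerator does not support all sizes and pixel formats.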
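
Because the accelerator may reject some inputs, one option is to fall back to CPU decoding when the GPU command produces no frame. The following is only a sketch under stated assumptions: `start_first_working`, `FakeProc` and `fake_launch` are hypothetical names introduced for illustration, and `FakeProc` stands in for `subprocess.Popen` so the example runs without FFmpeg or a GPU.

```python
import io


class FakeProc:
    """Hypothetical stand-in for subprocess.Popen (no FFmpeg needed)."""

    def __init__(self, data):
        self.stdout = io.BytesIO(data)

    def wait(self):
        return 0


def start_first_working(commands, launch, frame_size):
    """Try each command in order; keep the first that yields a full frame."""
    for cmd in commands:
        proc = launch(cmd)
        first = proc.stdout.read(frame_size)
        if len(first) == frame_size:
            return cmd, proc, first
        # No complete frame: clean up and try the next command.
        proc.stdout.close()
        proc.wait()
    raise RuntimeError('no command produced a complete frame')


FRAME_SIZE = 2 * 2 * 3  # tiny 2x2 bgr24 frame for the demo

gpu_cmd = ['ffmpeg', '-hwaccel', 'cuda', '-c:v', 'h264_cuvid', '-i', 'test.mp4',
           '-pix_fmt', 'bgr24', '-f', 'rawvideo', '-']
cpu_cmd = ['ffmpeg', '-i', 'test.mp4', '-pix_fmt', 'bgr24', '-f', 'rawvideo', '-']


def fake_launch(cmd):
    # Assumption for the demo: the GPU path fails and writes nothing.
    return FakeProc(b'' if '-hwaccel' in cmd else bytes(FRAME_SIZE))


used_cmd, proc, first_frame = start_first_working(
    [gpu_cmd, cpu_cmd], fake_launch, FRAME_SIZE)
print('fell back to CPU:', '-hwaccel' not in used_cmd)
```

With a real FFmpeg installation, `launch` could be `lambda cmd: subprocess.Popen(cmd, stdout=subprocess.PIPE)`; the first frame that was already read must then be processed before continuing with the rest of the stream.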

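Issues:

bufsize=10 is too small; it is better to leave the bufsize argument unset than to set bufsize=10.
Use '-f','rawvideo' instead of '-f','image2pipe' (we are reading raw video frames from the pipe, not encoded images such as JPEG or PNG).
We can remove '-vcodec','rawvideo' when using '-f','rawvideo'.
We don't need the '-s','224x224' argument, because the output size is known from the input video (this holds when the input is already 224x224, as with the test video used here).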
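
To illustrate why raw video matters here: with '-f','rawvideo' and bgr24, each frame is exactly width*height*3 bytes with no header, so the reader can slice the stream at fixed offsets. A minimal sketch, assuming a hypothetical helper `iter_raw_frames` and using `io.BytesIO` as a stand-in for `pipe.stdout`:

```python
import io

import numpy as np


def iter_raw_frames(stream, width, height, channels=3):
    """Yield (height, width, channels) uint8 frames from packed raw video."""
    frame_size = width * height * channels
    while True:
        buf = stream.read(frame_size)
        if len(buf) < frame_size:
            # End of stream, or a truncated final frame: stop reading.
            break
        yield np.frombuffer(buf, dtype=np.uint8).reshape((height, width, channels))


# Two 2x2 bgr24 frames (12 bytes each) followed by 2 stray bytes.
fake_stream = io.BytesIO(bytes(range(24)) + b'\x00\x01')
frames = list(iter_raw_frames(fake_stream, width=2, height=2))
print(len(frames), frames[0].shape)
```

If the stream held JPEG data instead (as image2pipe can produce), the same slicing would reinterpret compressed bytes as pixels, which is consistent with the noisy images described in the question.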

Updated FFmpeg command:

ffmpeg_cmd = ['ffmpeg','-hwaccel','cuda','-c:v','h264_cuvid','-i',input,'-pix_fmt','bgr24','-f','rawvideo','-']
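
As a sketch of how such a command might be assembled, the hypothetical helper `build_decode_cmd` below (not part of FFmpeg or of the original code) keeps the decoder options ahead of '-i', since FFmpeg applies options to the next file named on the command line. It assumes H.264 input when `use_gpu` is true.

```python
def build_decode_cmd(input_path, use_gpu=True, pix_fmt='bgr24'):
    """Build an FFmpeg argument list that writes raw frames to stdout."""
    cmd = ['ffmpeg']
    if use_gpu:
        # Input (decoder) options must come before '-i'.
        cmd += ['-hwaccel', 'cuda', '-c:v', 'h264_cuvid']
    cmd += ['-i', input_path]
    # Output options must come before the output target '-'.
    cmd += ['-pix_fmt', pix_fmt, '-f', 'rawvideo', '-']
    return cmd


gpu_cmd = build_decode_cmd('test.mp4')
cpu_cmd = build_decode_cmd('test.mp4', use_gpu=False)
print(' '.join(gpu_cmd))
```

Options placed after the final '-' are trailing options and are not applied to that output.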

To make the code sample reproducible, I start by creating a synthetic video file 'test.mp4' that is used as the input:

# Build synthetic video file for testing.
################################################################################
sp.run(['ffmpeg','-y','-f','lavfi','-i',f'testsrc=size={IMG_W}x{IMG_H}:rate=1','-f','lavfi','-i','sine=frequency=300','-c:v','libx264','-pix_fmt','nv12','-c:a','aac','-ar','22050','-t','50',input])
################################################################################

Here is a complete (executable) code sample:

import cv2
import subprocess as sp
import numpy


IMG_W = 224
IMG_H = 224
input = 'test.mp4'

# Build synthetic video file for testing.
################################################################################
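sp.run(['ffmpeg','-y','-f','lavfi','-i',f'testsrc=size={IMG_W}x{IMG_H}:rate=1','-f','lavfi','-i','sine=frequency=300','-c:v','libx264','-pix_fmt','nv12','-c:a','aac','-ar','22050','-t','50',input])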
################################################################################

# There is no harm in using both '-hwaccel cuda' and '-c:v h264_cuvid'.
ffmpeg_cmd = ['ffmpeg','-hwaccel','cuda','-c:v','h264_cuvid','-i',input,'-pix_fmt','bgr24','-f','rawvideo','-']
   
pipe = sp.Popen(ffmpeg_cmd,stdout=sp.PIPE)

cnt = 0
while True:
    cnt += 1
    raw_image = pipe.stdout.read(IMG_W*IMG_H*3)
    image = numpy.frombuffer(raw_image,dtype='uint8')     # convert read bytes to np (fromstring is deprecated)
    if image.shape[0] == 0:
        break
    else:
        image = image.reshape((IMG_H,IMG_W,3))
        
    cv2.imshow('test',image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

pipe.stdout.close()
pipe.wait()
cv2.destroyAllWindows()
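
The sample above only displays the frames. To actually keep them in RAM, as the question asks, one possibility is to collect them and stack them into a single array. This is a minimal sketch, assuming `io.BytesIO` as a stand-in for `pipe.stdout` and a hypothetical frame count `N_FRAMES` for the demo:

```python
import io

import numpy as np

IMG_W = 224
IMG_H = 224
N_FRAMES = 5  # hypothetical number of frames for the demo
FRAME_SIZE = IMG_W * IMG_H * 3  # bytes per bgr24 frame

# Stand-in for pipe.stdout: N_FRAMES black frames of raw bgr24 data.
stream = io.BytesIO(bytes(FRAME_SIZE * N_FRAMES))

frames = []
while True:
    raw = stream.read(FRAME_SIZE)
    if len(raw) < FRAME_SIZE:
        break
    frames.append(np.frombuffer(raw, dtype=np.uint8).reshape((IMG_H, IMG_W, 3)))

# One contiguous (frames, height, width, channels) array held in memory.
video = np.stack(frames)
print(video.shape, video.nbytes)
```

Each 224x224 bgr24 frame takes 150528 bytes, so memory use grows linearly with the number of frames kept.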
