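Converting fragmented MP4 to MP4

Problem description

I'm trying to scrape video frames from trafficview.org and can't figure out how to decode the data.

Following a tutorial for websocket_client, I wrote a few lines of code that connect to the live-streaming websocket and receive the messages directly.

I've monitored the messages arriving through the Network tab in Chrome and also dug through the output of the code below, and I'm fairly certain the data is streaming in as fragmented MP4. Here are the first 100 or so bytes/messages: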

b'\ xfa \ x00 \ x02 \ x86 \ xf1B \ xc0 \ x1e \ x00 \ x00 \ x00 \ x00 \ x18ftypiso5 \ x00 \ x00 \ x02 \ x00iso6mp41 \ x00 \ x00 \ x02jmoov \ x00 \ x00 \ x00 \ x00lmvhd \ x00 \ x00 \ x00 \ x00 \ xdb \ x7f \ xeb \ xb2 \ xdb \ x7f \ xeb \ xb2 \ x00 \ x00 \ x03 \ xe8 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x01 \ x00 \ x00 \ x00 \ x01 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x01 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00'

The output contains many moof and mdat pairs. Say I let this code run for 30 seconds: how do I turn the resulting raw byte string into an MP4 file?
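One way to check what a byte string like this contains is to walk its top-level boxes. In the ISO base media file format, each box starts with a 4-byte big-endian size followed by a 4-byte type such as ftyp, moov, moof or mdat. The following is only a minimal sketch: the helper name `iter_boxes` and the sample bytes are made up for illustration and are not taken from the stream.

```python
import struct


def iter_boxes(data, offset=0):
    """Yield (offset, size, type) for each top-level box found in data.

    Hypothetical helper for illustration; it only reads box headers.
    """
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < 8:
            # Not a plausible box header; stop instead of looping forever.
            break
        yield offset, size, box_type.decode("latin-1")
        offset += size


# Synthetic sample (not stream data): an empty 'free' box, then an 'mdat'
# box with a 4-byte payload.
sample = struct.pack(">I4s", 8, b"free") + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4
boxes = list(iter_boxes(sample))
print(boxes)
```

If the walk stops early or reports unexpected types, the buffer probably contains bytes that are not part of the MP4 structure.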

import json

from websocket import create_connection

url = 'wss://cctv.trafficview.org:8420/DDOT_CAPTOP_13.vod?progressive'

headers = json.dumps({
    'Accept-Encoding': 'gzip,deflate,br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'no-cache',
    'Connection': 'Upgrade',
    'Host': 'cctv.trafficview.org:8420',
    'Origin': 'https://trafficview.org',
    'Pragma': 'no-cache',
    'Sec-WebSocket-Extensions': 'permessage-deflate; client_max_window_bits',
    'Sec-WebSocket-Key': 'FzWbrsoHFsJWzvWGJ04ffw==',
    'Sec-WebSocket-Version': '13',
    'Upgrade': 'websocket',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/85.0.4183.83 Safari/537.36',
})

ws = create_connection(url, headers=headers)

# Then send a message through the tunnel
ws.send('ping')

# Here you will view the message return from the tunnel
flag = 3000
output = b''
while flag > 0:
    output += ws.recv()
    flag -= 1

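UPDATE: I've adapted some code from Stack Overflow to pipe the fmp4 data into ffmpeg and convert it to frames. To get there, I noticed that the first 16 bytes of the websocket output are inconsistent with other MP4 files I've inspected, so I trim those bytes first. I also don't know how one of these files is supposed to end, so I cut off the last part of the data.

The code below reads the MP4 header fine (ffmpeg output also below), but it fails to decode any of the bytes.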

import re
import shlex
import subprocess as sp

import numpy as np

output = output[8:]

moof_locs = [m.start() for m in re.finditer(b'moof', output)]

output = output[:moof_locs[-1] - 1]

width, height = 640, 480

# FFmpeg input PIPE: fragmented MP4 data as a stream of bytes.
# FFmpeg output PIPE: decoded video frames in BGR format.
process = sp.Popen(shlex.split('/usr/bin/ffmpeg -i pipe: -f hls -hls_segment_type fmp4 -c h264 -an -sn pipe:'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)
process.stdin.write(output)
process.stdin.close()
in_bytes = process.stdout.read(width * height * 3)
in_frame = np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3])
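For comparison, the comments in the code above describe raw BGR frames on the output pipe, whereas the command itself asks for an HLS/H.264 output. A sketch of a command that would match those comments is shown below. It assumes ffmpeg is on the PATH and that the input bytes are already a well-formed fragmented MP4, which is not the case for the data as trimmed above. The function names are hypothetical.

```python
import subprocess


def build_decode_command(ffmpeg="ffmpeg"):
    """Return an ffmpeg argv that reads MP4 from stdin and writes raw BGR24 frames to stdout."""
    return [ffmpeg, "-i", "pipe:0",
            "-f", "rawvideo", "-pix_fmt", "bgr24", "-an", "-sn", "pipe:1"]


def split_frames(raw, width, height):
    """Split raw BGR24 bytes into one bytes object per complete frame."""
    frame_size = width * height * 3  # 3 bytes per pixel
    usable = len(raw) - len(raw) % frame_size  # ignore a trailing partial frame
    return [raw[i:i + frame_size] for i in range(0, usable, frame_size)]


def decode_frames(fmp4_bytes, width, height, ffmpeg="ffmpeg"):
    """Decode fmp4_bytes with ffmpeg (assumed installed) and return a list of frames."""
    proc = subprocess.run(build_decode_command(ffmpeg), input=fmp4_bytes,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          check=True)
    return split_frames(proc.stdout, width, height)
```

Each returned frame can then be passed to np.frombuffer(frame, np.uint8).reshape([height, width, 3]) as in the code above. Using subprocess.run with input= avoids writing all of stdin before reading stdout, which can block when the buffers fill up.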

Output from ffmpeg:

[mov,mp4,m4a,3gp,3g2,mj2 @ 0x994600] Could not find codec parameters for stream 0 (Video: h264 (avc1 / 0x31637661),none,640x480): unspecified pixel format
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0,mov,mj2,from 'pipe:':
  Metadata:
    major_brand     : iso5
    minor_version   : 512
    compatible_brands: iso6mp41
    creation_time   : 2020-09-11T13:40:21.000000Z
  Duration: N/A,bitrate: N/A
    Stream #0:0(und): Video: h264 (avc1 / 0x31637661),640x480,1k tbr,1k tbn,2k tbc (default)
    Metadata:
      creation_time   : 2020-09-11T13:40:21.000000Z
      encoder         : EvoStream Media Server
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Finishing stream 0:0 without any data written to it.
Nothing was written into output file 0 (pipe:),because at least one of its streams received no packets.
frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A    
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty,nothing was encoded (check -ss / -t / -frames parameters if used)

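UPDATE 2:

After inspecting the stream coming from the websocket, I realized that every message starts with a specific integer that is defined in trafficview's JavaScript code. These codes always arrive in the same order, as follows: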

Header MOOV (250)
    PBT Begin (249)
        Video Buffer (252)
        Header MOOF (251)
        Header MOOF (251)
        Header MOOF (251)
        Header MDAT (254)
    PBT End (255)

    PBT Begin (249)
    Continues Forever

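Some of these messages are always identical: 249 messages are always f900 0000 and 255 messages are always ff00 0000.

I'm guessing that the 249 and 255 messages don't normally appear in a fragmented MP4 or HLS stream, so I think I need to use this tag information to build the correct file format from scratch.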
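To see this pattern in a captured session, each message can be labelled by its first byte. The sketch below uses the tag values and names from the listing above. The helper names and the demo messages are made up for illustration, and it assumes every message is received as a bytes object.

```python
from collections import Counter

# Tag values and names copied from the listing above.
TAG_NAMES = {
    250: "Header MOOV",
    249: "PBT Begin",
    252: "Video Buffer",
    251: "Header MOOF",
    254: "Header MDAT",
    255: "PBT End",
}


def tag_name(msg):
    """Return the tag name for one message, based on its first byte."""
    if not msg:
        return "empty"
    return TAG_NAMES.get(msg[0], "unknown ({})".format(msg[0]))


def summarize(messages):
    """Count how many messages of each tag were received."""
    return Counter(tag_name(m) for m in messages)


# Synthetic messages (not real stream data) in the order described above.
demo = [
    b"\xfa" + b"\x00" * 7,
    bytes.fromhex("f9000000"),
    b"\xfc\x00\x00\x00abc",
    b"\xfb\x00\x00\x00moof",
    b"\xfe\x00\x00\x00mdat",
    bytes.fromhex("ff000000"),
]
names = [tag_name(m) for m in demo]
print(names)
```

Running summarize over the list of received messages gives a quick check that the counts of PBT Begin and PBT End match.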

Solution

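import timeit

from websocket import create_connection

# url and headers are defined as in the question
ws = create_connection(url, headers=headers)
# Then send a message through the tunnel
ws.send('ping')

# Collect raw websocket messages for about 90 seconds
start = timeit.default_timer()
output = []
while True:
    output.append(ws.recv())
    if timeit.default_timer() - start > 90:
        break

# The first message is the MOOV header; drop its 8-byte prefix
result = output[0][8:]

for msg in output[1:]:
    tag = msg[0]
    if tag == 249:    # PBT Begin: reset the buffers
        moofmdat = b''
        moof = b''
    elif tag == 252:  # Video Buffer
        vidbuf = msg[4:]
    elif tag == 251:  # Header MOOF (arrives several times per block)
        moof += msg[4:]
    elif tag == 254:  # Header MDAT
        mdat = msg[4:]
    elif tag == 255:  # PBT End: append the cleaned block to the result
        moofmdat += moof
        moofmdat += mdat
        moofmdat += vidbuf
        result += moofmdat

with open('test.mp4', 'wb') as file:
    file.write(result)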

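Figured it out. The MOOV header message carries 8 bytes of unnecessary information that have to be removed. Every other message (except PBT_Begin and PBT_End) carries 4 bytes of player-specific data. Just clean up each message and put the pieces in the right order, then save the raw bytes as an .mp4 file and voilà: the video plays in VLC.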
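The file written by the code above is still made of moof/mdat fragments. If a conventional, non-fragmented MP4 is wanted, one option (not part of the original answer) is to remux it with ffmpeg using stream copy, so nothing is re-encoded. This is a sketch that assumes ffmpeg is installed and on the PATH; the function names and the output file name are hypothetical.

```python
import subprocess


def build_remux_command(src, dst, ffmpeg="ffmpeg"):
    """Return an ffmpeg argv that copies the streams from src into a new MP4 at dst."""
    # -y overwrites dst if it exists; -c copy copies streams without re-encoding.
    return [ffmpeg, "-y", "-i", src, "-c", "copy", dst]


def remux(src="test.mp4", dst="test_remuxed.mp4"):
    """Run the remux; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_remux_command(src, dst), check=True)
```

Calling remux() after test.mp4 has been written would produce test_remuxed.mp4 next to it.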

fmp4 java-websocket python web-scraping websocket
