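Problem description
I am trying to develop a mobile app that recognizes sign language. The following MediaPipe graph captures frames from the device camera for recognition.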
# Tracks and renders pose + hands + face landmarks.
# GPU buffer. (GpuBuffer)
input_stream: "input_video"
# GPU image with rendered results. (GpuBuffer)
output_stream: "output_video"
# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most parts of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessary computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.FlowLimiterCalculatorOptions] {
      max_in_flight: 1
      max_in_queue: 1
      # Timeout is disabled (set to 0) as first frame processing can take more
      # than 1 second.
      in_flight_timeout: 0
    }
  }
}

node {
  calculator: "SlrLandmarkGpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "POSE_LANDMARKS:pose_landmarks"
  output_stream: "POSE_ROI:pose_roi"
  output_stream: "POSE_DETECTION:pose_detection"
  output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
}

# Gets image size.
node {
  calculator: "ImagePropertiesCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  output_stream: "SIZE:image_size"
}

# Converts pose and hand landmarks to a render data vector.
node {
  calculator: "SlrTrackingToRenderData"
  input_stream: "IMAGE_SIZE:image_size"
  input_stream: "POSE_LANDMARKS:pose_landmarks"
  input_stream: "POSE_ROI:pose_roi"
  input_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  input_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
  output_stream: "RENDER_DATA_VECTOR:render_data_vector"
}

# Draws annotations and overlays them on top of the input images.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "VECTOR:render_data_vector"
  output_stream: "IMAGE_GPU:output_video_pre"
}

node {
  calculator: "SlrDetectionGpu"
  input_stream: "IMAGE:output_video_pre"
  output_stream: "MEANINGSIGNAL:slr_meaning_render_data"
}

# Draws annotations and overlays them on top of the input images.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "slr_meaning_render_data"
  input_stream: "VECTOR:render_data_vector"
  output_stream: "IMAGE_GPU:output_video"
}
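This graph lets me process one frame at a time, but I have not been able to change it to capture multiple frames. I want it to sample 6 frames per second (30 fps / 5), then group those frames and send them to the learning model for recognition. How can I make this change? I have tried, but I could not get it to capture multiple frames and I have run out of ideas. Any help is appreciated.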
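One possible direction for the sampling half of this, offered as a sketch rather than a tested fix: MediaPipe ships a PacketThinnerCalculator that drops packets so that at most one passes per configured period. The fragment below assumes it is inserted between the first AnnotationOverlayCalculator and SlrDetectionGpu, and that the app's build links in the calculator and its options proto. The stream name sampled_video_pre is hypothetical, and SlrDetectionGpu would have to read it instead of output_video_pre. The period is set slightly under 1/6 s on the assumption that the camera delivers a steady 30 fps with some timestamp jitter.

```
# Sketch only: lets roughly every fifth frame of a 30 fps stream through.
node {
  calculator: "PacketThinnerCalculator"
  input_stream: "output_video_pre"
  # Hypothetical stream name, not part of the original graph.
  output_stream: "sampled_video_pre"
  node_options: {
    [type.googleapis.com/mediapipe.PacketThinnerCalculatorOptions] {
      # ASYNC emits a packet once at least `period` has elapsed since the
      # previously emitted packet.
      thinner_type: ASYNC
      # Period is in microseconds. 1/6 s is about 166,667 us; 160,000 leaves
      # headroom so the fifth frame (about 166,667 us later) is not skipped.
      period: 160000
    }
  }
}
```

This covers only the 6 fps sampling. Grouping the sampled frames into a sequence for the model is not shown; it would most likely need a custom calculator or buffering inside the SlrDetectionGpu subgraph. Note also that the final overlay node would then receive slr_meaning_render_data only for sampled frames, so its input synchronization may need adjusting.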