Problem description
I'm hoping someone can help me with some ideas, or point me toward further reading, for creating a custom Android application with MediaPipe using the Iris .aar. I've pored over the official MediaPipe documentation but found it somewhat limited, and I'm now struggling to make progress. I've been trying to add the side packet the Iris model expects and to extract specific landmark coordinates in real time.
My goal is to create an open-source, gaze-direction-driven text-to-speech keyboard for accessibility, using a modified MediaPipe Iris solution to infer the user's gaze direction and control the app. I would really appreciate any help with this.
Here is my current development plan and progress so far:
- Set up MediaPipe from the command line and build the examples. DONE
- Generate the .aars for face detection and iris tracking. DONE
- Set up Android Studio to build MediaPipe apps. DONE
- Build and test the face detection example app using the .aar. DONE
- Modify the face detection example to use the Iris .aar. IN PROGRESS
- Output the coordinates of the iris and the eye edges, and the distance between them, to estimate gaze direction in real time; or, if possible, modify the graph and calculators to do the inference for me and rebuild the .aar (see the sketch after this list).
- Integrate gaze direction into the app's control scheme.
- Expand the app's functionality once the initial controls are in place.
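For the gaze-estimation step in the plan above, the arithmetic itself is straightforward once the iris centre and eye-corner landmarks are available. Here is a rough sketch of the kind of calculation I'm planning (assuming normalized landmark x-coordinates; the specific landmark indices for the iris centres and eye corners still need to be confirmed against what the graph actually outputs):

```java
/** Rough gaze estimate from normalized landmark coordinates (sketch only, untested). */
final class GazeEstimator {
  private GazeEstimator() {}

  /**
   * Returns roughly 0.5 when the iris sits centered between the two eye corners,
   * and moves toward 0.0 or 1.0 as the iris approaches either corner.
   */
  static float horizontalGazeRatio(float irisCenterX, float innerCornerX, float outerCornerX) {
    float eyeWidth = outerCornerX - innerCornerX;
    if (Math.abs(eyeWidth) < 1e-6f) {
      return 0.5f; // Degenerate eye width; treat as looking straight ahead.
    }
    return (irisCenterX - innerCornerX) / eyeWidth;
  }
}
```

Averaging this ratio over both eyes and over a few frames should damp jitter before it is mapped onto the keyboard controls.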
So far I've generated the Iris .aar using the build file below. Does the .aar I built include the calculators for the subgraphs and the main graph, or do I need to add something else to my AAR build file?
.aar build file:
load("//mediapipe/java/com/google/mediapipe:mediapipe_aar.bzl","mediapipe_aar")
mediapipe_aar(
name = "mp_iris_tracking_aar",calculators = ["//mediapipe/graphs/iris_tracking :iris_tracking_gpu_deps"],)
I currently have an Android Studio project that contains the following assets along with the aforementioned Iris .aar.
Android Studio Assets:
iris_tracking_gpu.binarypb
face_landmark.tflite
iris_landmark.tflite
face_detection_front.tflite
Right now I'm just trying to build it as-is, so I can better understand the process and verify that my build environment is set up correctly. I've successfully built and tested the face detection examples listed in the documentation and they run correctly, but when I modify the project to use the Iris .aar, it builds fine and then crashes at runtime with the exception: Side packet "focal_length_pixel" is required but was not provided.
I've tried adding the focal length code to onCreate, based on the Iris example in the MediaPipe repo, but I don't know how to modify it to work with the Iris .aar. Is there any other documentation I could read that would point me in the right direction?
I think I need to integrate this snippet into the modified face detection example code, but I'm not sure how. Thanks for your help :)
float focalLength = cameraHelper.getFocalLengthPixels();
if (focalLength != Float.MIN_VALUE) {
  Packet focalLengthSidePacket = processor.getPacketCreator().createFloat32(focalLength);
  Map<String, Packet> inputSidePackets = new HashMap<>();
  inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
  processor.setInputSidePackets(inputSidePackets);
}
haveAddedSidePackets = true;
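For context, in the MediaPipe Iris example this snippet sits inside the onCameraStarted callback rather than directly in onCreate, because the focal length is only known once the camera has started. Here is a sketch of how I believe startCamera() would look with it integrated (assuming the same field names as in my modified activity below; untested against my Iris .aar):

```java
private void startCamera() {
  cameraHelper = new CameraXPreviewHelper();
  cameraHelper.setOnCameraStartedListener(
      surfaceTexture -> {
        previewFrameTexture = surfaceTexture;
        previewDisplayView.setVisibility(View.VISIBLE);
        // The focal length is only available once the camera has started,
        // so the side packet is created and registered here, exactly once.
        if (!haveAddedSidePackets) {
          float focalLength = cameraHelper.getFocalLengthPixels();
          if (focalLength != Float.MIN_VALUE) {
            Packet focalLengthSidePacket =
                processor.getPacketCreator().createFloat32(focalLength);
            Map<String, Packet> inputSidePackets = new HashMap<>();
            inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
            processor.setInputSidePackets(inputSidePackets);
          }
          haveAddedSidePackets = true;
        }
      });
  cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);
}
```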
Modified Face Tracking AAR example:
package com.example.iristracking;
// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
import android.graphics.SurfaceTexture;
import android.os.Bundle;
import android.util.Log;
import java.util.HashMap;
import java.util.Map;
import androidx.appcompat.app.AppCompatActivity;
import android.util.Size;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.view.ViewGroup;
import com.google.mediapipe.components.CameraHelper;
import com.google.mediapipe.components.CameraXPreviewHelper;
import com.google.mediapipe.components.ExternalTextureConverter;
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.components.PermissionHelper;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.glutil.EglManager;
/** Main activity of MediaPipe example apps. */
public class MainActivity extends AppCompatActivity {
private static final String TAG = "MainActivity";
private boolean haveAddedSidePackets = false;
private static final String FOCAL_LENGTH_STREAM_NAME = "focal_length_pixel";
private static final String OUTPUT_LANDMARKS_STREAM_NAME = "face_landmarks_with_iris";
private static final String BINARY_GRAPH_NAME = "iris_tracking_gpu.binarypb";
private static final String INPUT_VIDEO_STREAM_NAME = "input_video";
private static final String OUTPUT_VIDEO_STREAM_NAME = "output_video";
private static final CameraHelper.CameraFacing CAMERA_FACING = CameraHelper.CameraFacing.FRONT;
// Flips the camera-preview frames vertically before sending them into FrameProcessor to be
// processed in a MediaPipe graph, and flips the processed frames back when they are displayed.
// This is needed because OpenGL represents images assuming the image origin is at the bottom-left
// corner, whereas MediaPipe in general assumes the image origin is at top-left.
private static final boolean FLIP_FRAMES_VERTICALLY = true;
static {
// Load all native libraries needed by the app.
System.loadLibrary("mediapipe_jni");
System.loadLibrary("opencv_java3");
}
// {@link SurfaceTexture} where the camera-preview frames can be accessed.
private SurfaceTexture previewFrameTexture;
// {@link SurfaceView} that displays the camera-preview frames processed by a MediaPipe graph.
private SurfaceView previewDisplayView;
// Creates and manages an {@link EGLContext}.
private EglManager eglManager;
// Sends camera-preview frames into a MediaPipe graph for processing, and displays the processed
// frames onto a {@link Surface}.
private FrameProcessor processor;
// Converts the GL_TEXTURE_EXTERNAL_OES texture from Android camera into a regular texture to be
// consumed by {@link FrameProcessor} and the underlying MediaPipe graph.
private ExternalTextureConverter converter;
// Handles camera access via the {@link CameraX} Jetpack support library.
private CameraXPreviewHelper cameraHelper;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
previewDisplayView = new SurfaceView(this);
setupPreviewDisplayView();
// Initialize asset manager so that MediaPipe native libraries can access the app assets, e.g.,
// binary graphs.
AndroidAssetUtil.initializeNativeAssetManager(this);
eglManager = new EglManager(null);
processor =
new FrameProcessor(
this, eglManager.getNativeContext(), BINARY_GRAPH_NAME, INPUT_VIDEO_STREAM_NAME, OUTPUT_VIDEO_STREAM_NAME);
processor.getVideoSurfaceOutput().setFlipY(FLIP_FRAMES_VERTICALLY);
PermissionHelper.checkAndRequestCameraPermissions(this);
}
@Override
protected void onResume() {
super.onResume();
converter = new ExternalTextureConverter(eglManager.getContext());
converter.setFlipY(FLIP_FRAMES_VERTICALLY);
converter.setConsumer(processor);
if (PermissionHelper.cameraPermissionsGranted(this)) {
startCamera();
}
}
@Override
protected void onPause() {
super.onPause();
converter.close();
}
@Override
public void onRequestPermissionsResult(
int requestCode, String[] permissions, int[] grantResults) {
super.onRequestPermissionsResult(requestCode, permissions, grantResults);
PermissionHelper.onRequestPermissionsResult(requestCode, grantResults);
}
private void setupPreviewDisplayView() {
previewDisplayView.setVisibility(View.GONE);
ViewGroup viewGroup = findViewById(R.id.preview_display_layout);
viewGroup.addView(previewDisplayView);
previewDisplayView
.getHolder()
.addCallback(
new SurfaceHolder.Callback() {
@Override
public void surfaceCreated(SurfaceHolder holder) {
processor.getVideoSurfaceOutput().setSurface(holder.getSurface());
}
@Override
public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
// (Re-)Compute the ideal size of the camera-preview display (the area that the
// camera-preview frames get rendered onto, potentially with scaling and rotation)
// based on the size of the SurfaceView that contains the display.
Size viewSize = new Size(width, height);
Size displaySize = cameraHelper.computeDisplaySizeFromViewSize(viewSize);
// Connect the converter to the camera-preview frames as its input (via
// previewFrameTexture), and configure the output width and height as the computed
// display size.
converter.setSurfaceTextureAndAttachToGLContext(
previewFrameTexture, displaySize.getWidth(), displaySize.getHeight());
}
@Override
public void surfaceDestroyed(SurfaceHolder holder) {
processor.getVideoSurfaceOutput().setSurface(null);
}
});
}
private void startCamera() {
cameraHelper = new CameraXPreviewHelper();
cameraHelper.setOnCameraStartedListener(
surfaceTexture -> {
previewFrameTexture = surfaceTexture;
// Make the display view visible to start showing the preview. This triggers the
// SurfaceHolder.Callback added to (the holder of) previewDisplayView.
previewDisplayView.setVisibility(View.VISIBLE);
});
cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);
}
}
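To actually read the landmark coordinates out of the graph (the OUTPUT_LANDMARKS_STREAM_NAME constant above), my understanding from the MediaPipe example apps is that a packet callback can be registered on the processor right after it is constructed in onCreate. A sketch of what I think that would look like, assuming the stream emits a NormalizedLandmarkList proto as in the upstream iris example (the PacketGetter and proto classes come from the MediaPipe framework, so this needs verifying against the version the .aar was built from):

```java
// Additional imports this would need:
// import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmarkList;
// import com.google.mediapipe.framework.PacketGetter;
// import com.google.protobuf.InvalidProtocolBufferException;

processor.addPacketCallback(
    OUTPUT_LANDMARKS_STREAM_NAME,
    (packet) -> {
      try {
        byte[] landmarksRaw = PacketGetter.getProtoBytes(packet);
        NormalizedLandmarkList landmarks = NormalizedLandmarkList.parseFrom(landmarksRaw);
        // Normalized (0..1) coordinates; the iris/eye-corner entries would be
        // picked out of this list and fed into the gaze-ratio calculation.
        Log.v(TAG, "Received " + landmarks.getLandmarkCount() + " landmarks");
      } catch (InvalidProtocolBufferException e) {
        Log.e(TAG, "Failed to parse landmarks packet", e);
      }
    });
```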
Solution
override fun onResume() {
super.onResume()
converter = ExternalTextureConverter(eglManager?.context, NUM_BUFFERS)
if (PermissionHelper.cameraPermissionsGranted(this)) {
var rotation: Int = 0
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.R) {
rotation = this.display!!.rotation
} else {
rotation = this.windowManager.defaultDisplay.rotation
}
converter!!.setRotation(rotation)
converter!!.setFlipY(FLIP_FRAMES_VERTICALLY)
startCamera(rotation)
if (!haveAddedSidePackets) {
val packetCreator = mediapipeFrameProcessor!!.getPacketCreator()
val inputSidePackets = mutableMapOf<String, Packet>()
focalLength = cameraHelper?.focalLengthPixels!!
Log.i(TAG_MAIN, "OnStarted focalLength: ${cameraHelper?.focalLengthPixels!!}")
inputSidePackets.put(
FOCAL_LENGTH_STREAM_NAME, packetCreator.createFloat32(focalLength.width.toFloat())
)
mediapipeFrameProcessor!!.setInputSidePackets(inputSidePackets)
haveAddedSidePackets = true
val imageSize = cameraHelper!!.imageSize
val calibrateMatrix = Matrix()
// 3x3 camera intrinsic matrix, row-major: [fx 0 cx; 0 fy cy; 0 0 1].
calibrateMatrix.setValues(
floatArrayOf(
focalLength.width * 1.0f, 0.0f, imageSize.width / 2.0f,
0.0f, focalLength.height * 1.0f, imageSize.height / 2.0f,
0.0f, 0.0f, 1.0f
)
)
val isInvert = calibrateMatrix.invert(matrixPixels2World)
if (!isInvert) {
matrixPixels2World = Matrix()
}
}
converter!!.setConsumer(mediapipeFrameProcessor)
}
}
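A couple of notes on why this works: the focal length is only known after the camera has started, so the focal_length_pixel side packet is created in onResume once camera permission is granted, and it is set on the FrameProcessor before converter.setConsumer(mediapipeFrameProcessor) starts feeding frames into the graph; the haveAddedSidePackets flag ensures it is only set once, since side packets cannot be replaced after the graph has started running. The Matrix assembled from the focal length and image size is effectively a 3x3 camera intrinsic matrix, and its inverse (matrixPixels2World) is kept around to map pixel coordinates back toward camera-space coordinates when turning landmarks into a gaze estimate.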