How does head elevation work in Python-OpenCV?

Problem description

I am mostly following this guide to estimate the head pose from a single image: https://towardsdatascience.com/real-time-head-pose-estimation-in-python-e52db1bc606a

Face detection works well: if I plot the image together with the detected landmarks, they line up nicely.

I estimate the camera matrix from the image and assume there is no lens distortion:

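    import cv2
    import numpy as np

    size = image.shape  # (height, width[, channels])
    focal_length = size[1]  # approximate the focal length by the image width
    center = (size[1] / 2, size[0] / 2)  # assume the optical centre is the image centre
    camera_matrix = np.array([[focal_length, 0, center[0]],
                              [0, focal_length, center[1]],
                              [0, 0, 1]], dtype="double")
    dist_coeffs = np.zeros((4, 1))  # Assuming no lens distortion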
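
To get a feel for the numbers this approximation produces, here is a minimal sketch that builds the same matrix for a hypothetical 640x480 image. The helper name `approx_camera_matrix` is mine, not from the guide. It assumes the focal length in pixels equals the image width and that the principal point is the image centre.

```python
import numpy as np


def approx_camera_matrix(height, width):
    """Approximate pinhole intrinsics for an uncalibrated image."""
    focal_length = float(width)          # assumption: focal length ~ image width (pixels)
    cx, cy = width / 2.0, height / 2.0   # assumption: optical centre is the image centre
    return np.array([[focal_length, 0.0, cx],
                     [0.0, focal_length, cy],
                     [0.0, 0.0, 1.0]], dtype="double")


# Hypothetical 640x480 image (height=480, width=640).
K = approx_camera_matrix(480, 640)
print(K)
```

These values are only a rough stand-in for a real calibration, so the absolute translation that comes out of the pose solver should not be trusted too much.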

I am trying to get the head pose by using solvePnP to match points in the image against points of a 3D model:

    # 3D-model points to which the points extracted from an image are matched:
    model_points = np.array([
                                (0.0, 0.0, 0.0),           # Nose tip
                                (0.0, -330.0, -65.0),      # Chin
                                (-225.0, 170.0, -135.0),   # Left eye corner
                                (225.0, 170.0, -135.0),    # Right eye corner
                                (-150.0, -150.0, -125.0),  # Left mouth corner
                                (150.0, -150.0, -125.0)    # Right mouth corner
                            ])

    # shape holds the detected facial landmarks as (x, y) pixel coordinates
    image_points = np.array([
                                shape[30],  # Nose tip
                                shape[8],   # Chin
                                shape[36],  # Left eye left corner
                                shape[45],  # Right eye right corner
                                shape[48],  # Left mouth corner
                                shape[54]   # Right mouth corner
                            ], dtype="double")

    (success, rotation_vec, translation_vec) = \
            cv2.solvePnP(model_points, image_points, camera_matrix, dist_coeffs)
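
A quick way to check the result of solvePnP is to project the model points back into the image with the estimated pose and compare them with the detected landmarks. The sketch below does that projection by hand with NumPy, assuming an ideal pinhole camera with no distortion (as above). The function name `project_points` and the pose and intrinsics in the usage lines are made up for illustration. In practice `cv2.projectPoints` does the same job.

```python
import numpy as np


def project_points(model_points, rotation_mat, translation_vec, camera_matrix):
    """Project Nx3 model points to Nx2 pixel coordinates (pinhole, no distortion)."""
    pts = np.asarray(model_points, dtype="double")
    t = np.asarray(translation_vec, dtype="double").reshape(3)
    cam = pts @ np.asarray(rotation_mat).T + t   # model frame -> camera frame
    uvw = cam @ np.asarray(camera_matrix).T      # apply the intrinsics
    return uvw[:, :2] / uvw[:, 2:3]              # perspective divide


# Hypothetical pose: no rotation, model origin 1000 units in front of the camera.
K = np.array([[640.0, 0.0, 320.0],
              [0.0, 640.0, 240.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 0.0],        # nose tip (model origin)
                   [0.0, -330.0, -65.0]])  # chin
pixels = project_points(points, np.eye(3), [0.0, 0.0, 1000.0], K)
print(pixels)
```

With the identity rotation the chin lands above the nose in the image (smaller y), because the model's Y axis points up while image rows grow downward. So a pose that fits an upright face has to include a rotation of roughly 180 degrees.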

Finally, I get the Euler angles from the rotation:

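    rotation_mat, _ = cv2.Rodrigues(rotation_vec)  # rotation vector -> 3x3 matrix
    pose_mat = cv2.hconcat((rotation_mat, translation_vec))  # 3x4 matrix [R | t]
    # decomposeProjectionMatrix returns seven values; the last one holds the Euler angles in degrees
    *_, angles = cv2.decomposeProjectionMatrix(pose_mat)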
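
As an illustration of what this last step computes, the sketch below extracts Euler angles from a rotation matrix by hand. It assumes the convention R = Rz·Ry·Rx and returns degrees. `cv2.decomposeProjectionMatrix` may use a different axis order or sign convention, so treat the helpers (`rot_x` and `euler_xyz_deg`, both made-up names) as an illustration rather than a drop-in replacement.

```python
import numpy as np


def rot_x(deg):
    """Rotation matrix for a rotation of `deg` degrees about the X axis."""
    a = np.radians(deg)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a), np.cos(a)]])


def euler_xyz_deg(R):
    """Angles (x, y, z) in degrees, assuming R = Rz @ Ry @ Rx."""
    x = np.arctan2(R[2, 1], R[2, 2])
    y = np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2]))
    z = np.arctan2(R[1, 0], R[0, 0])
    return np.degrees([x, y, z])


# Two rotations about X that are only 20 degrees apart.
for deg in (170.0, 190.0):
    print(deg, euler_xyz_deg(rot_x(deg)))
```

The two rotations differ by only 20 degrees, yet the extracted X angle comes back as +170 for the first and -170 for the second, because `arctan2` reports angles in the range from -180 to 180 degrees.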

The azimuth now behaves as I expect: negative if I look to the left, zero in the middle, and positive to the right.

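The elevation, however, is strange: if I look straight ahead, its magnitude is roughly constant (about 170), but the sign is random and changes from image to image.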

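When I look up, the sign is positive and the value gets smaller the further up I look. When I look down, the sign is negative and the value gets smaller the further down I look.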

Can someone explain this output to me?
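
One possible reading of these numbers, which the post itself does not confirm: +170 and -170 degrees are only 20 degrees apart on the circle, so a value that hovers around ±180 will flip sign from image to image while the head barely moves. Under that assumption, re-centring the angle on 180 degrees gives a continuous value. A minimal sketch (the helper name `recenter_on_180` is made up):

```python
def recenter_on_180(angle_deg):
    """Signed offset of an angle from 180 degrees, in [-180, 180)."""
    # The modulo maps e.g. -170 to 190, so both signs end up on the same side of the wrap.
    return (angle_deg % 360.0) - 180.0


for raw in (160.0, 170.0, -170.0, -160.0):
    print(raw, "->", recenter_on_180(raw))
```

With the values described above, looking up (+170, +160) maps to -10, -20 and looking down (-170, -160) maps to +10, +20. The result can be negated if a positive-up convention is preferred.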

Solution

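OK, it seems I have found a solution: the model points (which I found in several blogs on the topic) appear to be wrong. The code seems to work with the following combination of model points and image points (I don't know why; I arrived at it by trial and error):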

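    model_points = np.float32([[6.825897, 6.760612, 4.402142],
                               [1.330353, 7.122144, 6.903745],
                               [-1.330353, 7.122144, 6.903745],
                               [-6.825897, 6.760612, 4.402142],
                               [5.311432, 5.485328, 3.987654],
                               [1.789930, 5.393625, 4.413414],
                               [-1.789930, 5.393625, 4.413414],
                               [-5.311432, 5.485328, 3.987654],
                               [2.005628, 1.409845, 6.165652],
                               [-2.005628, 1.409845, 6.165652],
                               [2.774015, -2.080775, 5.048531],
                               [-2.774015, -2.080775, 5.048531],
                               [0.000000, -3.116408, 6.097667],
                               [0.000000, -7.415691, 4.070434]])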

    image_points = np.float32([shape[17], shape[21], shape[22], shape[26],
                               shape[36], shape[39], shape[42], shape[45],
                               shape[31], shape[35], shape[48], shape[54],
                               shape[57], shape[8]])