Problem description
net = cv2.dnn.readNetFromCaffe(prototxt,model)
detections = net.forward()
detections will be a 4D array of shape (1, 1, 200, 7). What do the different values mean?
for i in range(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
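The loop above iterates over the third dimension and reads the confidence value, treating the third dimension as rows and the fourth as columns, so it is clear that this value is the confidence (probability) of each detected object. But I don't understand the other values.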
box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
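To build a box, the code above uses the values in columns 3 through 6. So what do the values in columns 2 through 6 represent?
The code below reproduces the values...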
from imutils.video import VideoStream
import imutils
import numpy as np
import cv2
import argparse
import time
# construct the argument parser
ap = argparse.ArgumentParser()
ap.add_argument("-p","--prototxt",required=True,help="path to caffe deploy prototxt file")
ap.add_argument("-m","--model",required=True,help="path to pre-trained caffe model")
ap.add_argument("-c","--confidence",type=float,default=0.5,help="minimum confidence required")
args = vars(ap.parse_args())
# MODEL - load the model which will be used to predict
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"],args["model"])
# INPUT - start video to capture frames
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
# vs = VideoStream(usePiCamera=True).start() # this is to stream video for Raspberry Pi camera
# vs = FileVideoStream(path='/path to file') # this is to get video content from file
time.sleep(2) # allow the cam to warm up
# for live streaming,while loop will be required to capture frames
while True:
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    # get the dimensions of the frame
    print('shape of frame:', frame.shape)
    (h, w) = frame.shape[:2]
    # blob the frame
    blob = cv2.dnn.blobFromImage(frame, scalefactor=1.0, size=(300, 300), mean=(104.0, 177.0, 123.0))
    # use the blob for detection
    net.setInput(blob)
    detections = net.forward()
    print(detections)
    # loop over the detections (one row per detection in the third dimension)
    for i in range(0, detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence < args['confidence']:
            continue
        # create box: scale the normalized coordinates to the frame size
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        # draw the box on the face
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 255, 0), thickness=2)
        cv2.putText(frame, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)
    # show on screen
    cv2.imshow("frame", frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cv2.destroyAllWindows()
vs.stop()
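Solution
Each of the 200 rows in the third dimension is one detection, and the 7 values in the fourth dimension describe it. Index 0 is the ID of the image within the batch, index 1 is the class ID the detection belongs to, and index 2 is the confidence, which is compared against the confidence threshold to decide whether the frame contains an object there. Indices 3:7 (that is, 3, 4, 5 and 6) hold the coordinates of the box around the object (startX, startY, endX, endY). They are stored as floats in a NumPy array, normalized to the range 0 to 1, which is why they are multiplied by [w, h, w, h] and then converted to int before the box is drawn.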
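To make the column layout concrete, here is a minimal sketch that decodes an array of the same shape without needing a camera or a model. The helper name decode_detections and the synthetic array fake are made up for illustration and are not part of OpenCV; the column meanings assume the usual SSD-style output [image_id, class_id, confidence, x_min, y_min, x_max, y_max].

```python
import numpy as np

def decode_detections(detections, frame_w, frame_h, min_confidence=0.5):
    """Turn a (1, 1, N, 7) detection array into a list of plain dicts.

    Hypothetical helper: assumes each row is
    [image_id, class_id, confidence, x_min, y_min, x_max, y_max]
    with the four coordinates normalized to the range 0..1.
    """
    results = []
    scale = np.array([frame_w, frame_h, frame_w, frame_h])
    for i in range(detections.shape[2]):
        image_id, class_id, confidence = detections[0, 0, i, 0:3]
        if confidence < min_confidence:
            continue  # weak detection, skip it
        # columns 3..6: scale normalized corners to pixel coordinates
        box = (detections[0, 0, i, 3:7] * scale).astype("int")
        results.append({
            "image_id": int(image_id),
            "class_id": int(class_id),
            "confidence": float(confidence),
            "box": tuple(int(v) for v in box),  # (startX, startY, endX, endY)
        })
    return results

# Synthetic stand-in for net.forward(): two detections instead of 200
fake = np.zeros((1, 1, 2, 7), dtype=np.float32)
fake[0, 0, 0] = [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0]  # kept: confidence 0.9
fake[0, 0, 1] = [0, 1, 0.2, 0.0, 0.0, 0.5, 0.5]    # dropped: confidence 0.2

decoded = decode_detections(fake, frame_w=400, frame_h=300)
print(decoded)
```

With a real network you would pass the result of net.forward() together with the frame's width and height instead of the synthetic array.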