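Problem description
I am trying to run OCR on a set of images that are similar but may differ in size. For some reason I cannot get predictable results. What can I do to get better results?
Tesseract, with or without cv2 preprocessing, works well on some images and fails on others, with no apparent pattern, even though the images are more or less alike. The upper image shows the processed image.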
import cv2
import numpy as np
from PIL import Image, ImageOps

def filter_img(img):
    # Convert the PIL image to a NumPy array for cv2 (channel order stays RGB)
    img = np.array(img)
    img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    # Convert to grayscale (required before thresholding)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # Apply dilation and erosion to remove some noise
    # (note: a 1x1 kernel leaves the image unchanged)
    kernel = np.ones((1, 1), np.uint8)
    # img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Apply blur to smooth out the edges
    img = cv2.GaussianBlur(img, (5, 5), 0)
    # img = cv2.medianBlur(img, 5)
    # Apply Otsu thresholding to get a black-and-white image (binarization);
    # the threshold argument (0) is ignored when THRESH_OTSU is set
    img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    img = Image.fromarray(img)
    img = ImageOps.expand(img, border=2, fill='black')
    # Debugging helper; `visualize` and `boxes` are not defined in this snippet
    # visualize.show_labeled_image(img, boxes)
    return img
import string
import pytesseract

# Applying Tesseract OCR
def run_tesseract(img):
    # Tesseract cmd setup
    # pytesseract.pytesseract.tesseract_cmd = "tesseract"
    whitelist = string.ascii_uppercase + string.digits + ".-"
    parameters = '-c load_freq_dawg=0 -c tessedit_char_whitelist="{}"'.format(whitelist)
    psm = 8  # treat the image as a single word
    custom_oem_psm_config = "--dpi 300 --oem 3 --psm {psm} {parameters}".format(parameters=parameters, psm=psm)
    try:
        text = pytesseract.image_to_string(img, config=custom_oem_psm_config, timeout=2)
        return text.strip()
    except RuntimeError:
        # pytesseract raises RuntimeError when the timeout expires
        print("TIMEOUT")
        return ""
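Solution
If the image layout is highly consistent, consider cropping the image to the region of interest before running OCR. After OCR, handle characters that are easily confused (for example 0 and O) with a conditional check on the first letter or digit. Of course, all of this only works when the images are highly consistent.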
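The conditional check on confusable characters could look roughly like the sketch below. It is only an illustration: the function name `fix_first_char`, the lookup table, and the assumption that you know in advance whether the first character should be a letter or a digit are all hypothetical and depend on the format of your labels.

```python
# Hypothetical table of digits that OCR commonly confuses with uppercase letters.
DIGIT_TO_LETTER = {"0": "O", "1": "I", "5": "S", "8": "B"}
# Reverse mapping, for positions where a digit is expected.
LETTER_TO_DIGIT = {letter: digit for digit, letter in DIGIT_TO_LETTER.items()}


def fix_first_char(text, first_is_letter):
    """Correct the first character of an OCR result when its expected type is known.

    Assumption: the caller knows from the label format whether the first
    character must be a letter (first_is_letter=True) or a digit (False).
    """
    if not text:
        return text
    first, rest = text[0], text[1:]
    if first_is_letter:
        # A digit in a letter position is replaced by its look-alike letter.
        first = DIGIT_TO_LETTER.get(first, first)
    else:
        # A letter in a digit position is replaced by its look-alike digit.
        first = LETTER_TO_DIGIT.get(first, first)
    return first + rest
```

You would apply it to the stripped string returned by `pytesseract.image_to_string`. The same idea extends to other positions if the whole label follows a fixed pattern.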
import cv2
import numpy as np
import pytesseract
import matplotlib.pyplot as plt

pytesseract.pytesseract.tesseract_cmd = 'D://Program Files/Tesseract-OCR/tesseract.exe'
img = cv2.imread('vATKQ.png')
img2 = img[100:250, 180:650]  # crop to the region you want (rows, then columns)
plt.imshow(img2)
plt.show()
text = pytesseract.image_to_string(img2)
print(text)
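The slice above uses fixed pixel coordinates, which will not line up if the images differ in size as described in the question. One possible workaround, assuming the text sits at the same relative position in every image, is to express the crop as fractions of the image size. The helper name `relative_crop_bounds` and the fractions in the usage comment are hypothetical; you would measure the real fractions once on a sample image.

```python
def relative_crop_bounds(shape, top, bottom, left, right):
    """Convert fractional bounds (0.0 to 1.0) into pixel indices.

    `shape` is the image shape as returned by a cv2/NumPy image
    (height, width, ...). Returns (y0, y1, x0, x1).
    """
    height, width = shape[:2]
    return (
        round(top * height),
        round(bottom * height),
        round(left * width),
        round(right * width),
    )


# Usage with an image loaded by cv2 (fractions are placeholders):
# y0, y1, x0, x1 = relative_crop_bounds(img.shape, 0.25, 0.625, 0.25, 0.75)
# img2 = img[y0:y1, x0:x1]
```

This only helps when the layout scales uniformly with the image size; if the text region moves around, some form of text detection would be needed instead.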