pytesseract image_to_string function is not accurate at all

Problem description

My code:

import base64
from io import BytesIO

from PIL import Image
import pytesseract

for index, img in enumerate(data):  # data is a list of base64-encoded image strings
    b64 = base64.b64decode(bytes(img[22:], encoding='utf-8'))
    raw = BytesIO(b64)
    im = Image.open(raw).convert('LA')
    pixels = im.load()
    width, height = im.size
    # Threshold every pixel to pure white or pure black (alpha stays opaque)
    for x in range(width):
        for y in range(height):
            if pixels[x, y][0] > 100:
                pixels[x, y] = (255, 255)
            else:
                pixels[x, y] = (0, 255)
    print(pytesseract.image_to_string(im, config='tessedit_char_whitelist=1234567890plus?'))

My image:

(image not included)

Output:

Te Ys

What can I do to make this better? I have tried every psm value from 0 to 13 and the -c flag in the config.

Solution

This code works well for me, but it does not detect spaces.

    import cv2
    import pytesseract

    # Read the image as grayscale (flag 0) and invert the pixel values with ~
    img = ~cv2.imread("18.png", 0)
    rows, cols = img.shape[:2]
    # M = np.float32([[1,0,25],[0,1,15]])
    # img = cv2.warpAffine(img, M, (cols*2, rows*2), borderValue=(255,255,255))
    custom_oem_psm_config = r'--oem 3 --psm 3 -c tessedit_char_whitelist="1234567890plus?"'  # -c preserve_interword_spaces=1'
    print(pytesseract.image_to_string(img, config=custom_oem_psm_config))

Output:

18plus16?
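
For the base64 input from the question, the same idea can probably be applied without writing the image to disk. The following is only a minimal sketch under a few assumptions: each entry is a data URI such as `data:image/png;base64,...` (which is what the `img[22:]` slice suggests), and the helper names `decode_data_uri`, `threshold_table`, `build_config`, and `ocr_data_uri` are hypothetical, not part of pytesseract. It passes the whitelist with `-c`, as the config above does; the config string in the question omits `-c`, which may be why the whitelist had no effect there.

```python
import base64
from io import BytesIO


def decode_data_uri(uri):
    """Return the raw bytes of a base64 data URI (or of a bare base64 string)."""
    # Split on the first comma instead of hard-coding a 22-character prefix.
    _, sep, payload = uri.partition(",")
    return base64.b64decode(payload if sep else uri)


def threshold_table(threshold=100, invert=False):
    """Build a 256-entry lookup table that maps gray levels to 0 or 255."""
    above, below = (0, 255) if invert else (255, 0)
    return [above if value > threshold else below for value in range(256)]


def build_config(whitelist, psm=3, oem=3):
    """Build a Tesseract config string; the whitelist needs the -c flag."""
    return f"--oem {oem} --psm {psm} -c tessedit_char_whitelist={whitelist}"


def ocr_data_uri(uri, whitelist="1234567890plus?", invert=False):
    """Decode, binarize, and OCR one image (hypothetical helper)."""
    # Third-party imports are kept local so the helpers above work without them.
    from PIL import Image
    import pytesseract

    im = Image.open(BytesIO(decode_data_uri(uri))).convert("L")
    # Image.point applies the lookup table to every pixel in one call,
    # replacing the nested per-pixel loops.
    im = im.point(threshold_table(100, invert))
    return pytesseract.image_to_string(im, config=build_config(whitelist))
```

Whether `invert=True` is needed depends on the image. The `~` in front of `cv2.imread` above inverts the pixel values, so it is worth trying both settings.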
