Converting this captcha image to text in Python

Problem description

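I have been trying to use the PyTesseract library to convert an image to text. I managed to capture the captcha and tried to read it, but the text is not recognized well and the captcha check fails. I don't know what I'm doing wrong. I searched many sites but couldn't find anything specific to this case. I really need help.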

Here is my Python code:

from PIL import Image, ImageFilter
import pytesseract as pt
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

rif = 'J400607396'
# Chrome driver (Selenium 3 style call; Selenium 4 passes the path through a Service object)
browser = webdriver.Chrome('C:\\Users\\USUARIO\\Desktop\\chromedriver.exe')
# Lookup page
browser.get('http://contribuyente.seniat.gob.ve/BuscaRif/BuscaRif.jsp')
# Fill in the RIF before pressing the search button
p_rif = browser.find_element(By.ID, 'p_rif')  # RIF input field
p_rif.send_keys(rif)

# Captcha image: take a full-page screenshot, then crop the captcha region

screenshot_name = "captcha.png"
browser.save_screenshot(screenshot_name)

img = Image.open("captcha.png")
area = (143, 205, 263, 235)  # left, upper, right, lower, in screenshot pixels
cropped_img = img.crop(area)
cropped_img.save('captcha.png')

img = Image.open("captcha.png")

# This is the problem: the recognized text is often wrong
captcha = pt.image_to_string(img, config='--psm 10 -c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz')
captcha = captcha.replace(" ", "").strip()
print(captcha)
codigo = browser.find_element(By.ID, 'codigo')  # captcha input field
codigo.send_keys(captcha)
nextButton = browser.find_element(By.NAME, 'busca')
nextButton.click()
browser.close()
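
The hard-coded `area` tuple in the code above only matches one window size and screen scaling. As a rough sketch, the crop box could instead be derived from the captcha element's `location` and `size` (both are standard Selenium `WebElement` attributes). The helper name `crop_box`, the `pixel_ratio` parameter, and the element id in the usage comment are assumptions made for illustration; they do not come from the page or the original code.

```python
def crop_box(location, size, pixel_ratio=1.0):
    """Build a PIL crop box (left, upper, right, lower) from element geometry.

    location: dict with "x" and "y", as returned by WebElement.location
    size: dict with "width" and "height", as returned by WebElement.size
    pixel_ratio: assumed ratio of screenshot pixels to CSS pixels
                 (1.0 on a standard display, often 2.0 on high-DPI screens)
    """
    left = location["x"] * pixel_ratio
    top = location["y"] * pixel_ratio
    right = left + size["width"] * pixel_ratio
    bottom = top + size["height"] * pixel_ratio
    return tuple(int(round(v)) for v in (left, top, right, bottom))


# Hypothetical usage with the objects from the question
# (the id "captcha_img" is a placeholder, not the real element id):
#   element = browser.find_element(By.ID, "captcha_img")
#   cropped_img = img.crop(crop_box(element.location, element.size))
```

With an element at (143, 205) measuring 120 by 30 pixels, this yields the same box as the hard-coded tuple in the question.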

I need to convert the image to text.
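
One thing that may be worth trying is cleaning up the image before OCR and changing the page segmentation mode. `--psm 10` tells Tesseract to treat the image as a single character, while `--psm 7` treats it as a single line of text, which is closer to a multi-character captcha. The sketch below is not a confirmed fix for this particular captcha. The function names and the `scale` and `threshold` values are assumptions that would need tuning against real samples.

```python
import string

from PIL import Image, ImageOps

# Same character set as the whitelist in the question
WHITELIST = string.digits + string.ascii_lowercase


def preprocess_captcha(img, scale=3, threshold=140):
    """Return an enlarged, black-and-white copy of a captcha image.

    scale and threshold are guesses; adjust them for the actual captcha.
    """
    gray = ImageOps.grayscale(img)
    # Enlarge the image so the glyphs are bigger
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Stretch the contrast so dark text and light background separate better
    big = ImageOps.autocontrast(big)
    # Binarize: every pixel becomes pure black or pure white
    return big.point(lambda p: 255 if p > threshold else 0)


def read_captcha(img):
    """Run Tesseract on the preprocessed image, treating it as one text line."""
    import pytesseract  # imported here so preprocessing can be tested without it

    config = "--psm 7 -c tessedit_char_whitelist=" + WHITELIST
    return pytesseract.image_to_string(preprocess_captcha(img), config=config)
```

In the code from the question, `read_captcha(cropped_img)` would take the place of the direct `pt.image_to_string(...)` call. Saving the output of `preprocess_captcha` to a file and looking at it is a quick way to judge whether the threshold suits the captcha.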

I'll attach a sample captcha. The captcha changes every time the page is refreshed.
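
Because the captcha changes on every refresh, a bad read cannot be retried against the same image, so it may help to sanity-check the OCR result before submitting it. The sketch below filters the raw OCR string down to the whitelisted characters and checks its length. The function names and the `expected_length` parameter are hypothetical, since the real length of this site's captcha is not stated in the question.

```python
import string

# Characters the captcha is assumed to contain (same set as the question's whitelist)
ALLOWED = set(string.digits + string.ascii_lowercase)


def clean_ocr_text(raw, allowed=ALLOWED):
    """Drop spaces, newlines, form feeds, and any character outside `allowed`."""
    return "".join(ch for ch in raw if ch in allowed)


def looks_plausible(text, expected_length):
    """Return True if the cleaned text has the length the captcha should have.

    expected_length is an assumption supplied by the caller.
    """
    return len(text) == expected_length


# Hypothetical usage with the variables from the question:
#   captcha = clean_ocr_text(pt.image_to_string(img, config=...))
#   if looks_plausible(captcha, expected_length=5):
#       codigo.send_keys(captcha)
```

Unlike `replace(" ", "").strip()`, the filter also removes whitespace and stray symbols in the middle of the string.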

python python-tesseract