Extracting text from multiple images

Programming Q&A 2022-05-16

Problem description

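I want to extract text from multiple images.
I want to do this in Colab.
I know how to do it for a single image: https://github.com/bhadreshpsavani/ExploringOCR/blob/master/OCRusingTesseract.ipynb
But how do I write a loop for it, since I have more than a hundred images?
Thanks in advance!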

Solution

I uploaded the images to the root directory in Colab (colab.research) and solved the task with the following code:

import os

import pytesseract
from PIL import Image

image_ext = ['.jpg', '.png', '.jpeg']
directory = '/'
for file in os.listdir(directory):
  # Skip anything that is not an image file
  ext = os.path.splitext(file)[-1].lower()
  if ext not in image_ext:
    continue
  filename = os.path.join(directory, file)

  # Run OCR on the image and print the recognized text
  extracted_information = pytesseract.image_to_string(Image.open(filename))
  print(extracted_information)
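
With more than a hundred images, it may be more convenient to keep the results instead of only printing them. The following is a minimal sketch of that idea, not part of the original answer: the helper names `list_images`, `ocr_with_tesseract`, and `extract_all` are hypothetical, and it assumes `pytesseract` and Pillow are installed, as in the code above. A file that fails to open is recorded as `None` so that one bad image does not stop the whole loop.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(directory):
    """Return image paths in `directory`, sorted by name for a stable order."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def ocr_with_tesseract(path):
    """OCR a single image (hypothetical helper wrapping the call used above)."""
    # Imported lazily so the directory helpers work even without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def extract_all(directory, ocr=ocr_with_tesseract):
    """Map each image file name to its extracted text.

    A file that raises during OCR is stored as None instead of
    aborting the whole batch.
    """
    results = {}
    for path in list_images(directory):
        try:
            results[path.name] = ocr(path)
        except Exception as exc:
            print(f"skipping {path.name}: {exc}")
            results[path.name] = None
    return results
```

Calling `extract_all('/')` would cover the same directory as the loop above and return a dictionary keyed by file name, which can then be written to text files or searched as needed.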

