How can I extract, for example, line 5 from a PDF file using a pdfminer script like this?

Problem description

from pdfminer3.layout import LAParams, LTTextBox
from pdfminer3.pdfpage import PDFPage
from pdfminer3.pdfinterp import PDFResourceManager
from pdfminer3.pdfinterp import PDFPageInterpreter
from pdfminer3.converter import PDFPageAggregator
from pdfminer3.converter import TextConverter
import io


# The converter writes the extracted text into an in-memory buffer.
resource_manager = PDFResourceManager()
fake_file_handle = io.StringIO()
converter = TextConverter(resource_manager, fake_file_handle, laparams=LAParams())
page_interpreter = PDFPageInterpreter(resource_manager, converter)

with open('ARQUIVO.pdf', 'rb') as fh:

    # Run every page of the PDF through the interpreter.
    for page in PDFPage.get_pages(fh, caching=True, check_extractable=True):
        page_interpreter.process_page(page)

    # Text of the whole document as one string.
    text = fake_file_handle.getvalue()

converter.close()
fake_file_handle.close()

print(text)

Solution

No verified solution to this problem has been posted yet.
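As an unverified starting point, here is a minimal sketch of one way to pick a single line out of the `text` string that the script above produces. The helper `nth_line` and its `skip_blank` parameter are hypothetical names introduced for illustration, not part of pdfminer. It assumes that "line 5" means the fifth non-blank line of the whole extracted text; the line order depends on pdfminer's layout analysis and may not match what you see in a PDF viewer.

```python
from typing import Optional


def nth_line(text: str, n: int, skip_blank: bool = True) -> Optional[str]:
    """Return the n-th line (1-based) of text, or None if there are fewer lines.

    Hypothetical helper: blank lines are ignored by default, because
    extracted PDF text usually has empty lines between text boxes.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    # str.splitlines() also treats the form feed character as a line break.
    lines = text.splitlines()
    if skip_blank:
        lines = [line for line in lines if line.strip()]
    return lines[n - 1] if n <= len(lines) else None


# Stand-in for the extracted text; in the script above this would be `text`.
sample = "Title\n\nfirst\nsecond\nthird\nfourth\n"
print(nth_line(sample, 5))  # prints "fourth"
```

In the original script, calling `nth_line(text, 5)` after the `with` block would return the fifth non-blank line of the whole document. If the fifth line of each page is wanted instead, one option is to read and reset the buffer after every `process_page` call and apply the helper to each page's text separately.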
