Textract: failed with exit code 127 // Windows 10 // pdftotext

Problem description

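My program reads and converts PDF files and enters the results into a Google Sheet. When I run it after deploying it with PyInstaller, I get the error shown below, and I can't figure out where the problem is: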

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py",line 82,in run
    pipe = subprocess.Popen(
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py",line 854,in __init__
    self._execute_child(args,executable,preexec_fn,close_fds,
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py",line 1307,in _execute_child
    hp,ht,pid,tid = _winapi.CreateProcess(executable,args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception,another exception occurred:

Traceback (most recent call last):
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\tkinter\__init__.py",line 1883,in __call__
    return self.func(*args)
  File "EinkaufRGWindows.py",line 40,in InkoopRekeningen
    text = textract.process(str(importfolder) + str(i))
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\__init__.py",line 77,in process
    return parser.process(filename,encoding,**kwargs)
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py",line 46,in process
    byte_string = self.extract(filename,**kwargs)
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\pdf_parser.py",line 28,in extract
    raise ex
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\pdf_parser.py",line 20,in extract
    return self.extract_pdftotext(filename,line 43,in extract_pdftotext
    stdout,_ = self.run(args)
  File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py",line 90,in run
    raise exceptions.ShellError(
textract.exceptions.ShellError: The command `pdftotext //Mac/Home/Desktop/Wickey Einkauf Test/Rekeningen/Lekkerkerker_ - 20803471.pdf -` failed with exit code 127
------------- stdout -------------
------------- stderr -------------


Solution

You appear to be getting a FileNotFoundError. Looking at the error, the command being run is:

pdftotext //Mac/Home/Desktop/Wickey Einkauf Test/Rekeningen/Lekkerkerker_ - 20803471.pdf -

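There are a couple of things I would look at here. First, there is an extra slash at the start of the file path, which looks wrong. Second, the file path contains spaces but is not wrapped in quotes. That second point means pdftotext would read the path as several separate command arguments rather than one. You can fix this by formatting the subprocess call so that the file path is wrapped in quotes, like this: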

pdftotext "example file path.pdf" -
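From Python, one way to get the same effect as the quotes is to pass the command as a list, so that the whole path is handed over as a single argument. The following is a minimal sketch using only the standard library. `build_pdftotext_command` and `pdf_to_text` are hypothetical helper names, not part of textract, and `pdf_to_text` assumes the `pdftotext` executable is available on PATH.

```python
import subprocess


def build_pdftotext_command(pdf_path):
    """Return the argument list for `pdftotext <file> -` (text goes to stdout)."""
    # Each list element is passed as exactly one argument, so spaces in
    # pdf_path cannot split it into several arguments.
    return ["pdftotext", str(pdf_path), "-"]


def pdf_to_text(pdf_path):
    """Run pdftotext and return its stdout as bytes.

    Assumes the pdftotext executable can be found on PATH.
    """
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# list2cmdline shows how subprocess would quote the list on Windows.
command = build_pdftotext_command("example file path.pdf")
print(subprocess.list2cmdline(command))
```

The printed command line has the path wrapped in quotes, which matches the quoted form shown above.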

You need to install pdftotext with pip. Installing it requires Microsoft Visual C++ 14 or later.
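The `FileNotFoundError: [WinError 2]` in the first traceback is raised while `subprocess.Popen` tries to start `pdftotext`, which usually means the program itself could not be found. Before calling textract, it may help to check whether the executable is visible on PATH on the machine where the deployed program runs. This is a rough sketch with the standard library; `find_executable` and `require_executable` are hypothetical helper names introduced for illustration.

```python
import shutil


def find_executable(name="pdftotext"):
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


def require_executable(name="pdftotext"):
    """Return the path of `name`, or raise a clear error if it is missing."""
    path = find_executable(name)
    if path is None:
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install it or add its folder to PATH"
        )
    return path


if __name__ == "__main__":
    # Prints the location of pdftotext, or None if it cannot be found.
    print(find_executable("pdftotext"))
```

Calling `require_executable()` at program start turns a confusing exit code 127 into an explicit message about the missing program.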
