python pdf2image "May not be a PDF file" error

On CentOS 8, I get an error when using Python to convert PDF pages to JPG files.

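from pdf2image import convert_from_path

# Render every page of the PDF at 500 DPI; returns a list of PIL images.
images = convert_from_path("test.pdf", 500)
for i, image in enumerate(images):
    # Save each page as page0.jpg, page1.jpg, ...
    image.save("page" + str(i) + ".jpg", "JPEG")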

Running it produces the error below. I can open the PDF file locally, but converting it to JPG does not work.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 479, in pdfinfo_from_path
    raise ValueError
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pdf_conv.py", line 7, in <module>
    images = convert_from_path(pdf_path,500)
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 98, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 489, in pdfinfo_from_path
    "Unable to get page count.\n%s" % err.decode("utf8", "ignore")
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

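Best answer

PDF != PDF: the format comes in different versions. Maybe your Python pdf2image does not like or does not know the kind of PDF you are giving it. Use Acrobat Reader or a similar tool to check what you are trying to convert, and see whether pdf2image supports it.
See "Which ISO standards does pdf2image support" (in short: pdf2image supports all PDF standards that poppler supports.)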
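
Before comparing PDF versions, it can help to confirm that the file on the server really is a PDF. poppler typically prints "May not be a PDF file" when it cannot find the `%PDF-` header near the start of the file, and trailer/xref errors often point to a truncated or incomplete file. The following is only a minimal sketch, assuming the file fits in memory; `inspect_pdf` is a hypothetical helper, not part of pdf2image, and the 1024-byte windows are an assumption based on common reader behaviour.

```python
import re
import sys
from pathlib import Path


def inspect_pdf(path):
    """Report the PDF version from the header and whether an %%EOF marker exists.

    Hypothetical helper for illustration; it does not validate the whole file.
    """
    data = Path(path).read_bytes()
    # Look for the "%PDF-x.y" header within the first 1024 bytes (assumed window).
    header = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return {
        "version": header.group(1).decode("ascii") if header else None,
        # A complete PDF normally ends with "%%EOF" close to the end of the file.
        "has_eof_marker": b"%%EOF" in data[-1024:],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_pdf.py test.pdf
    print(inspect_pdf(sys.argv[1]))
```

If `version` comes back as `None`, the file is probably not a PDF at all (for example an HTML error page saved with a .pdf extension, or a file damaged during upload), and no poppler version will convert it. If a version is reported and the marker is present, the version mismatch suggested above becomes the more likely explanation.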

Regarding the python pdf2image "May not be a PDF file" error, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/71814599/
