python - Cannot open a PDF file using PyPDF2

Tags: python python-3.x pdf

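I am using Python 3.8.5. I am trying to write a short script to concatenate PDF files and, following this Stack Overflow question, I am trying to use PyPDF2. Unfortunately, I cannot seem to create a PyPDF2.PdfFileReader instance without it crashing.
My code looks like this: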

import pathlib
import PyPDF2

pdf_path = pathlib.Path('1.pdf')
with pdf_path.open('rb') as pdf_file:
    reader = PyPDF2.PdfFileReader(pdf_file, strict=False)

When I try to run it, I get the following traceback:

Traceback (most recent call last):
  File "C:\...\pdf\open_pdf.py", line 6, in <module>
    reader = PyPDF2.PdfFileReader(pdf_file, strict=False)
  File "C:\...\.virtualenvs\pdf-j0HnXL2B\lib\site-packages\PyPDF2\pdf.py", line 1084, in __init__
    self.read(stream)
  File "C:\...\.virtualenvs\pdf-j0HnXL2B\lib\site-packages\PyPDF2\pdf.py", line 1883, in read
    stream.seek(-11, 1)
OSError: [Errno 22] Invalid argument

To help reproduce the problem, I created this GitHub repo containing the code above and a sample PDF file.
What am I doing wrong?

Best answer

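It looks like your 1.pdf file fails validation; you can check it here: https://www.pdf-online.com/osa/validate.aspx
I tried another PDF file, version 1.7, and it worked, so the PDF version is not the issue. Your 1.pdf file is simply malformed.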
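As a rough local alternative to the online validator, the sketch below uses only the standard library to look for three things PDF readers generally rely on: a %PDF- header near the start, an %%EOF marker near the end, and a startxref offset that points at a cross-reference section. The function name find_pdf_problems and the 1024-byte search windows are assumptions made for illustration and are not part of PyPDF2. Passing these checks does not prove that a file is valid; failing them is a strong hint that it is not.

```python
import pathlib
import re


def find_pdf_problems(data: bytes) -> list:
    """Return a list of structural problems found in raw PDF bytes.

    Hypothetical helper for illustration; an empty list only means
    that these three basic checks passed.
    """
    problems = []

    # A PDF should announce itself with a "%PDF-" header near the start.
    if b"%PDF-" not in data[:1024]:
        problems.append("no %PDF- header in the first 1024 bytes")

    # The end-of-file marker and the startxref entry live near the end.
    tail = data[-1024:]
    if b"%%EOF" not in tail:
        problems.append("no %%EOF marker in the last 1024 bytes")

    offsets = re.findall(rb"startxref\s+(\d+)", tail)
    if not offsets:
        problems.append("no startxref entry in the last 1024 bytes")
    else:
        offset = int(offsets[-1])
        target = data[offset:offset + 4]
        # startxref should point at an "xref" table or, in PDF 1.5+,
        # at a cross-reference stream object, which starts with a digit.
        if not (target == b"xref" or target[:1].isdigit()):
            problems.append(
                f"startxref offset {offset} does not point at a "
                "cross-reference section"
            )
    return problems


if __name__ == "__main__":
    pdf_path = pathlib.Path("1.pdf")  # the file name used in the question
    if pdf_path.exists():
        for problem in find_pdf_problems(pdf_path.read_bytes()):
            print(problem)
```

If this prints a startxref problem for a file, that would be consistent with the traceback in the question, where PyPDF2 fails while seeking backwards in the stream during read.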

Source: a similar question on Stack Overflow, "python - Cannot open a PDF file using PyPDF2": https://stackoverflow.com/questions/64078614/
