python - pytesseract and image.tif files

I need to use pytesseract to transcribe a multi-page image.tif to text. I have the following code:

> from PIL import Image
> import pytesseract
> pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
> print(pytesseract.image_to_string(Image.open('CAMARA.tif'), lang="spa"))

The problem is that only the first page is extracted. How can I extract all of them?

Best answer

I was able to solve the same problem by calling the convert() method as follows:

from PIL import Image
import pytesseract

image = Image.open(imagePath).convert("RGBA")  # imagePath: path to the .tif file
text = pytesseract.image_to_string(image)
print(text)
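
If convert() alone still returns only the first page, another option is to walk the frames of the TIFF explicitly and run OCR on each one. The following is a minimal sketch of that idea, not part of the original answer. It uses Pillow's ImageSequence.Iterator; the helper name ocr_all_pages and its ocr parameter are hypothetical names chosen for illustration.

```python
from PIL import Image, ImageSequence


def ocr_all_pages(path, ocr, **ocr_kwargs):
    """Run `ocr` on every frame (page) of a multi-page image.

    `ocr` is any callable that takes a PIL image and returns text,
    for example pytesseract.image_to_string. Extra keyword arguments
    (such as lang="spa") are passed through to it unchanged.
    Returns a list with one result per page, in page order.
    """
    texts = []
    with Image.open(path) as img:
        # ImageSequence.Iterator seeks through the frames one by one.
        for page in ImageSequence.Iterator(img):
            # convert() returns a standalone copy of the current frame.
            texts.append(ocr(page.convert("RGB"), **ocr_kwargs))
    return texts
```

With the file from the question, a call might look like ocr_all_pages('CAMARA.tif', pytesseract.image_to_string, lang="spa"), after which the returned list can be printed page by page or joined into one string.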

Regarding "python - pytesseract and image.tif files", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45292287/
