python - How to find specific text and print the next 2 words after it

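My code is shown below.

I currently have an if statement that looks for a specific word, in this case "INGREDIENTS".

Next, instead of print("True"), I need to print the 2 words/strings that follow "INGREDIENTS". That word appears only once in the image.

As an example, when I run the .py file with print(text) included in the script, this is my output: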

Ground Almonds

INGREDIENTS: Ground Almonds(100%).

1kg

I only need to recode this part:

if 'INGREDIENTS' in text:
 print("True")
else:
 print("False")

so that the output looks like this:

INGREDIENTS: Ground Almonds

because the next two words/strings are Ground and Almonds.

Python code

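from PIL import Image
import pytesseract

# Point pytesseract at the local Tesseract executable.
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\gzi\AppData\Roaming\Python\Python37\site-packages\tesseract.exe'

# Load the image and run OCR to extract its text.
img = Image.open('C:/Users/gzi/Desktop/work/lux.jpg')
text = pytesseract.image_to_string(img, lang='eng')

# Currently this only reports whether the keyword is present.
if 'INGREDIENTS' in text:
    print("True")
else:
    print("False")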

Best answer

If you don't care about the percentage and want to avoid regular expressions:

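string = 'INGREDIENTS: Ground Almonds(100%).'

# Split on whitespace; "Almonds(100%)." stays a single token.
tokens = string.split()
for n, token in enumerate(tokens):
    if 'INGREDIENTS' in token:
        # Print the keyword token plus the two tokens that follow it.
        print(' '.join(tokens[n:n+3]))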

Output:

INGREDIENTS: Ground Almonds(100%).
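
If the percentage should be dropped so that the output matches the "INGREDIENTS: Ground Almonds" form requested in the question, one possible variant uses a regular expression instead. This is only a sketch: the helper name next_words and its count parameter are made up for illustration, and it assumes the words after the keyword consist of ASCII letters only, which may not hold for every OCR result.

```python
import re


def next_words(text, keyword, count=2):
    """Return the keyword followed by the next `count` words, or None.

    Hypothetical helper: assumes each following word is made of ASCII
    letters only, so trailing text such as "(100%)." is left out.
    """
    # Keyword, an optional colon, then exactly `count` whitespace-separated words.
    pattern = re.escape(keyword) + r":?((?:\s+[A-Za-z]+){" + str(count) + r"})"
    match = re.search(pattern, text)
    if match is None:
        return None
    words = match.group(1).split()
    return f"{keyword}: {' '.join(words)}"


# Sample text taken from the OCR output shown in the question.
text = "Ground Almonds\n\nINGREDIENTS: Ground Almonds(100%).\n\n1kg"
print(next_words(text, "INGREDIENTS"))  # INGREDIENTS: Ground Almonds
```

The function returns None when the keyword is missing or is followed by fewer than `count` alphabetic words, so the caller can still fall back to printing "False" as in the original if/else.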

This question, "python - How to find specific text and print the next 2 words after it", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/57563221/
