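Regex that matches if word_B and word_C are within N words of each other and word_A is not within the preceding M words

I want to find two words that are close to each other, but only when another given word does not appear shortly before them (other words may sit in between). We can assume that if word_A appears, it appears before word_B, and that word_B always appears before word_C.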

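For example, with N=3 and M=2:

word_A word word_B word word_C || should not match

word_A word word word_B word || should not match

word_A word word word word word_B word word word_C || should match

word word_B word word_C || should match

So far I have come up with this, but it does not work: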


import regex

matches = regex.findall(r"\b(?<!word_A\W+(?:\w+\W+){0,2}?(?:word_B\W+(?:\w+\W+){0,3}?(word_C)", text, regex.IGNORECASE)

Best answer

You can do this with a pair of lookaheads: a positive lookahead for word_B followed by word_C within N words, and a negative lookahead for word_A followed by word_B within M words:

^(?=.*\bword_B\W+(?:\w+\W+){0,2}\b(word_C)\b)(?!.*\bword_A\W+(?:\w+\W+){0,1}\bword_B\b)
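To make the two lookaheads easier to read, here is the same pattern written with `re.VERBOSE` for N=3 and M=2. The name `PATTERN` and the sample string are illustrative assumptions; the comments reflect one reading of "within N words" as "at most N-1 intervening words", which is what the quantifiers `{0,2}` and `{0,1}` encode.

```python
import re

# Same pattern as above, spelled out for N=3, M=2.
# PATTERN is a hypothetical name used only for this sketch.
PATTERN = re.compile(r"""
    ^
    (?=                      # positive lookahead: somewhere in the string ...
        .*\bword_B\W+        #   word_B followed by a separator
        (?:\w+\W+){0,2}      #   at most N-1 = 2 intervening words
        \b(word_C)\b         #   then word_C (captured as group 1)
    )
    (?!                      # negative lookahead: nowhere in the string ...
        .*\bword_A\W+        #   word_A followed by a separator
        (?:\w+\W+){0,1}      #   at most M-1 = 1 intervening word
        \bword_B\b           #   then word_B
    )
""", re.VERBOSE | re.IGNORECASE)

m = PATTERN.search("word word_B word word_C")
print(m.group(1) if m else None)
```

Because the capture group sits inside a lookahead, the overall match is empty; only group 1 carries text.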

In Python:

import re

strings = ['word_A word word_B word word_C',
           'word_A word word word_B word',
           'word_A word word word word word_B word word word_C',
           'word word_B word word_C']
pattern = r'^(?=.*\bword_B\W+(?:\w+\W+){0,2}\b(word_C)\b)(?!.*\bword_A\W+(?:\w+\W+){0,1}\bword_B\b)'
# Use `s` rather than `str` so the built-in type is not shadowed.
for s in strings:
    # findall returns the captured group (word_C) for each match
    matches = re.findall(pattern, s, re.IGNORECASE)
    print(matches)

Output:

[]
[]
['word_C']
['word_C']
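The quantifiers in the pattern are hard-coded for N=3 and M=2. As a rough sketch, the pattern could be built from parameters instead. The helper `build_pattern` and its argument names are hypothetical, and it assumes `n` and `m` are at least 1 and mean "at most n-1 (or m-1) intervening words", as in the pattern above.

```python
import re

def build_pattern(word_a, word_b, word_c, n, m):
    """Hypothetical helper: build the two-lookahead pattern for given n and m.

    Assumes n >= 1 and m >= 1, read as "at most n-1 / m-1 intervening words".
    """
    # Escape the words in case they contain regex metacharacters.
    a, b, c = (re.escape(w) for w in (word_a, word_b, word_c))
    return re.compile(
        rf"^(?=.*\b{b}\W+(?:\w+\W+){{0,{n - 1}}}\b({c})\b)"
        rf"(?!.*\b{a}\W+(?:\w+\W+){{0,{m - 1}}}\b{b}\b)",
        re.IGNORECASE,
    )

pattern = build_pattern("word_A", "word_B", "word_C", n=3, m=2)
for s in ["word_A x word_B x word_C", "x word_B x word_C"]:
    match = pattern.search(s)
    print(s, "->", match.group(1) if match else None)
```

Note that both lookaheads are anchored at the start of the string and scan all of it, so this gives a yes/no answer for the string as a whole rather than for each individual word_B/word_C pair.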

This question and answer are based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/59205394/
