python - Does * have a side effect in Python regular expression matching?

I am learning Python's regular expressions, and the following works the way I expect:

>>> import re
>>> re.split(r'\s+|:', 'find   a str:s2')
['find', 'a', 'str', 's2']

But when I change + to *, the output looks strange to me:

>>> re.split(r'\s*|:', 'find  a str:s2')
['find', 'a', 'str:s2']

How is this pattern interpreted in Python?

Best answer

The "side effect" you are seeing is that re.split() only splits on matches that are longer than 0 characters.

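The \s*|: pattern matches either zero or more whitespace characters, or a :, whichever alternative succeeds first. But zero whitespace characters match everywhere. Only at the positions where more than zero whitespace characters are matched does a split take place.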

Because the \s* alternative matches at every position that is considered for splitting, the next alternative, :, is never tried.
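To make the ordering concrete, here is a minimal sketch that tries the pattern exactly at the position of the colon. The variable names are only illustrative. It uses Pattern.match() with a start position, which attempts a single match at that index.

```python
import re

text = 'find  a str:s2'
colon = text.index(':')  # position of the ':' in the sample string

# \s* is tried first and succeeds with an empty match,
# so the ':' alternative is never attempted at this position.
empty_match = re.compile(r'\s*|:').match(text, colon)
print(repr(empty_match.group()), empty_match.span())

# On its own, ':' does match at the same position.
colon_match = re.compile(r':').match(text, colon)
print(repr(colon_match.group()), colon_match.span())
```

The first match is the empty string, which is why no split happens at the colon; the second shows that the colon would match if it were tried.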

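As the re.split() documentation of the time put it: "Note that split will never split a string on an empty pattern match." Since Python 3.7, re.split() also splits on empty matches, so the outputs shown here apply to earlier versions.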

If you reverse the pattern, : is considered, because it is now the first alternative:

>>> re.split(r':|\s*', 'find  a str:s2')
['find', 'a', 'str', 's2']
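
The helper below is a rough sketch that emulates the skip-empty-matches behaviour described above, so both results can be reproduced without depending on how the installed re.split() treats empty matches. The name split_on_nonempty is hypothetical and not part of the re module, and the sketch ignores capture groups and maxsplit.

```python
import re

def split_on_nonempty(pattern, text):
    """Split text on non-empty matches only (hypothetical helper)."""
    regex = re.compile(pattern)
    pieces = []
    last = pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        if m.start() == m.end():
            # Empty match: no split here, retry one character further on.
            pos = m.end() + 1
            continue
        pieces.append(text[last:m.start()])
        last = pos = m.end()
    pieces.append(text[last:])
    return pieces

text = 'find  a str:s2'
print(split_on_nonempty(r'\s*|:', text))  # ['find', 'a', 'str:s2']
print(split_on_nonempty(r':|\s*', text))  # ['find', 'a', 'str', 's2']
```

With \s* listed first, the colon only ever yields an empty match and is skipped; with : listed first, the colon yields a one-character match and the string is split there.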

For more on python - Does * have a side effect in Python regular expression matching?, see the similar question on Stack Overflow: https://stackoverflow.com/questions/24389803/
