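I want to replace the text in a string that matches an re pattern, which can be done with re.sub(). If I pass a function as the repl argument in the call, it works as expected, as shown below: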
from __future__ import print_function
import re
pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
my_str = "Here's some <first>sample stuff</first> in the " \
         "<second>middle</second> of some other text."
def replace(m):
    return ''.join(map(lambda v: v if v else '',
                       map(m.group, ('text', 'content'))))
cleaned = re.sub(pattern, replace, my_str)
print('cleaned: {!r}'.format(cleaned))
Output:
cleaned: "Here's some sample stuff in the middle of some other text."
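However, according to the documentation, I should be able to get the same result by passing a replacement string that references the named groups. My attempt at this did not work, because sometimes a group does not participate in the match and the value returned for it is None (rather than the empty string '').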
cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
print('cleaned: {!r}'.format(cleaned))
Output:
Traceback (most recent call last):
  File "test_resub.py", line 21, in <module>
    cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
  File "C:\Python\lib\re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "C:\Python\lib\re.py", line 278, in filter
    return sre_parse.expand_template(template, match)
  File "C:\Python\lib\sre_parse.py", line 802, in expand_template
    raise error, "unmatched group"
sre_constants.error: unmatched group
What am I doing wrong or misunderstanding?
Best answer
import re

def repl(matchobj):
    # Group 3 ('content') is None when the match ended at '$' with no tag.
    if matchobj.group(3):
        return matchobj.group(1) + matchobj.group(3)
    else:
        return matchobj.group(1)

my_str = "Here's some <first>sample stuff</first> in the " \
         "<second>middle</second> of some other text."
pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
print(re.sub(pattern, repl, my_str))
You can pass a function as the repl argument of re.sub.
Edit:
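A more compact variant of the same idea, offered as a sketch rather than the only way to do it: Match.groupdict() accepts a default value for groups that did not participate in the match, so the replacement function can avoid the explicit None check. The function name join_groups is made up for this illustration.

```python
import re

pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
my_str = ("Here's some <first>sample stuff</first> in the "
          "<second>middle</second> of some other text.")

def join_groups(m):
    # Hypothetical helper: groupdict(default='') yields '' instead of None
    # for any named group that did not take part in this match.
    groups = m.groupdict(default='')
    return groups['text'] + groups['content']

cleaned = re.sub(pattern, join_groups, my_str)
print('cleaned: {!r}'.format(cleaned))
```

Note that the re documentation states that since Python 3.5 unmatched groups in a replacement template are replaced with an empty string, so on those versions the original r'\g<text>\g<content>' template should work without a function.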
cleaned = re.sub(pattern, r'\g<text>\g<content>', my_str)
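This will not work, because when the last part of the string is matched, i.e. " of some other text.", \g<text> is defined but \g<content> is not, since there is no content. You still ask re.sub to substitute it, so it raises an error. If you use the string "Here's some <first>sample stuff</first> in the <second>middle</second>", then your print re.sub(pattern, r"\g<text>\g<content>", my_str) will work, because \g<content> is always defined there.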
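To see which match leaves the content group undefined, one option is to list the groups of each match with re.finditer. This is only a diagnostic sketch; the variable name rows is an assumption made for the example.

```python
import re

pattern = r'(?P<text>.*?)(?:<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>|$)'
my_str = ("Here's some <first>sample stuff</first> in the "
          "<second>middle</second> of some other text.")

# Collect (text, content) for every match; content is None whenever the
# pattern matched through the '$' alternative instead of a tag pair.
rows = [(m.group('text'), m.group('content'))
        for m in re.finditer(pattern, my_str)]

for text, content in rows:
    print('text={!r} content={!r}'.format(text, content))
```

The third match carries the trailing text with content set to None, which is the match the template form could not expand.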
Regarding python - replacing named capture groups with re.sub, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27628601/