Python Regex Sub - Use Match as Dict Key in Substitution

I'm translating a program from Perl to Python (3.3). I'm fairly new to Python. In Perl, I can do crafty regex substitutions, such as:

$string =~ s/<(\w+)>/$params->{$1}/g;

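This searches $string and, for each group of word characters enclosed in <>, substitutes the value from the $params hashref, using the regex match as the hash key.

What is the best (Pythonic) way to replicate this behavior succinctly? I've come up with something along these lines: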

string = re.sub(r'<(\w+)>', (what here?), string)

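It would be nice if I could pass a function that maps regex matches to a dict. Is that possible?

Thanks for the help.

Best answer

You can pass a callable to re.sub to tell it what to do with the match object.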

s = re.sub(r'<(\w+)>', lambda m: replacement_dict.get(m.group(1)), s)

Using dict.get lets you supply a fallback for when the word is not in the replacement dict, i.e.

lambda m: replacement_dict.get(m.group(1), m.group())
# fall back to leaving the matched text unchanged if we don't have a replacement
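As a minimal, self-contained sketch of the callable approach: the dictionary contents, the template string, and the helper name `lookup` below are made up for illustration and do not come from the question. Group 1 is used as the key, and the whole match is the fallback, so unknown placeholders are left untouched while known ones are replaced brackets and all, as in the Perl version.

```python
import re

# Hypothetical data for illustration only.
params = {"name": "World", "greeting": "Hello"}
template = "<greeting>, <name>! <unknown> stays as it is."

def lookup(match):
    # group(1) is the word inside the brackets; group(0) is the whole "<word>".
    # Unknown keys fall back to the original matched text.
    return params.get(match.group(1), match.group(0))

result = re.sub(r"<(\w+)>", lookup, template)
print(result)  # Hello, World! <unknown> stays as it is.
```

A named function is used here instead of a lambda only to leave room for the comments; either form works as the second argument to re.sub.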

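I'll note that when using re.sub (and its relatives, e.g. re.split), if your pattern specifies text that must exist around the part you want to replace, it is often cleaner to use lookaround expressions so that the text surrounding your match is not substituted out. So in this case I would write your regex like this: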

r'(?<=<)(\w+)(?=>)'

Otherwise you have to splice the brackets out and back in inside your lambda. To make clear what I mean, here is an example:

s = "<sometag>this is stuff<othertag>this is other stuff<closetag>"

d = {'othertag': 'blah'}

#this doesn't work because `group` returns the whole match, including non-groups
re.sub(r'<(\w+)>', lambda m: d.get(m.group(), m.group()), s)
Out[23]: '<sometag>this is stuff<othertag>this is other stuff<closetag>'

#this output isn't exactly ideal...
re.sub(r'<(\w+)>', lambda m: d.get(m.group(1), m.group(1)), s)
Out[24]: 'sometagthis is stuffblahthis is other stuffclosetag'

#this works, but is ugly and hard to maintain
re.sub(r'<(\w+)>', lambda m: '<{}>'.format(d.get(m.group(1), m.group(1))), s)
Out[26]: '<sometag>this is stuff<blah>this is other stuff<closetag>'

#lookbehind/lookahead makes this nicer.
re.sub(r'(?<=<)(\w+)(?=>)', lambda m: d.get(m.group(), m.group()), s)
Out[27]: '<sometag>this is stuff<blah>this is other stuff<closetag>'
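The lookaround variant can also be wrapped in a small reusable function. This is only a sketch: the names `TAG_NAME` and `rename_tags` are hypothetical, and the capture group is dropped because the lookarounds already keep the brackets out of the match.

```python
import re

# Lookbehind/lookahead: the brackets must be present but are not part of the match.
TAG_NAME = re.compile(r"(?<=<)\w+(?=>)")

def rename_tags(text, mapping):
    """Replace tag names found in `mapping`; leave all other names unchanged."""
    return TAG_NAME.sub(lambda m: mapping.get(m.group(), m.group()), text)

s = "<sometag>this is stuff<othertag>this is other stuff<closetag>"
d = {"othertag": "blah"}
print(rename_tags(s, d))
# <sometag>this is stuff<blah>this is other stuff<closetag>
```

Compiling the pattern once is a convenience when the same substitution runs over many strings; it does not change the result.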

Regarding "Python Regex Sub - Use Match as Dict Key in Substitution", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22545114/
