python - Extracting elements inside and between brackets

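I have the following string, and I want to extract the attribute values (xx="yy") as well as the content between the tags. Here is an example: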

[caption id="get this" align="and this" width="and this" caption="and this"]this too please[/caption]

I tried the code below, but I'm still a newbie with regular expressions.

re.sub(r'\[caption id="(.*)" align="(.*)" width="(.*)" caption="(.*)"\](.*)\[\/caption\]', "tokens: %1 %2 %3 %4 %5", self.content, re.IGNORECASE)

Thanks in advance!

Best answer

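It may not be working for you because .* is greedy. Try [^"]* instead; [^"] is the set of all characters except the quote character. Also, as you pointed out in the comments, the token syntax is \\n, not %n. Finally, note that the fourth positional argument of re.sub is count, not flags, so the flag has to be passed as flags=re.IGNORECASE. Try this:

re.sub(r'\[caption id="([^"]*)" align="([^"]*)" width="([^"]*)" caption="([^"]*)"\](.*)\[\/caption\]', "tokens: \\1 \\2 \\3 \\4 \\5", self.content, flags=re.IGNORECASE)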
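As a quick check, here is a minimal sketch that runs the same pattern against the sample string from the question. The variable names content, pattern, result and values are illustrative only; the question stores the text in self.content. The re.search call at the end is an alternative not mentioned above, for the case where only the captured values are needed rather than a substituted string.

```python
import re

# Sample string taken from the question.
content = ('[caption id="get this" align="and this" width="and this" '
           'caption="and this"]this too please[/caption]')

# [^"]* cannot run past the closing quote of each attribute value.
pattern = (r'\[caption id="([^"]*)" align="([^"]*)" width="([^"]*)" '
           r'caption="([^"]*)"\](.*)\[/caption\]')

# flags is passed by keyword; the fourth positional argument of re.sub is count.
result = re.sub(pattern, r"tokens: \1 \2 \3 \4 \5", content, flags=re.IGNORECASE)
print(result)  # tokens: get this and this and this and this this too please

# If only the values are needed, re.search exposes them as a tuple of groups.
match = re.search(pattern, content, flags=re.IGNORECASE)
values = match.groups() if match else ()
print(values)
```
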

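Does the content of the caption tag span multiple lines? If it does, .* will not capture the newlines. You need to use something like [^\x00]* instead. [^\x00] is the set of all characters except the null character.

re.sub(r'\[caption id="([^"]*)" align="([^"]*)" width="([^"]*)" caption="([^"]*)"\]([^\x00]*)\[\/caption\]', "tokens: \\1 \\2 \\3 \\4 \\5", self.content, flags=re.IGNORECASE)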

If your string can legitimately contain null characters, you will need to use the re.DOTALL flag instead.
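The re.DOTALL route can be sketched as follows. The multi-line sample string and the variable names are made up for illustration. With re.DOTALL, the dot also matches newline characters, so the plain (.*) group can cross line breaks.

```python
import re

# Hypothetical input whose caption body spans two lines.
content = ('[caption id="a" align="left" width="300" caption="c"]'
           'line one\nline two[/caption]')

pattern = (r'\[caption id="([^"]*)" align="([^"]*)" width="([^"]*)" '
           r'caption="([^"]*)"\](.*)\[/caption\]')

# Without DOTALL, .* stops at the newline and the pattern fails to match.
without_dotall = re.search(pattern, content, flags=re.IGNORECASE)

# Flags are combined with the bitwise OR operator.
with_dotall = re.search(pattern, content, flags=re.IGNORECASE | re.DOTALL)

print(without_dotall)               # None
print(repr(with_dotall.group(5)))   # 'line one\nline two'
```
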

Regarding python - extracting elements inside and between brackets, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/7284570/
