python - 如何正则表达式直到最后一次出现？

我正在使用Python，我需要正则表达式来获取网页的联系人链接。所以，我做了<a (.*?)>(.*?)Contacts(.*?)</a>结果是:

href="/ru/o-nas.html"  id="menu263" title="About">About</a></li><li><a href="/ru/photogallery.html" id="menu645" title="Photo">Photo</a></li><li  class="last"><a href="/ru/kontakt.html" class="last" id="menu583" title="">Contacts

，但我需要最后一个 <a ...喜欢

href="/ru/kontakt.html" class="last" id="menu583" title="">Contacts

我应该使用什么正则表达式模式？

Python 代码:

match = re.findall('<a (.*?)>(.*?)Contacts(.*?)</a>', body)
if match:
    for m in match:
        print ''.join(m)

最佳答案

由于您正在解析 HTML，我建议使用 BeautifulSoup

# sample html from question
html = '<li><a href="/ru/o-nas.html"  id="menu263" title="About">About</a></li><li><a href="/ru/photogallery.html" id="menu645" title="Photo">Photo</a></li><li  class="last"><a href="/ru/kontakt.html" class="last" id="menu583" title="">Contacts</a></li>'

from bs4 import BeautifulSoup
doc = BeautifulSoup(html)
aTag = doc.find('a', id='menu583') # id for Contacts link
print(aTag['href'])
# '/ru/kontakt.html'

关于python - 如何正则表达式直到最后一次出现？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/41101429/

上一篇：python - 对于数组中的每个点，在第二个数组中找到最接近它的点并输出该索引

下一篇：python - 如何将 re.split 中的分隔符保留在返回列表的同一索引上

相关文章：

python - 在 python 3 中使用全局变量

javascript - 查找字符串中的多个子字符串

c# - 验证另一个字符串之间的字符串？

regex - Delphi - 用户指定的字符串操作

python - 在独立的 Python 脚本中向 Google 进行身份验证的正确机制是什么？

python - Python 和 Scheme 在变量作用域上有什么区别？

ruby - 删除所有非字母数字字符，但保留变音符号(重音符号)和 -(破折号)

ruby 表达式

python - 奥杜 11 : How to get current record in javascript

python - 将浮点值舍入到区间限制/网格