Suppose we have HTML like the following:
<span title="Sports Football">Football</span>
<span title="Sports Badminton">Tennis</span>
<span title="Sports Ski Jump">Ski Jump</span>
If the title attribute contains Sports, I want to extract the parameter, so that in the end we have a variable sports:
sports = ['Football', 'Badminton', 'Ski Jump']
This is what I used:
sports = soup.find_all('span', {'title': 'Sports'})
I get nothing back (an empty list).
Best answer
You can use re.compile with BeautifulSoup to find all span tags whose title attribute starts with "Sports":
content = """
<span title="Sports Football">Football</span>
<span title="Sports Badminton">Tennis</span>
<span title="Sports Ski Jump">Ski Jump</span>
"""
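import re
from bs4 import BeautifulSoup
d = BeautifulSoup(content, 'html.parser')
# Raw string so that \s reaches the regex engine unchanged; ^ anchors the match at the start of the title
results = [i.text for i in d.find_all('span', {'title': re.compile(r'^Sports\s')})]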
Output:
['Football', 'Tennis', 'Ski Jump']
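Note that the output above is the text of each span, so the second entry is 'Tennis', while the list asked for in the question contains 'Badminton', which only appears in the title attribute. The sketch below is one possible way to get that list, assuming the name is always whatever follows the "Sports " prefix in title. It also shows why the original attempt came back empty: a plain string is compared against the whole attribute value. The extra "News Weather" span and the variable names are hypothetical, added only to show that non-matching tags are skipped, and the select() call assumes a bs4 version with CSS selector support (4.7 or later with soupsieve installed).

```python
import re
from bs4 import BeautifulSoup

html = """
<span title="Sports Football">Football</span>
<span title="Sports Badminton">Tennis</span>
<span title="Sports Ski Jump">Ski Jump</span>
<span title="News Weather">Weather</span>
"""  # the last span is a hypothetical non-matching example

doc = BeautifulSoup(html, "html.parser")

# A plain string must equal the entire title value, so nothing matches here.
exact = doc.find_all("span", {"title": "Sports"})

# A compiled pattern is searched within the value; ^ pins it to the start.
spans = doc.find_all("span", title=re.compile(r"^Sports\s"))

# Element text, as in the answer above.
texts = [s.get_text() for s in spans]

# Names taken from the title attribute, dropping the leading "Sports " prefix.
sports = [s["title"].split(" ", 1)[1] for s in spans]

# Alternative without re: CSS attribute selector for "title starts with".
css_spans = doc.select('span[title^="Sports "]')

print(exact)   # []
print(texts)   # ['Football', 'Tennis', 'Ski Jump']
print(sports)  # ['Football', 'Badminton', 'Ski Jump']
```

Whether text or title is the right source depends on the real page; if the two always agree there, either list works.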
Source: a similar question on Stack Overflow, "python - Beautiful Soup - get parameter attribute containing a string": https://stackoverflow.com/questions/53441208/