python - BeautifulSoup: find an HTML element whose attribute contains a space

How can I use BeautifulSoup to find an HTML element whose attribute value contains a space?

<h1 class='td p1'>
    title that i want
</h1>
<h1 class='td p2'>
    title that i don't want
</h1>
<h1 class='p1'>
    title that i don't want
</h1>

I would like to know how to use soup.find to find "title that i want".
The problem is that BeautifulSoup treats the attrs of the "title that i want" element like this: {'class': ['td', 'p1']}

and not like this: {'class': ['td p1']}

Best answer

Note: the two approaches below differ, but what they have in common is that they select the classes explicitly.

find()

soup.find('h1', attrs={'class':'td p1'})
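
As a minimal sketch of how the string form behaves (the sample markup and variable names here are made up for illustration, and the built-in html.parser is assumed): a class string that contains a space is compared against the whole class value, so the order of the classes matters.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the question.
html = """
<h1 class='td p1'>title that i want</h1>
<h1 class='td p2'>title that i don't want</h1>
<h1 class='p1'>title that i don't want</h1>
"""
soup = BeautifulSoup(html, "html.parser")

# A class string with a space is matched against the full class value.
exact = soup.find("h1", attrs={"class": "td p1"})
print(exact.get_text(strip=True))  # title that i want

# The same classes written in a different order do not match.
swapped = soup.find("h1", attrs={"class": "p1 td"})
print(swapped)  # None
```

If the class order in the real page is not guaranteed, the CSS selector form is likely the safer choice.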

select_one()

soup.select_one('h1.td.p1')
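
A small sketch of what the selector h1.td.p1 accepts, again with made-up markup: it requires both classes to be present on the element, but it does not care about their order or about additional classes. This assumes the soupsieve package that bs4 uses for CSS selectors is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <h1> has both classes, reordered, plus one more.
html = """
<h1 class='p1'>only p1</h1>
<h1 class='p1 td extra'>both classes, reordered</h1>
"""
soup = BeautifulSoup(html, "html.parser")

# 'h1.td.p1' means: an <h1> that carries class "td" AND class "p1".
match = soup.select_one("h1.td.p1")
print(match.get_text(strip=True))  # both classes, reordered

# The element with only "p1" is not selected.
all_matches = soup.select("h1.td.p1")
print(len(all_matches))  # 1
```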

Example

from bs4 import BeautifulSoup
data="""
<h1 class='td p1'>
    title that i want
</h1>
<h1 class='td p2'>
    title that i don't want
</h1>
<h1 class='p1'>
    title that i don't want
</h1>
"""
soup = BeautifulSoup(data, "html.parser")

# Select the first <h1> that has both the "td" and "p1" classes.
title = soup.select_one('h1.td.p1')

print(title)

Output

<h1 class="td p1">
    title that i want
</h1>

Regarding "python - BeautifulSoup: find an HTML element whose attribute contains a space", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/70331559/
