Python: extracting the text after </span> and before <br/>

Tags: python html beautifulsoup

This is the HTML I want to process:

<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>

This is my Python code:

from bs4 import BeautifulSoup

# html holds the markup shown above
record = []
soup = BeautifulSoup(html, "html.parser")
spans = soup.find_all('span')
for span in spans:
    record.append(span.text)

What I end up with is:

Countries: Language:

The result is missing the important parts, "USA" and "English". How can I get that text?

Best answer

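The text after each </span> is not inside the span. It is a separate text node that follows it, which is why span.text does not include it. Use .next_sibling to reach it: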

soup.find("span", text="Countries:").next_sibling
soup.find("span", text="Language:").next_sibling
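For reference, here is a minimal sketch that applies the same idea to every span instead of looking each label up by hand. It assumes the two-line HTML from the question, the built-in "html.parser" backend, and a dictionary named record keyed by the label without its trailing colon; none of those choices come from the original answer. It also uses string= rather than text=, which is the spelling newer Beautiful Soup releases prefer.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup copied from the question.
html = """
<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>
"""

soup = BeautifulSoup(html, "html.parser")

# Single lookup by label; next_sibling is the text node right after </span>.
country = soup.find("span", string="Countries:").next_sibling.strip()

# Generic version: map every label to the text node that follows it.
record = {}
for span in soup.find_all("span", class_="pl"):
    value = span.next_sibling
    # next_sibling can be None or another tag, so keep text nodes only.
    if isinstance(value, NavigableString):
        label = span.get_text(strip=True).rstrip(":")
        record[label] = value.strip()

print(country)
print(record)
```

Note that .next_sibling returns the raw text node including its surrounding whitespace, so calling .strip() on it is usually what you want.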

Regarding "Python: extracting the text after </span> and before <br/>", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36122074/
