python - How to convert BeautifulSoup text into a list or iterable

Tags: python python-3.x beautifulsoup scrapy python-requests

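How do I convert BeautifulSoup text into a list or dictionary?

I want to turn the information I scraped with BeautifulSoup into an iterable list. For example, I have scraped a quotes website and got the text; now I want to put those quotes into a list so that I can loop over them.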

from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.great-quotes.com/quotes/category/Motivational")
data = r.text
soup = BeautifulSoup(data, 'html.parser')
# print(soup.prettify())

for quote in soup.find_all("span", class_="edit_body"):
    quotes = list(quotes)  # Raises NameError: name 'quotes' is not defined
    print(quotes)

# This is how I want my scraped quotes to look

new_quote = ['quote', 'quote', 'quote']  # I want it to be in a list.

Best answer

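soup.find_all() already returns a ResultSet, a list subclass containing every HTML tag that matches your criteria, so you can use its result just as you would a list. To get the quote text rather than the tag objects, read each tag's .text in a list comprehension: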

quote_list = [quote_tag.text for quote_tag in soup.find_all("span", class_="edit_body")]
print(quote_list)
# Output: ['"What lies behind us and what lies before us are tiny matters compared to what lies within us."', '"Life is like a mirror. Smile at it and it smiles back at you."', ...]
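To try this without a network request, here is a minimal sketch that runs the same find_all() pattern on an inline HTML string. SAMPLE_HTML, its placeholder quotes, and the extract_quotes helper are hypothetical names made up for illustration; only the span tag and the edit_body class come from the question. It also builds a dictionary keyed by position, since the question mentions one, though that keying scheme is just one possible choice.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<div>
  <span class="edit_body"> "First sample quote." </span>
  <span class="edit_body">"Second sample quote."</span>
  <span class="other">Not a quote</span>
</div>
"""


def extract_quotes(html):
    """Return the text of every <span class="edit_body"> as a list of strings."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("span", class_="edit_body")
    # get_text(strip=True) trims the whitespace around each quote.
    return [tag.get_text(strip=True) for tag in tags]


quote_list = extract_quotes(SAMPLE_HTML)

# One possible dictionary form: quotes numbered from 1 (an assumed scheme).
quote_dict = dict(enumerate(quote_list, start=1))

for quote in quote_list:
    print(quote)
```

Because the result is an ordinary list of strings, it can be indexed, sliced, sorted, or passed to any code that expects an iterable.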

This page on "python - How to convert BeautifulSoup text into a list or iterable" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/50429780/
