How do I convert BeautifulSoup text into a list or dictionary?
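I want to get an iterable list out of the information I scraped with BeautifulSoup. For example, I have scraped a quotes website and retrieved the text, and now I want to put those quotes into a list so that I can iterate over them.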
from bs4 import BeautifulSoup
import requests
r = requests.get("http://www.great-quotes.com/quotes/category/Motivational")
data = r.text
soup = BeautifulSoup(data, 'html.parser')
# print(soup.prettify())
for quote in soup.find_all("span", class_="edit_body"):
    quotes = list(quotes)  # This gives me an error: name 'quotes' is not defined
print(quotes)
# This is how I want my scraped quotes to look
new_quote = ['quote', 'quote', 'quote']  # I want it to be in a list.
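Best answer
soup.find_all() already returns an iterable (a ResultSet, which is a subclass of list) containing every HTML tag that matches your criteria. So you can use the output of this function just as you would a list: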
quote_list = [quote_tag.text for quote_tag in soup.find_all("span", class_="edit_body")]
print(quote_list)
# Output: ['"What lies behind us and what lies before us are tiny matters compared to what lies within us."', '"Life is like a mirror. Smile at it and it smiles back at you."', ...]
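Since the title also asks about a dictionary, here is a minimal offline sketch of both shapes. It assumes a small inline HTML string in place of the real page; the `author` class and the sample quotes are hypothetical and only stand in for whatever the actual markup contains. Only the `edit_body` class comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
# Only the "edit_body" class comes from the question; "author" is an assumption.
html = """
<div><span class="edit_body">"Quote one."</span><span class="author">Author A</span></div>
<div><span class="edit_body">"Quote two."</span><span class="author">Author B</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet, a list subclass, so it can be iterated directly.
quote_tags = soup.find_all("span", class_="edit_body")

# List of plain strings, one per matching tag.
quote_list = [tag.get_text(strip=True) for tag in quote_tags]

# Dictionary keyed by author, pairing each author tag with its quote by position.
author_tags = soup.find_all("span", class_="author")
quotes_by_author = {
    author.get_text(strip=True): quote
    for author, quote in zip(author_tags, quote_list)
}

# Iterate over the list as the question asks.
for quote in quote_list:
    print(quote)
```

Pairing by position with zip() only works if every quote has exactly one author tag in the same order; if the real page is less regular, it may be safer to walk each parent element and look up both spans inside it.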
Regarding "python - how to convert BeautifulSoup text into a list or iterable", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50429780/