python - find_all() only returns the first item of the list

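I'm having some trouble with BeautifulSoup's find_all() method. I'm trying to get the text of all the p tags, but it only returns the first element. In fact, the list contains only one item. Why does find_all() return only one item?

Here is part of the HTML I want to extract the text from: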

<div class="post-content">
 <p>If you’re not familiar with Deep Image, it’s an amazing tool which allows you to increase the size of an image and upgrade its quality at the same time.</p>

 <p>You can find it, and use for free <a href="https://deep-image.ai/">HERE</a></p>

 <p><em>The goal of this blog post is to focus on the main changes and showcase the results of DI 2.0 algorithms.</em></p>

 <p>As we all know a picture is worth a thousand words. So we will let the enhanced pictures speak for themselves. All pictures you can see below were processed using Deep Image algorithms.</p>

 <h2 id="what-has-changed">What has changed</h2>

 <p>Here are all the main improvements added to Deep Image 2.0:</p>
</div>

Here is my code:

from bs4 import BeautifulSoup
import requests

source = requests.get('https://teonite.com/blog/deep-image-2-showcasing-results/').text
soup = BeautifulSoup(source, 'html.parser')

for article in soup.find_all(class_='post-content'):
    print(article.p.text)

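Thanks for your help!

Best answer

You are searching for all tags with the class post-content. There is only one such element, so find_all returns a list with a single entry. Your for loop therefore runs only once, and in that one iteration article.p yields just the first p tag, so only its text is printed.

Try this: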
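To make the behavior concrete, here is a minimal sketch using a shortened, made-up stand-in for the blog markup (the paragraph texts are placeholders, not the real page content). It shows that find_all(class_='post-content') matches a single div and that attribute access such as div.p is shorthand for div.find('p'), which returns only the first match:

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for the blog post markup
html = """
<div class="post-content">
 <p>First paragraph.</p>
 <p>Second paragraph.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Only one element carries the class, so the list has one entry
matches = soup.find_all(class_="post-content")
print(len(matches))

div = matches[0]
first_p = div.p                  # same as div.find("p"): first <p> only
print(first_p.text)

# Searching inside the div finds every paragraph
print(len(div.find_all("p")))
```

So the single-item list is expected; the paragraphs have to be searched for inside that one item.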

from bs4 import BeautifulSoup

html = '''
<div class="post-content">
 <p>If you’re not familiar with Deep Image, it’s an amazing tool which allows you to increase the size of an image and upgrade its quality at the same time.</p>

 <p>You can find it, and use for free <a href="https://deep-image.ai/">HERE</a></p>

 <p><em>The goal of this blog post is to focus on the main changes and showcase the results of DI 2.0 algorithms.</em></p>

 <p>As we all know a picture is worth a thousand words. So we will let the enhanced pictures speak for themselves. All pictures you can see below were processed using Deep Image algorithms.</p>

 <h2 id="what-has-changed">What has changed</h2>

 <p>Here are all the main improvements added to Deep Image 2.0:</p>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
# Find the single container element first ...
div = soup.find(class_='post-content')
# ... then iterate over every p tag inside it
for p in div.find_all('p'):
    print(p.text)

This gives you the text of every p tag, because we now first find the element with the post-content class and then search for all p tags inside that element.
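As an alternative that is not part of the original answer, the same two-step lookup can probably be written as one CSS selector with select(), which recent BeautifulSoup versions support. The sketch below assumes a shortened copy of the markup plus one extra p outside the div (added here only to show that it is excluded):

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup; the trailing <p> is a made-up addition
html = """
<div class="post-content">
 <p>You can find it, and use for free <a href="https://deep-image.ai/">HERE</a></p>
 <h2 id="what-has-changed">What has changed</h2>
 <p>Here are all the main improvements added to Deep Image 2.0:</p>
</div>
<p>Outside the post.</p>
"""

soup = BeautifulSoup(html, "html.parser")

# ".post-content p" selects every <p> that is a descendant of the div
texts = [p.get_text() for p in soup.select(".post-content p")]

for text in texts:
    print(text)
```

get_text() also includes the text of nested tags such as the a link, and the h2 and the outside p are skipped because they do not match the selector.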

Regarding "python - find_all() only returns the first item of the list", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57310276/
