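I want to extract the headlines from the "Most Read" section of a news page. This is what I have so far, but I am getting every headline on the page. I only want the ones inside the Most Read section.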
`
import requests
from bs4 import BeautifulSoup
base_url = 'https://www.michigandaily.com/section/opinion'
r = requests.get(base_url)
soup = BeautifulSoup(r.text, "html5lib")
for story_heading in soup.find_all(class_="views-field views-field-title"):
    if story_heading.a:
        print(story_heading.a.text.replace("\n", " ").strip())
    else:
        print(story_heading.contents[0].strip())`
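Best answer
You need to restrict the search to the div container that holds only the most-read articles, and then look for the title fields inside that container instead of in the whole page.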
import requests
from bs4 import BeautifulSoup
base_url = 'https://www.michigandaily.com/section/opinion'
r = requests.get(base_url)
soup = BeautifulSoup(r.text, "html5lib")
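# Narrow the search to the first div whose class list includes 'view-id-most_read'
most_read_soup = soup.find_all('div', {'class': 'view-id-most_read'})[0]
# Only title fields inside that container are visited now
for story_heading in most_read_soup.find_all(class_="views-field views-field-title"):
    if story_heading.a:
        # Title is wrapped in a link: use the link text, with newlines flattened
        print(story_heading.a.text.replace("\n", " ").strip())
    else:
        print(story_heading.contents[0].strip())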
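The same scoping idea can be tried offline. The following is only a minimal sketch: `SAMPLE_HTML` is made-up markup that imitates the class names used above (the real page may be structured differently), `most_read_titles` is a hypothetical helper name, and the built-in `html.parser` is used instead of `html5lib` so that nothing beyond `bs4` needs to be installed. It also returns an empty list when the container is missing, rather than raising `IndexError` as `find_all(...)[0]` would.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup; the real page's structure may differ.
SAMPLE_HTML = """
<div class="view view-id-section_stories">
  <div class="views-field views-field-title"><a href="/a">Regular story</a></div>
</div>
<div class="view view-id-most_read">
  <div class="views-field views-field-title"><a href="/b">Most read
 one</a></div>
  <div class="views-field views-field-title">Most read two</div>
</div>
"""


def most_read_titles(html, container_class="view-id-most_read"):
    """Return the titles found inside the first div carrying container_class."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns None when nothing matches, so the check below is safe
    container = soup.find("div", class_=container_class)
    if container is None:
        return []
    titles = []
    # Search only within the container, not the whole document
    for field in container.find_all(class_="views-field-title"):
        # get_text() covers both linked and plain titles;
        # split/join collapses newlines and repeated spaces
        text = " ".join(field.get_text().split())
        if text:
            titles.append(text)
    return titles


print(most_read_titles(SAMPLE_HTML))
```

For the live site, the string passed in would be `r.text` from the `requests.get` call above. Using `get_text()` for both cases is a design choice that removes the `if story_heading.a` branch, since it handles linked and unlinked titles the same way.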
Regarding "python - Extracting the most-read titles with BS4", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36630285/