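This is very similar to a related scenario, "Getting from Clustered Nodes", which I have been comparing my code against.
For some reason, though, my for loop does not iterate and get the text from the other elements.
It only gets the text from the first element of the node.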
from requests import get
from bs4 import BeautifulSoup

url = 'https://shopee.com.my/'
l = []
headers = {'User-Agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)'}
response = get(url, headers=headers)
html_soup = BeautifulSoup(response.text, 'html.parser')

def findDiv():
    try:
        for container in html_soup.find_all('div', {'class': 'section-trending-search-list'}):
            topic = container.select_one(
                'div._1waRmo')
            if topic:
                print(1)
                d = {
                    'Titles': topic.text.replace("\n", "")}
                print(2)
                l.append(d)
        return d
    except:
        d = None

findDiv()
print(l)
Best answer
from requests import get
from bs4 import BeautifulSoup

url = 'https://shopee.com.my/'
l = []
headers = {'User-Agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)'}
response = get(url, headers=headers)
html_soup = BeautifulSoup(response.text, 'html.parser')

def findDiv():
    try:
        # Loop over each item wrapper ('_25qBG5') instead of the single outer
        # list container, so select_one() is called once per trending item.
        for container in html_soup.find_all('div', {'class': '_25qBG5'}):
            topic = container.select_one('div._1waRmo')
            if topic:
                d = {'Titles': topic.text.replace("\n", "")}
                l.append(d)
        return d
    except:
        d = None

findDiv()
print(l)
Output:
[{'Titles': 'school backpack'}, {'Titles': 'oppo case'}, {'Titles': 'baby chair'}, {'Titles': 'car holder'}, {'Titles': 'sling beg'}]
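The difference between the two versions can be reproduced offline. The sketch below is only an illustration: `SAMPLE_HTML` is hypothetical markup that assumes the page nests one `_1waRmo` title inside each `_25qBG5` wrapper, all inside a single `section-trending-search-list` container, and `titles_per_container` is a helper name made up for this example. Because `select_one()` returns at most one match per container, looping over the single outer container yields one title, while looping over the item wrappers yields one title per item.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the assumed structure: one outer list
# container holding several item wrappers, each with a title div.
SAMPLE_HTML = """
<div class="section-trending-search-list">
  <div class="_25qBG5"><div class="_1waRmo">school backpack</div></div>
  <div class="_25qBG5"><div class="_1waRmo">oppo case</div></div>
  <div class="_25qBG5"><div class="_1waRmo">baby chair</div></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")


def titles_per_container(class_name):
    """Collect at most one title from each div that has the given class."""
    titles = []
    for container in soup.find_all("div", {"class": class_name}):
        # select_one() returns only the first matching descendant (or None).
        topic = container.select_one("div._1waRmo")
        if topic:
            titles.append(topic.get_text(strip=True))
    return titles


# One outer container, so only the first title is collected.
outer_titles = titles_per_container("section-trending-search-list")
# One wrapper per item, so every title is collected.
item_titles = titles_per_container("_25qBG5")
# Alternative: keep the outer container and select every title inside it.
all_titles = [
    tag.get_text(strip=True)
    for tag in soup.select("div.section-trending-search-list div._1waRmo")
]

print(outer_titles)
print(item_titles)
print(all_titles)
```

The last variant shows that the original outer class can also be kept, provided `select()` is used in place of `select_one()`.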
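Again, I suggest you use selenium. If you run this again, you will see a different set of 5 dictionaries in the list. Each time you make a request, the site serves 5 random trending items. It does have a "change" button, though. With selenium, you could probably just click it and keep scraping all the trending items.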
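Since each request returns a different random set of 5 items, results from repeated runs can also be merged without duplicates. This is a minimal sketch and not part of the answer above: `collect_unique_titles` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real scrape by returning canned batches. There is no guarantee that a fixed number of rounds covers every trending item.

```python
from typing import Callable, Iterable, List


def collect_unique_titles(
    fetch_titles: Callable[[], Iterable[str]], rounds: int = 5
) -> List[str]:
    """Call fetch_titles `rounds` times and merge results in first-seen order."""
    seen = set()
    merged = []
    for _ in range(rounds):
        for title in fetch_titles():
            if title not in seen:
                seen.add(title)
                merged.append(title)
    return merged


# Stand-in for a real scrape (assumption): canned batches, no network access.
_batches = iter([
    ["school backpack", "oppo case"],
    ["oppo case", "baby chair"],
    ["car holder", "school backpack"],
])


def fake_fetch():
    """Return the next canned batch of titles."""
    return next(_batches)


result = collect_unique_titles(fake_fetch, rounds=3)
print(result)
```

In real use, `fetch_titles` would be a function that performs one request and returns the titles it found, for example a variant of `findDiv()` that returns a list of strings.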
Regarding "python - BeautifulSoup loop not iterating over other nodes", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54087406/