I am trying to filter the list of product names by the header tag, but it always returns None.
Source: https://www.tendercuts.in/chicken
Code:
import requests
from bs4 import BeautifulSoup

def ExtractData(url):
    response = requests.get(url=url).content
    soup = BeautifulSoup(response, 'lxml')
    header = soup.find("mat-card-header", {"class": "mat-card-header ng-tns- c9-188"})
    print(header)

ExtractData(url="https://www.tendercuts.in/chicken")
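Best answer
What happens?
You are looking for a tag by a class that does not exist in the soup, because the class is generated dynamically and/or because of a typo (note the space inside `ng-tns- c9-188`).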
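To make the failure mode concrete, here is a minimal sketch on made-up HTML (the markup and the `ng-tns-c12-7` class are assumptions for illustration, not copied from the real page). It uses the built-in `html.parser` so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page; the ng-tns-* class is invented.
HTML = """
<mat-card>
  <mat-card-header class="mat-card-header ng-tns-c12-7">
    <mat-card-title>Chicken Wings</mat-card-title>
  </mat-card-header>
</mat-card>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Class string from the question: no element carries it, so find() returns None.
by_class = soup.find("mat-card-header", {"class": "mat-card-header ng-tns- c9-188"})

# Looking up by tag name alone does not depend on the generated class.
by_tag = soup.find("mat-card-header")

print(by_class)                      # None
print(by_tag.get_text(strip=True))   # Chicken Wings
```

`find()` does not raise when nothing matches; it returns None, which is exactly what the question's `print(header)` shows.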
How to fix?
Select your elements more specifically by tag or id, and avoid classes, because these are often created dynamically:
[t.text for t in soup.find_all('mat-card-title')]
To avoid duplicates, just use set() on the result:
set([t.text for t in soup.find_all('mat-card-title')])
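Note that `set()` does not keep the page order. If the order matters, one possible alternative (not part of the original answer) is `dict.fromkeys`, which drops duplicates while keeping the first occurrence of each item. A small sketch with a hypothetical list of titles:

```python
# Hypothetical list of scraped titles, with duplicates, in page order.
titles = [
    "Chicken Wings",
    "Chicken Liver",
    "Chicken Wings",
    "Chicken Liver",
    "Minced Chicken",
]

# dict keys are unique and keep insertion order (Python 3.7+).
unique_titles = list(dict.fromkeys(titles))

print(unique_titles)
# ['Chicken Wings', 'Chicken Liver', 'Minced Chicken']
```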
Example
import requests
from bs4 import BeautifulSoup

URL = 'https://www.tendercuts.in/chicken'
r = requests.get(URL)
# name the parser explicitly; 'lxml' matches the one used in the question
soup = BeautifulSoup(r.text, 'lxml')
print(set([t.text for t in soup.find_all('mat-card-title')]))
Output
{'Chicken Biryani Cut - Skin On','Chicken Biryani Cut - Skinless','Chicken Boneless (Cubes)','Chicken Breast Boneless','Chicken Curry Cut (Skin Off)','Chicken Curry Cut (Skin On)','Chicken Drumsticks', 'Chicken Liver','Chicken Lollipop','Chicken Thigh & Leg (Boneless)','Chicken Whole Leg','Chicken Wings','Country Chicken','Minced Chicken','Premium Chicken-Strips (Boneless)','Premium Chicken-Supreme (Boneless)','Smoky Country Chicken (Turmeric)'}
Edit
To get the title, price, and so on, I would suggest iterating over the mat-cards like this:
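import re

import requests
from bs4 import BeautifulSoup

URL = 'https://www.tendercuts.in/chicken'
r = requests.get(URL)
soup = BeautifulSoup(r.text, 'lxml')

data = []
# [::2] keeps every other match, skipping the duplicates mentioned above
for item in soup.select('mat-card:has(mat-card-title)')[::2]:
    data.append({
        'title': item.find('mat-card-title').text,
        # look up the price inside this card (item), not the whole page (soup)
        'price': re.search(r'₹\d*', item.find('p', class_='current-price').text).group(),
        # text node that follows the last matching span, or None if it is empty
        'weight': w if (w := item.select_one('.weight span span:last-of-type').next_sibling) else None
    })
print(data)
Output (as posted; every price reads ₹99 because the posted code called soup.find(...) for the price, which always returns the first price on the page)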
[{'title': 'Chicken Curry Cut (Skin Off)', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Chicken Curry Cut (Skin On)', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Country Chicken', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Premium Chicken-Supreme (Boneless)', 'price': '₹99', 'weight': ' 330 - 350 Gms'}, {'title': 'Chicken Boneless (Cubes)', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Drumsticks', 'price': '₹99', 'weight': ' 280 - 360 Gms'}, {'title': 'Chicken Biryani Cut - Skin On', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Thigh & Leg (Boneless)', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Biryani Cut - Skinless', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Minced Chicken', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Smoky Country Chicken (Turmeric)', 'price': '₹99', 'weight': ' 650 - 800 Gms'}, {'title': 'Chicken Lollipop', 'price': '₹99', 'weight': ' 280 - 300 Gms'}, {'title': 'Chicken Whole Leg', 'price': '₹99', 'weight': ' 370 - 390 Gms'}, {'title': 'Chicken Breast Boneless', 'price': '₹99', 'weight': ' 240 - 280 Gms'}, {'title': 'Premium Chicken-Strips (Boneless)', 'price': '₹99', 'weight': ' 330 - 350 Gms'}, {'title': 'Chicken Liver', 'price': '₹99', 'weight': ' 190 - 210 Gms'}, {'title': 'Chicken Wings', 'price': '₹99', 'weight': ' 480 - 500 Gms'}]
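The identical ₹99 in every row above comes from searching the whole page instead of the current card. The following sketch, on made-up HTML with invented item names and prices, shows the difference between a page-wide and a per-card lookup, and also guards against cards that have no price. `card_price` is a hypothetical helper, not part of the original answer.

```python
import re

from bs4 import BeautifulSoup

# Made-up markup: three cards, the last one without a price element.
HTML = """
<mat-card>
  <mat-card-title>Item A</mat-card-title>
  <p class="current-price">₹120 </p>
</mat-card>
<mat-card>
  <mat-card-title>Item B</mat-card-title>
  <p class="current-price">₹85 </p>
</mat-card>
<mat-card>
  <mat-card-title>Item C</mat-card-title>
</mat-card>
"""

soup = BeautifulSoup(HTML, "html.parser")


def card_price(card):
    """Return the first rupee amount inside this card, or None if absent."""
    tag = card.find("p", class_="current-price")
    match = re.search(r"₹\d+", tag.get_text()) if tag else None
    return match.group() if match else None


# Page-wide lookup: always the first price in the document.
page_wide = soup.find("p", class_="current-price").get_text(strip=True)

# Per-card lookup: each card reports its own price.
rows = [
    {
        "title": card.find("mat-card-title").get_text(strip=True),
        "price": card_price(card),
    }
    for card in soup.find_all("mat-card")
]

print(page_wide)  # ₹120
print(rows)
```

Returning None for a missing price avoids the AttributeError that `re.search(...).group()` raises when nothing matches.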
Regarding "python - BeautifulSoup returns None even though the element exists", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/71086324/