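I'm trying to loop through this BeerAdvocate page (https://www.beeradvocate.com/beer/styles/35/) to get each beer's name, ABV, rating, and so on. However, I'm not sure how to build a loop that goes through the whole page.
For example, this is what I have for the beer name: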
import requests
from bs4 import BeautifulSoup

url = "https://www.beeradvocate.com/beer/styles/35/"
results = requests.get(url)
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

beer_name = []
beer_div = soup.find_all('div', id='ba-content')
for container in beer_div:
    #beer name
    name = container.find_all('a')[12].text
    beer_name.append(name)
print(beer_name)
Does anyone know what I'm doing wrong here? Thanks!
Best answer
First locate the table, then find all the tr tags inside that table, and then loop over those tr tags to collect the text of each row.
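beer_table = soup.find('table')
# Skip the first three rows, which come before the beer entries
tr_tags = beer_table.find_all('tr')[3:]
for tr in tr_tags:
    # tr.td is the first <td> of the row, which holds the beer name
    beer_name.append(tr.td.text)
# Drop the entry from the last row, which is not a beer
beer_name = beer_name[:-1]
print(beer_name)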
Output: ['Ayinger Celebrator', 'Troegenator', 'Spaten Optimator', 'Salvator', 'Korbinian', 'Samichlaus Classic Bier', 'Samuel Adams Double Bock (Imperial Series)', 'Consecrator', 'Andechser Doppelbock Dunkel', 'Birra Moretti La Rossa', 'Perkulator Coffee Dopplebock', 'EKU 28', 'Liberator Doppelbock', 'Augustiner Bräu Maximator', "Smuttynose S'muttonator (Heritage Series)", 'Butthead Doppelbock', 'Autumnal Fire', 'Weltenburger Kloster Asam-Bock', 'Wasatch The Devastator Double Bock', 'St. Victorious', 'Urbock 23°', 'Voodoovator', 'Saxonator Dunkles Doppelbock', 'Doppel-Hirsch', 'Josephs Brau Winter Brew', 'Duck-Rabbator', 'Troegenator - Bourbon Barrel-Aged', 'Ettaler Curator Dunkler Doppelbock (US Import Version)', 'Blonde Doppelbock', 'Snow Blind Doppelbock', 'Doppelbock Dunkel', 'Tucher Bajuvator Doppelbock', 'Dark Heathen Triple Bock', 'Winter Bock', 'Deep Water Dopplebock', 'Doppelbock Grande Cuvée Printemps', 'Lobotomy Bock', 'Sled Dog Dopplebock', 'Primátor Double Bock Beer', 'Icelandic Doppelbock', 'Dopple Bock', "St. Nikolaus Bock Bier - Brewer's Reserve", 'Double Skull', 'Emancipator Doppelbock', 'Winter-Bock', 'Granitbock', "Henry's Farm Double Bock", 'Double Vision Doppelbock', 'Massacre', "Fireman's Brew Brunette Beer"]
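Here is the complete code:
import requests
from bs4 import BeautifulSoup

url = "https://www.beeradvocate.com/beer/styles/35/"
results = requests.get(url)
soup = BeautifulSoup(results.content, 'html.parser')

beer_name = []
beer_table = soup.find('table')
# Skip the first three rows, which come before the beer entries
tr_tags = beer_table.find_all('tr')[3:]
for tr in tr_tags:
    # tr.td is the first <td> of the row, which holds the beer name
    beer_name.append(tr.td.text)
# Drop the entry from the last row, which is not a beer
beer_name = beer_name[:-1]
print(beer_name)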
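The question also asks for ABV, ratings, and so on, so here is a rough sketch of how the same tr loop might be extended to read every cell in a row rather than only the first. It runs on a small inline HTML sample instead of the live page. The column order in COLUMNS, the helper name parse_beer_rows, and all brewery names and numbers in the sample are assumptions made for illustration, not data taken from BeerAdvocate; check the real table before relying on them.

```python
from bs4 import BeautifulSoup

# Assumed column order; verify it against the real table before use.
COLUMNS = ["name", "brewery", "abv", "ratings", "avg"]

# Hypothetical sample markup with placeholder values (not real data).
# It mimics the layout implied by the answer: three leading non-beer
# rows, the beer rows, and one trailing non-beer row.
SAMPLE_HTML = """
<table>
  <tr><td colspan="5">Style title</td></tr>
  <tr><td colspan="5">Style description</td></tr>
  <tr><td>Name</td><td>Brewery</td><td>ABV</td><td>Ratings</td><td>Avg</td></tr>
  <tr><td><a href="#">Ayinger Celebrator</a></td><td>Brewery A</td><td>6.7</td><td>100</td><td>4.0</td></tr>
  <tr><td><a href="#">Troegenator</a></td><td>Brewery B</td><td>8.2</td><td>50</td><td>3.9</td></tr>
  <tr><td colspan="5">pagination links</td></tr>
</table>
"""


def parse_beer_rows(html):
    """Return one dict per beer row, keyed by the assumed COLUMNS."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    beers = []
    # Same slicing idea as the answer: skip three leading rows and the last row.
    for tr in table.find_all("tr")[3:-1]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Ignore any row that does not have exactly one cell per column.
        if len(cells) != len(COLUMNS):
            continue
        beers.append(dict(zip(COLUMNS, cells)))
    return beers


for beer in parse_beer_rows(SAMPLE_HTML):
    print(beer)
```

To try this against the live page, pass requests.get(url).content to parse_beer_rows instead of SAMPLE_HTML. The values come back as strings, so convert them with float() or int() if you need numbers.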
Hope this helps!
Regarding "python - BeautifulSoup - How to loop through "tr" tags?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64291746/