I get the names I want, but this code does not give me the corresponding Metascores:
from requests import get
from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
# Define the URL
url = "http://www.metacritic.com/browse/games/score/metascore/year/pc/filtered?sort=desc&year_selected=2018"
# not sure about this, but it works (I was getting blocked by something and this is the way I found around it)
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
web_byte = urlopen(req).read()
webpage = web_byte.decode('utf-8')
# this grabs all the text from the page
html_soup = BeautifulSoup(webpage, 'html5lib')
# this selects all the games from 1 to 100 (the list of them)
game_containers = html_soup.find_all("div", class_="product_item product_title")
# print(game_containers)
game_names = html_soup.find_all("div", class_="product_item product_title")
game_metascores_p = html_soup.find_all("div", class_="metascore_w small game positive")
game_metascores_m = html_soup.find_all("div", class_="metascore_w small game mixed")
game_user_s = html_soup.find_all("span", class_="data textscore textscore_favorable")
#lists to store the data
names = []
metascores = []
userscores = []
#Extract data from each game
for games in game_names:
    name = games.find()
    names.append(name.text.strip())
    metascore = games.find_next_sibling()
    metascores.append(metascore.text.strip())
When I print the game names:
print(names)
I get a list of 100 names, as plain strings (which is what I want).
When I run this:
print(metascores)
I get:
['User:\n 7.6', 'User:\n 7.8', 'User:\n 7.0', 'User:\n 8.2', 'User:\n 7.3', 'User:\n 5.9', 'User:\n 7.2', 'User:\n 7.8', 'User:\n 8.1', 'User:\n 7.0', 'User:\n 8.5', 'User:\n 6.6', 'User:\n 7.2', 'User:\n 7.2', 'User:\n 7.3', 'User:\n 7.2', 'User:\n 7.5', 'User:\n 6.5', 'User:\n 7.5', 'User:\n 7.9', 'User:\n 7.8', 'User:\n 7.2', 'User:\n 7.6', 'User:\n tbd', 'User:\n 7.9', 'User:\n 7.1', 'User:\n 6.1', 'User:\n 6.0', 'User:\n tbd', 'User:\n 7.1', 'User:\n 6.6', 'User:\n 8.0', 'User:\n 7.7', 'User:\n tbd', 'User:\n 7.5', 'User:\n tbd', 'User:\n 8.1', 'User:\n 7.8', 'User:\n 7.7', 'User:\n tbd', 'User:\n 7.9', 'User:\n tbd', 'User:\n 5.4', 'User:\n 8.0', 'User:\n tbd', 'User:\n 7.7', 'User:\n 8.0', 'User:\n 6.3', 'User:\n 8.0', 'User:\n 6.2', 'User:\n 8.3', 'User:\n 8.2', 'User:\n 8.3', 'User:\n 8.1', 'User:\n 5.1', 'User:\n 6.5', 'User:\n 7.5', 'User:\n 7.3', 'User:\n 6.7', 'User:\n 7.9', 'User:\n tbd', 'User:\n tbd', 'User:\n 7.2', 'User:\n tbd', 'User:\n tbd', 'User:\n 6.9', 'User:\n 5.4', 'User:\n 6.9', 'User:\n tbd', 'User:\n 6.6', 'User:\n 7.9', 'User:\n 4.0', 'User:\n 6.8', 'User:\n tbd', 'User:\n 6.1', 'User:\n 4.5', 'User:\n 6.2', 'User:\n 8.3', 'User:\n 4.5', 'User:\n 4.9', 'User:\n 7.7', 'User:\n 4.7', 'User:\n 7.9', 'User:\n tbd', 'User:\n tbd', 'User:\n tbd', 'User:\n 6.9', 'User:\n 6.0', 'User:\n tbd', 'User:\n tbd', 'User:\n tbd', 'User:\n tbd', 'User:\n 4.6', 'User:\n 7.3', 'User:\n tbd', 'User:\n 7.5', 'User:\n 6.8', 'User:\n 6.4', 'User:\n tbd', 'User:\n 4.1']
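These are the user scores (for the next variable, which will hold the user scores, I want only the number or tbd, without the 'User:\n' prefix).
So how do I get the Metascores and the user scores (as plain strings)?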
Best answer
You can use replace():
str.replace("User:\n ", "")
Like this:
metascoresNew = []
for i in metascores:
    temp = str(i)
    temp2 = temp.replace("User:\n ", "")
    metascoresNew.append(temp2)
print(metascoresNew)
The output will be:
['7.6', '7.8', '7.0', '8.2'...]
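The literal replace() call only matches when the text after "User:" is exactly a newline followed by one space. If the scraped strings carry a different amount of whitespace, a regular expression may be more forgiving. The following is a minimal sketch under that assumption; strip_user_prefix is a hypothetical helper name and the sample list is made up for illustration:

```python
import re

def strip_user_prefix(raw):
    # Remove an optional leading "User:" label together with any
    # whitespace (spaces, tabs, newlines) around it, then trim the rest.
    return re.sub(r"^\s*User:\s*", "", raw).strip()

# Made-up sample values in the same shape as the scraped output.
sample_scores = ['User:\n 7.6', 'User:\n        tbd', '8.1']
cleaned = [strip_user_prefix(s) for s in sample_scores]
print(cleaned)
```

Strings that do not start with the label pass through unchanged, so the same helper can be applied to a list that mixes prefixed and bare values.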
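The question also asks for the Metascore itself, which replace() does not cover. One possible approach is to look up each score by its own class inside the row that holds the game title, rather than relying on whichever sibling follows the title. The sketch below runs against a small hand-written HTML sample: the class names are taken from the question, but the wrapping "product_row" div and the overall layout are assumptions, and the real Metacritic page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the class names used in the question;
# the "product_row" wrapper is an assumption made for this sketch.
SAMPLE_HTML = """
<div class="product_row">
  <div class="product_item product_title"><a href="#">Game A</a></div>
  <div class="metascore_w small game positive">93</div>
  <span class="data textscore textscore_favorable">7.6</span>
</div>
<div class="product_row">
  <div class="product_item product_title"><a href="#">Game B</a></div>
  <div class="metascore_w small game mixed">70</div>
  <span class="data textscore">tbd</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
# Walk one row at a time so a name and its two scores stay together.
for row in soup.find_all("div", class_="product_row"):
    name = row.find("div", class_="product_title").get_text(strip=True)
    # Matching on the single class "metascore_w" covers both the
    # "positive" and the "mixed" variants.
    metascore = row.find("div", class_="metascore_w").get_text(strip=True)
    userscore = row.find("span", class_="textscore").get_text(strip=True)
    rows.append((name, metascore, userscore))

print(rows)
```

Searching within each row avoids building separate lists for "positive" and "mixed" scores, which would lose the pairing between a game and its score.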
Regarding "python-3.x - BeautifulSoup captured the names but not the page's Metascores", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50891072/