
python-3.x - BeautifulSoup captures the names but not the Metascores from the web page

Reposted. Author: 行者123. Updated: 2023-12-02 01:00:35

I get the names I want, but this code does not give me the corresponding Metascores:

from requests import get
from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

# Define the URL
url = "http://www.metacritic.com/browse/games/score/metascore/year/pc/filtered?sort=desc&year_selected=2018"

# not sure about this, but it works (something was blocking my requests and this is the workaround I found)
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})

web_byte = urlopen(req).read()

webpage = web_byte.decode('utf-8')

# this parses the whole page
html_soup = BeautifulSoup(webpage, 'html5lib')

# this selects all the games in the list (ranked 1 to 100)
game_containers = html_soup.find_all("div", class_="product_item product_title")

# print(game_containers)

game_names = html_soup.find_all("div", class_="product_item product_title")
game_metascores_p = html_soup.find_all("div", class_="metascore_w small game positive")
game_metascores_m = html_soup.find_all("div", class_="metascore_w small game mixed")
game_user_s = html_soup.find_all("span", class_="data textscore textscore_favorable")

# lists to store the data
names = []
metascores = []
userscores = []

# extract data from each game
for games in game_names:

    name = games.find()
    names.append(name.text.strip())

    metascore = games.find_next_sibling()
    metascores.append(metascore.text.strip())

When I print the game names:

print(names)

I get a list of 100 names, as plain strings (which is what I want).

When I run this:

print(metascores)

I get:

['User:\n    7.6', 'User:\n    7.8', 'User:\n    7.0', 'User:\n    8.2', 'User:\n    7.3', 'User:\n    5.9', 'User:\n    7.2', 'User:\n    7.8', 'User:\n    8.1', 'User:\n    7.0', 'User:\n    8.5', 'User:\n    6.6', 'User:\n    7.2', 'User:\n    7.2', 'User:\n    7.3', 'User:\n    7.2', 'User:\n    7.5', 'User:\n    6.5', 'User:\n    7.5', 'User:\n    7.9', 'User:\n    7.8', 'User:\n    7.2', 'User:\n    7.6', 'User:\n    tbd', 'User:\n    7.9', 'User:\n    7.1', 'User:\n    6.1', 'User:\n    6.0', 'User:\n    tbd', 'User:\n    7.1', 'User:\n    6.6', 'User:\n    8.0', 'User:\n    7.7', 'User:\n    tbd', 'User:\n    7.5', 'User:\n    tbd', 'User:\n    8.1', 'User:\n    7.8', 'User:\n    7.7', 'User:\n    tbd', 'User:\n    7.9', 'User:\n    tbd', 'User:\n    5.4', 'User:\n    8.0', 'User:\n    tbd', 'User:\n    7.7', 'User:\n    8.0', 'User:\n    6.3', 'User:\n    8.0', 'User:\n    6.2', 'User:\n    8.3', 'User:\n    8.2', 'User:\n    8.3', 'User:\n    8.1', 'User:\n    5.1', 'User:\n    6.5', 'User:\n    7.5', 'User:\n    7.3', 'User:\n    6.7', 'User:\n    7.9', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    7.2', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    6.9', 'User:\n    5.4', 'User:\n    6.9', 'User:\n    tbd', 'User:\n    6.6', 'User:\n    7.9', 'User:\n    4.0', 'User:\n    6.8', 'User:\n    tbd', 'User:\n    6.1', 'User:\n    4.5', 'User:\n    6.2', 'User:\n    8.3', 'User:\n    4.5', 'User:\n    4.9', 'User:\n    7.7', 'User:\n    4.7', 'User:\n    7.9', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    6.9', 'User:\n    6.0', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    tbd', 'User:\n    4.6', 'User:\n    7.3', 'User:\n    tbd', 'User:\n    7.5', 'User:\n    6.8', 'User:\n    6.4', 'User:\n    tbd', 'User:\n    4.1']

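These are the user scores, not the Metascores. (For the next variable, which will hold the user scores, I want only the number or "tbd", without the 'User:\n' prefix.)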

So how do I get the Metascores and the user scores (as plain strings)?

Best answer

You can use replace():

str.replace("User:\n    ", "")

Like this:

metascoresNew = []
for i in metascores:
    temp = str(i)
    # the prefix is "User:", a newline, and four spaces, as in the printed list
    temp2 = temp.replace("User:\n    ", "")
    metascoresNew.append(temp2)
print(metascoresNew)

The output will be:

['7.6', '7.8', '7.0', '8.2'...]
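
The replace() call above depends on the exact number of spaces after the newline, so a variant that does not hard-code the whitespace may be more forgiving. The following is a minimal sketch, assuming every entry has the form shown in the question ("User:" followed by whitespace and a score or "tbd"). The helper name clean_score is hypothetical, and the three sample strings are taken from the printed list.

```python
def clean_score(raw):
    # Turn a string such as "User:" + newline + spaces + "7.6" into "7.6".
    text = raw.strip()
    # Drop the leading label if it is present; strings without it pass through.
    if text.startswith("User:"):
        text = text[len("User:"):]
    # Trim whatever whitespace separated the label from the score.
    return text.strip()


metascores = ['User:\n    7.6', 'User:\n    tbd', 'User:\n    4.1']
userscores = [clean_score(item) for item in metascores]
print(userscores)  # ['7.6', 'tbd', '4.1']
```

Because the helper strips whitespace rather than matching it literally, it should keep working if the parser returns a different amount of indentation.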

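
Note that replace() only tidies the user scores; it does not retrieve the Metascore itself. The output in the question suggests that the element right after each title holds the user score, which would explain why find_next_sibling() returns 'User: ...' text. One possible approach is to start from each game's row and look up the title, Metascore, and user score inside it. The sketch below runs on a small hand-written HTML fragment. The product_row wrapper, the product_score, product_userscore_txt, and label classes, and the overall nesting are assumptions made for illustration, not confirmed Metacritic markup; only product_title, metascore_w, and textscore come from the question. The selectors would need to be checked against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the "product_row" wrapper and the nesting are assumed
# for illustration and may not match the real page.
SAMPLE_HTML = """
<div class="product_row">
  <div class="product_item product_score">
    <div class="metascore_w small game positive">93</div>
  </div>
  <div class="product_item product_title"><a href="#">Game A</a></div>
  <div class="product_item product_userscore_txt">
    <span class="label">User:</span>
    <span class="data textscore textscore_favorable">7.6</span>
  </div>
</div>
<div class="product_row">
  <div class="product_item product_score">
    <div class="metascore_w small game mixed">68</div>
  </div>
  <div class="product_item product_title"><a href="#">Game B</a></div>
  <div class="product_item product_userscore_txt">
    <span class="label">User:</span>
    <span class="data textscore">tbd</span>
  </div>
</div>
"""

# "html.parser" ships with Python, so no extra parser package is needed here.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

names, metascores, userscores = [], [], []
for row in soup.select("div.product_row"):
    names.append(row.select_one("div.product_title").get_text(strip=True))
    # Matching on the single class "metascore_w" covers positive and mixed scores.
    metascores.append(row.select_one("div.metascore_w").get_text(strip=True))
    # Selecting the score span directly avoids the "User:" label altogether.
    userscores.append(row.select_one("span.textscore").get_text(strip=True))

print(names)       # ['Game A', 'Game B']
print(metascores)  # ['93', '68']
print(userscores)  # ['7.6', 'tbd']
```

Working row by row keeps the three lists aligned, whereas three separate find_all() calls over the whole page can drift out of step when a game lacks one of the scores.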

Regarding "python-3.x - BeautifulSoup captures the names but not the Metascores from the web page", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50891072/
