
python - Web scraping Yelp: how do I retrieve the value of each individual rating?

Reposted. Author: 行者123. Updated: 2023-12-04 16:21:08

This question already has answers here:

Extracting an attribute value with beautifulsoup (10 answers)

Closed last year.




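I'm working on a web scraping project to build up my knowledge (I'm a beginner). The code is messy, but I can now print the rating of each review. How do I extract the ratings (i.e. 4.0, 5.0) from the bs4 objects in the list and then average them?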

Output:
[<meta content="4.0" itemprop="ratingValue"/>, <meta content="5.0" itemprop="ratingValue"/>, ... ]
import mechanize
from bs4 import BeautifulSoup

def searchYelp():

    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

    # Fill in and submit the search form on the Yelp home page.
    response = br.open('https://www.yelp.com')
    br.select_form(nr=0)
    br.form['find_desc'] = 'Del Taco'
    br.form['find_loc'] = 'New York City'
    br.submit()

    # Keep only the first business link from the search results.
    link_list = []
    for link in br.links():
        if link.url.startswith('/biz/'):
            link_list.append(link.url)
            break

    big_list_of_ratings = []
    yelpPage = br.open(link_list[0])
    soup = BeautifulSoup(yelpPage.read(), 'html.parser')

    # Collect every <meta itemprop="ratingValue"> tag on the business page.
    for review in soup.find_all('meta'):
        if review.get('itemprop') == 'ratingValue':
            big_list_of_ratings.append(review)

    print(big_list_of_ratings)


searchYelp()

Best answer

Instead of this

for review in soup.find_all('meta'):
    if review.get('itemprop') == 'ratingValue':
        big_list_of_ratings.append(review)

read the tag's content attribute with review['content'], like this:

for review in soup.find_all('meta'):
    if review.get('itemprop') == 'ratingValue':
        big_list_of_ratings.append(review['content'])

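Alternatively, I'd suggest using a CSS selector, which matches only meta tags that have itemprop="ratingValue" and a content attribute:

for review in soup.select('meta[itemprop="ratingValue"][content]'):
    big_list_of_ratings.append(review['content'])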
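
The question also asks how to average the ratings, which the snippets above do not cover. Because review['content'] returns a string such as '4.0', the values need converting to float first. Here is a minimal sketch of that step, assuming big_list_of_ratings already holds the extracted strings. The helper name average_rating is hypothetical and the sample values come from the output shown in the question.

```python
from statistics import mean


def average_rating(rating_strings):
    """Convert rating strings such as '4.0' to floats and return their mean.

    Returns None for an empty list, since there is nothing to average.
    (average_rating is a hypothetical helper, not part of bs4 or the answer.)
    """
    ratings = [float(value) for value in rating_strings]
    return mean(ratings) if ratings else None


# Sample values taken from the output shown in the question.
big_list_of_ratings = ['4.0', '5.0']
print(average_rating(big_list_of_ratings))  # 4.5
```

Returning None for an empty list is one possible choice. It avoids the StatisticsError that statistics.mean raises when no ratings were found on the page.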

For more on python - web scraping Yelp and retrieving the value of each individual rating, see the similar question on Stack Overflow: https://stackoverflow.com/questions/59778182/
