Python LXML: getting data from a Steam bundle page - list index out of range error

Reposted. Author: 行者123. Updated: 2023-12-04 15:05:32

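I am writing a Python program that takes the ID of a Steam bundle and returns its current price.

The program uses requests and lxml.

There are two XPath expressions that lead to the final price: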

  1. /html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div
  2. //*[@id="game_area_purchase"]/div/div/div/div[1]/div/div/div[2]

Example bundle: https://store.steampowered.com/bundle/16140

The code is as follows:

import requests
import lxml.html

#example URL for steam bundle
URL = "https://store.steampowered.com/bundle/16140"

html = requests.get(URL)
doc = lxml.html.fromstring(html.content)

#xpath to price location
price = doc.xpath('/html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div/text()')

print(price)

The program returns:

[]

Or, when the result is indexed with [0], this:

Traceback (most recent call last):
File <path-to-program>, line 9, in <module>
price = doc.xpath('/html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div/text()')[0]
IndexError: list index out of range

Both variants fail. What should I do to fix this?

Best answer

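The empty list suggests that the HTML the server returned does not contain the price element at all, so the XPath matches nothing and indexing the result with [0] raises IndexError. To get the page HTML you want, send the request with a birthtime cookie, which "tells" the server that you are old enough to view pages with sexual content or nudity: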

import requests
import lxml.html

URL = "https://store.steampowered.com/bundle/16140"
session = requests.Session()
r1 = session.get(URL)
r1.cookies['birthtime'] = '439423201'  # birth date in seconds since the "epoch" (January 1, 1970)
r2 = session.get(URL, cookies=r1.cookies)

doc = lxml.html.fromstring(r2.content)
print(doc.xpath('//div[contains(@class, "discount_final_price")]/text()')[0])
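
The same idea can be sketched a little more defensively. This is only a sketch built on the answer's approach, not a verified recipe: the helper names `birthtime_cookie`, `extract_price`, and `fetch_bundle_price` are hypothetical, the 10-second timeout is arbitrary, and it assumes that the birthtime cookie alone is enough to get past the age check and that any sufficiently old birth date will do. The cookie is set on the session before the request, and the XPath result is checked before it is indexed, so a page without a price yields None instead of an IndexError.

```python
from datetime import datetime, timezone

import lxml.html
import requests

# XPath from the answer above: it matches by class name rather than by position.
PRICE_XPATH = '//div[contains(@class, "discount_final_price")]/text()'


def birthtime_cookie(year, month, day):
    """Return a birth date as a string of seconds since the Unix epoch (UTC)."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return str(int(moment.timestamp()))


def extract_price(html):
    """Return the first matching price text, or None if nothing matched."""
    doc = lxml.html.fromstring(html)
    matches = doc.xpath(PRICE_XPATH)
    # xpath() returns an empty list when nothing matches, so check before indexing.
    return matches[0].strip() if matches else None


def fetch_bundle_price(url):
    """Fetch a bundle page with the birthtime cookie set and return its price."""
    session = requests.Session()
    # Assumption: the server only needs a birth date far enough in the past.
    session.cookies.set("birthtime", birthtime_cookie(1984, 1, 1))
    response = session.get(url, timeout=10)
    response.raise_for_status()
    return extract_price(response.content)
```

With these helpers, `fetch_bundle_price("https://store.steampowered.com/bundle/16140")` should return the price text, or None if the page still lacks a price element. For reference, the value 439423201 used in the answer decodes to 1983-12-04 22:00:01 UTC.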

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/66228831/
