
javascript - How do I render the JS that generates a fingerprint for a cookie?

Reposted. Author: 行者123. Updated: 2023-11-29 22:56:53

This website uses JS to set a cookie.

How can I run the JS and emulate a browser so that I avoid the 429 error?

from requests_html import HTMLSession

with HTMLSession() as s:
    url = 'https://www.realestate.com.au/auction-results/nsw'
    r = s.get(url)
    print(r.status_code)
    print(r.text)

    r.html.render()
    print(r.text)

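Best answer

It seems almost impossible to get past the fingerprinting without some form of browser emulation (even with Selenium I still had to set some options). Here is my approach: use Selenium to obtain the one piece of information the requests actually need (a cookie named "FGJK"), send that cookie in the headers of subsequent requests, and fetch all the suburb result pages asynchronously.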

from requests_html import AsyncHTMLSession
import asyncio
from selenium import webdriver
import nest_asyncio

# I'm using IPython, which doesn't like async unless the following is applied:
nest_asyncio.apply()

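async def get_token():
    # Drop Chrome's 'enable-automation' switch; this is one of the options
    # that had to be set for the fingerprint check.
    options = webdriver.ChromeOptions()
    options.add_experimental_option('excludeSwitches', ['enable-automation'])
    driver = webdriver.Chrome(options=options)
    try:
        driver.get('https://www.realestate.com.au/auction-results/nsw')
        # The cookie is set by JS after the page loads, so poll until it appears.
        while True:
            for cookie in driver.get_cookies():
                if cookie['name'] == 'FGJK':
                    return cookie['value']
            await asyncio.sleep(0.5)
    finally:
        # Always close the browser, even if polling is interrupted.
        driver.quit()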


async def get_results(s, endpoint, headers):
    r = await s.get(f'https://www.realestate.com.au/auction-results/{endpoint}', headers=headers)
    # do something with r.html
    print(r, endpoint)


async def main():
    token = await get_token()
    s = AsyncHTMLSession()
    headers = {'Cookie': f'FGJK={token}',
               'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}

    # The suburb list comes from the JSON API; each suburb has its own results page.
    r = await s.get('https://sales-events-api.realestate.com.au/sales-events/nsw')
    suburbs = r.json()['data']['suburbResults']
    endpoints = [burb['suburb']['urlValue'] for burb in suburbs]
    # gather() must be awaited, otherwise main() returns before the requests finish.
    await asyncio.gather(*(get_results(s, endpoint, headers) for endpoint in endpoints))


asyncio.run(main())
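
The loop in `get_token` above keeps polling `driver.get_cookies()` with no upper bound, so it never returns if the cookie is never set. Below is a minimal sketch of the same polling with a timeout, kept independent of Selenium so it can be tried without a browser. `find_cookie`, `wait_for_cookie`, and the `timeout`/`interval` defaults are hypothetical names and values, not part of the original answer.

```python
import time
from typing import Callable, Dict, List, Optional


def find_cookie(cookies: List[Dict], name: str) -> Optional[str]:
    """Return the value of the first cookie called `name`, or None."""
    for cookie in cookies:
        if cookie.get('name') == name:
            return cookie.get('value')
    return None


def wait_for_cookie(get_cookies: Callable[[], List[Dict]], name: str,
                    timeout: float = 30.0, interval: float = 0.5) -> str:
    """Poll `get_cookies` until `name` appears or `timeout` seconds pass.

    `get_cookies` is any zero-argument callable returning cookie dicts
    with 'name' and 'value' keys (hypothetical helper for illustration).
    """
    deadline = time.monotonic() + timeout
    while True:
        value = find_cookie(get_cookies(), name)
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f'cookie {name!r} not set within {timeout}s')
        time.sleep(interval)
```

With Selenium this would presumably be called as `wait_for_cookie(driver.get_cookies, 'FGJK')`, since `driver.get_cookies()` returns a list of dicts in that shape. Note that `time.sleep` blocks, so inside a coroutine it would need to be swapped for `await asyncio.sleep(interval)`.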

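`asyncio.gather` in `main` starts a request for every suburb at once. If that burst itself provokes 429 responses (an assumption; the original answer does not report this), the number of requests in flight can be capped with a semaphore. This is a sketch only: `gather_limited` and the default `limit` of 5 are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar('T')


async def gather_limited(items: Iterable[str],
                         fetch: Callable[[str], Awaitable[T]],
                         limit: int = 5) -> List[T]:
    """Run fetch(item) for every item with at most `limit` running at once.

    Results come back in the same order as `items`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def bounded(item: str) -> T:
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await fetch(item)

    return await asyncio.gather(*(bounded(item) for item in items))
```

In `main` this could replace the plain `gather` call, for example `await gather_limited(endpoints, lambda e: get_results(s, e, headers))`.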
For "javascript - How do I render the JS that generates a fingerprint for a cookie?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56493960/
