
javascript - Python selenium web scraping - freezing the page

Reposted. Author: 行者123. Last updated: 2023-11-30 19:50:11

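I am scraping some pages with selenium, and I am not using other frameworks (such as scrapy) because the pages involve a lot of AJAX. My problem is that the content refreshes automatically almost every second (financial data, for example), but I want to scrape all the elements in a static state. I have searched the internet a lot, especially Stack Overflow. What is the simplest way to freeze a website with Selenium? I even tried switching off the wireless adapter, but that is a problem... This is the only command I found in the selenium documentation: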

driver.set_network_conditions(offline=True, latency=5, throughput=500 * 1024)

I tested this code, and it has no effect when I run the script. The site is still "auto-refreshing"...

Best answer

"for example this one: https://gatehub.net/markets/XRP/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq (there is no API for this site)"


In fact, an API does exist, but it is not fully public.

To get the chart values as a JSON object, you need to build a custom URL, for example:

https://api.gatehub.net/rippledata/v2/exchanges/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq/XRP?descending=true&end=2019-02-06T21:20:00.000Z&limit=400&reduce=false&result=tesSUCCESS&start=2009-02-06T21:20:00.000Z

Output:

{"result":"success","count":400,"marker":"USD|rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq|XRP||20190206014150|000044926668|00006|00003","exchanges":[{"base_amount":"0.12180204","counter_amount":"0.42056","node_index":6,"rate":"3.4528157","tx_index":18,"autobridged_currency":"ETH","autobridged_issuer":"rcA8X3TVMST1n3CJeAdGk1RdRCHii7N2h","buyer":"rGmGFAEx1hYEJuSAfrjEBdA48AXWJBMp1D","executed_time":"2019-02-06T21:14:00Z","ledger_index":44945715,"offer_sequence":39832,"provider":"rGmGFAEx1hYEJuSAfrjEBdA48AXWJBMp1D","seller":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","taker":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","tx_hash":"4E39DB1CB68B4635E773082042B47168094852ED4A11C93AED7F85A67F1F7EDD","tx_type":"OfferCreate","base_currency":"USD","base_issuer":"rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq","counter_currency":"XRP"},{"base_amount":"322.8872040048709","counter_amount":"1109.37944","node_index":2,"rate":"3.4358111","tx_index":18,"autobridged_currency":"ETH","autobridged_issuer":"rcA8X3TVMST1n3CJeAdGk1RdRCHii7N2h","buyer":"rETx8GBiH6fxhTcfHM9fGeyShqxozyD3xe","executed_time":"2019-02-06T21:14:00Z","ledger_index":44945715,"offer_sequence":26918939,"provider":"rETx8GBiH6fxhTcfHM9fGeyShqxozyD3xe","seller":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","taker":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","tx_hash":"4E39DB1CB68B4635E773082042B47168094852ED4A11C93AED7F85A67F1F7EDD","tx_type":"OfferCreate","base_currency":"USD","base_issuer":"rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq","counter_currency":"XRP"}

...
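Once the response is in hand, the records of interest can be pulled out of the `exchanges` array. The following is a minimal sketch, assuming the response keeps the shape shown above; the function name `extract_trades` and the choice of fields are illustrative, and the sample string is a trimmed copy of the first record, closed so that it parses.

```python
import json
from decimal import Decimal
from typing import List, Tuple


def extract_trades(payload: str) -> List[Tuple[str, Decimal, Decimal]]:
    """Return (executed_time, rate, base_amount) for each exchange record."""
    data = json.loads(payload)
    # The response shown above reports "result": "success"; anything else is
    # treated as an error here (an assumption about how failures look).
    if data.get("result") != "success":
        raise ValueError(f"unexpected result: {data.get('result')!r}")
    # Amounts arrive as strings, so Decimal keeps them exact.
    return [
        (ex["executed_time"], Decimal(ex["rate"]), Decimal(ex["base_amount"]))
        for ex in data.get("exchanges", [])
    ]


# Trimmed copy of the first record from the output above.
sample = (
    '{"result":"success","count":1,"exchanges":['
    '{"base_amount":"0.12180204","counter_amount":"0.42056",'
    '"rate":"3.4528157","executed_time":"2019-02-06T21:14:00Z",'
    '"base_currency":"USD","counter_currency":"XRP"}]}'
)
trades = extract_trades(sample)
```

Because each response is a fixed snapshot, the values do not change while they are being processed, which sidesteps the auto-refresh problem from the question.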

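Notes:

  • You can change the limit parameter to return a different number of records as needed (tested up to 400)
  • The dates should also be updated automatically to get the latest values.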
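The two notes above can be sketched as a small URL builder that takes `limit`, `start`, and `end` as arguments. This is only an illustration under stated assumptions: the helper names `format_ts` and `build_exchanges_url` are made up here, the query parameters are copied from the example URL, and since the API is not fully public the endpoint may behave differently or no longer be available.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Endpoint and currency pair taken from the example URL above.
BASE_URL = "https://api.gatehub.net/rippledata/v2/exchanges"
BASE = "USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq"
COUNTER = "XRP"


def format_ts(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2019-02-06T21:20:00.000Z."""
    # Milliseconds are fixed at .000, matching the example URL.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


def build_exchanges_url(start: datetime, end: datetime, limit: int = 400) -> str:
    """Build the exchanges URL for the given time window and record limit."""
    params = {
        "descending": "true",
        "end": format_ts(end),
        "limit": limit,
        "reduce": "false",
        "result": "tesSUCCESS",
        "start": format_ts(start),
    }
    # safe=":" keeps the colons in the timestamps unescaped, as in the example.
    return f"{BASE_URL}/{BASE}/{COUNTER}?{urlencode(params, safe=':')}"


# Refresh the window on every call so the latest values are requested.
now = datetime.now(timezone.utc)
latest_url = build_exchanges_url(start=now - timedelta(days=3650), end=now, limit=100)
```

The resulting URL can then be fetched with any HTTP client; the ten-year window simply mirrors the span between `start` and `end` in the example URL.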

Regarding javascript - Python selenium web scraping - freezing the page, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54561652/
