
python-3.x - How do I build an Etherscan web scraper?

Reposted. Author: 行者123. Last updated: 2023-12-05 02:07:15


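I am building a web scraper that refreshes a set of Etherscan URLs every 30 seconds. If any new transfer occurs that has not already been accounted for, it sends me an email notification with a link to the relevant address on Etherscan, so that I can check it manually.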
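The polling-and-notify flow described above could be sketched roughly as follows. This is only a minimal outline under stated assumptions: `fetch_hashes` and `notify` are hypothetical callables (one returns the transfer hashes currently visible for a URL, the other delivers an email message), and none of these names come from the original code.

```python
from email.message import EmailMessage

POLL_SECONDS = 30  # refresh interval mentioned in the question


def find_new_transfers(seen, current):
    """Return the hashes in `current` that are not in `seen`, keeping their order."""
    return [tx for tx in current if tx not in seen]


def build_alert(address_url, new_hashes, sender, recipient):
    """Build (but do not send) an email describing the newly seen transfers."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(new_hashes)} new transfer(s) detected"
    msg["From"] = sender
    msg["To"] = recipient
    body = "New transfers:\n" + "\n".join(new_hashes) + f"\n\nCheck: {address_url}"
    msg.set_content(body)
    return msg


def poll_once(address_url, seen, fetch_hashes, notify, sender, recipient):
    """Run one polling step for one URL.

    `fetch_hashes(url)` and `notify(msg)` are hypothetical callables supplied
    by the caller; `seen` is a set that is updated in place.
    """
    new_hashes = find_new_transfers(seen, fetch_hashes(address_url))
    if new_hashes:
        notify(build_alert(address_url, new_hashes, sender, recipient))
        seen.update(new_hashes)
    return new_hashes
```

In a real script, `poll_once` would be called for each watched URL inside a `while True:` loop followed by `time.sleep(POLL_SECONDS)`, and `notify` could wrap `smtplib.SMTP.send_message`. Keeping the fetching and sending steps as parameters makes the comparison logic easy to test without network access.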

One of the addresses I want to keep a close eye on is here:

https://etherscan.io/token/0xd6a55c63865affd67e2fb9f284f87b7a9e5ff3bd?a=0xd071f6e384cf271282fc37eb40456332307bb8af

What I have done so far:

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup as soup  # needed for the soup(...) call below

url = 'https://etherscan.io/token/0xd6a55c63865affd67e2fb9f284f87b7a9e5ff3bd?a=0x94f52b6520804eced0accad7ccb93c73523af089'
# I took this line from another post, because "uClient = uReq(URL)" and "page_html = uClient.read()" would not work (I believe Etherscan is trying to block web scraping?)
req = Request(url, headers={'User-Agent': 'XYZ/3.0'})
# Open the URL once; the with-block closes the connection after reading.
with urlopen(req, timeout=20) as resp:
    response = resp.read()
page_soup = soup(response, "html.parser")
# Look for the div that should contain the transfers table.
Transfers_info_table_1 = page_soup.find("div", {"class": "table-responsive"})
print(Transfers_info_table_1)

Interestingly, when I run it, I get the following output:

<div class="table-responsive" style="visibility:hidden;">
<iframe frameborder="0" id="tokentxnsiframe" scrolling="no" src="" style="width: 100px; height: 600px; min-width: 100%;"></iframe>
</div>

I expected to get the entire transfers table as output. What am I doing wrong here?
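One plausible reading of the output above is that the div holds only an `<iframe>` whose `src` is empty, which would mean the transfers table is loaded separately (probably by JavaScript after the page loads) and is never part of the HTML that `urlopen` returns. That explanation is an assumption, not something the output proves. A minimal sketch for checking it is shown below; it uses the standard-library `html.parser` rather than BeautifulSoup so that it has no dependencies, and `IframeFinder` and `find_iframes` are illustrative names.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collect the id and src attributes of every <iframe> start tag."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            attr_map = dict(attrs)
            self.iframes.append((attr_map.get("id"), attr_map.get("src")))


def find_iframes(html):
    """Return a list of (id, src) tuples, one per <iframe> found in `html`."""
    finder = IframeFinder()
    finder.feed(html)
    return finder.iframes


# The fragment printed by the script in the question.
fragment = '''<div class="table-responsive" style="visibility:hidden;">
<iframe frameborder="0" id="tokentxnsiframe" scrolling="no" src="" style="width: 100px; height: 600px; min-width: 100%;"></iframe>
</div>'''

for iframe_id, src in find_iframes(fragment):
    # An empty or missing src means the frame content is not in this HTML.
    status = src if src else "empty src (content is loaded elsewhere)"
    print(iframe_id, "->", status)
```

If the iframe does turn out to be populated by script, the table would have to be fetched from whatever URL the browser assigns to that frame, or the page rendered with a browser-driving tool; which URL that is cannot be determined from the output shown here.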
