
python - Can I fetch only a web page's header information, without the body? (Mechanize)

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:34:57

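What if I only want to download pages that have changed since I last downloaded them? What is the best way to do this? Could I first fetch the page's size, compare it with the previous value to decide whether the page has changed, and then download it if it has or skip it otherwise?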

I plan to use (Python) Mechanize.

Best answer

The request should be HEAD, not GET:

9.4 HEAD

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.
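The staleness rule quoted above can be sketched as a small comparison over two sets of headers. This is only an illustration: `is_stale` and `VALIDATORS` are hypothetical names, and treating "no validator in common" as stale is an assumption made here so that the safe default is to download again. Note that comparing the size alone, as the question proposes, can miss an edit that leaves the length unchanged, so the sketch checks ETag and Last-Modified as well.

```python
# Header fields that the quoted RFC text lists as indicators of change.
VALIDATORS = ("ETag", "Last-Modified", "Content-MD5", "Content-Length")


def is_stale(cached, fresh):
    """Return True if the cached copy should be downloaded again.

    `cached` and `fresh` are plain dicts of header name -> value, e.g. the
    headers saved from the previous download and those from a new HEAD
    response. Header names are compared case-insensitively.
    """
    cached = {name.lower(): value for name, value in cached.items()}
    fresh = {name.lower(): value for name, value in fresh.items()}

    compared = False
    for name in VALIDATORS:
        key = name.lower()
        if key in cached and key in fresh:
            compared = True
            if cached[key] != fresh[key]:
                return True  # a validator changed, so the page changed

    # Assumption: with no validator in common there is nothing to compare,
    # so report the entry as stale rather than risk keeping an old copy.
    return not compared
```

Calling `is_stale(saved_headers, new_headers)` before each download would then decide whether to issue the full GET or skip the page.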

See here: How can I perform a HEAD request with the mechanize library
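For readers who just want to see the shape of a HEAD request, here is a minimal sketch. It uses the standard library's `urllib.request` rather than Mechanize, so it is not the approach from the linked question; `build_head_request` and `fetch_headers` are hypothetical helper names, and the URL in the usage comment is a placeholder.

```python
from urllib.request import Request, urlopen


def build_head_request(url):
    """Build a request whose HTTP method is HEAD instead of the default GET."""
    return Request(url, method="HEAD")


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict.

    The server sends no message body for HEAD, so only the headers
    are transferred.
    """
    with urlopen(build_head_request(url), timeout=timeout) as response:
        return dict(response.headers.items())


# Example usage (placeholder URL, requires network access):
#   headers = fetch_headers("https://example.com/")
#   print(headers.get("Last-Modified"), headers.get("ETag"))
```

Some servers omit Content-Length, ETag or Last-Modified, or answer HEAD differently from GET, so it is worth checking which of these fields the target site actually returns before relying on them.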

Regarding "python - Can I fetch only a web page's header information, without the body? (Mechanize)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2730997/
