Python requests and urllib2 get different headers when connecting to the same host

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:47:14

We have a server that serves .txt files, essentially log files that grow over time. When I send a GET to the server with urllib2, r = urllib2.urlopen('http://example.com'), the response headers are:

Date: XXX
Server: Apache
Last-Modified: XXX
Accept-Ranges: bytes
Content-Length: 12345678
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain

With r = requests.get('http://example.com'), they are:

Content-Encoding: gzip
Accept-Ranges: bytes
Vary: Accept-Encoding
Keep-alive: timeout=5, max=128
Last-Modified: XXX
Connection: Keep-Alive
ETag: xxxxxxxxx
Content-Type: text/plain

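The second response matches what I see in Chrome's developer tools. So why do the two differ? I need the Content-Length header to work out how many bytes to download each time, because the file can grow very large.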

Edit: testing against httpbin.org/get:

urllib2 response:

{u'args': {},
u'headers': {u'Accept-Encoding': u'identity',
u'Host': u'httpbin.org',
u'User-Agent': u'Python-urllib/2.7'},
u'origin': u'ip',
u'url': u'http://httpbin.org/get'}

Response headers:

Server: nginx
Date: Sat, 14 Jan 2017 07:41:16 GMT
Content-Type: application/json
Content-Length: 207
Connection: close
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

requests response:

{u'args': {},
u'headers': {u'Accept': u'*/*',
u'Accept-Encoding': u'gzip, deflate',
u'Host': u'httpbin.org',
u'User-Agent': u'python-requests/2.11.1'},
u'origin': u'ip',
u'url': u'http://httpbin.org/get'}

Response headers:

Server: nginx
Date: Sat, 14 Jan 2017 07:42:39 GMT
Content-Type: application/json
Content-Length: 239
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

Best answer

Quoting Lukasa on GitHub:

The response is different because requests indicates that it supports gzip-encoded bodies, by sending an Accept-Encoding: gzip, deflate header field. urllib2 does not. You'll find if you added that header to your urllib2 request that you get the new behaviour.

Clearly, in this case, the server is dynamically gzipping the responses. This means it doesn't know how long the response will be, so it is sending using chunked transfer encoding.

If you really must get the Content-Length header, then you should add the following headers to your Requests request: {'Accept-Encoding': 'identity'}.
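
The last suggestion in the quoted answer might look roughly like the sketch below. The helper name remote_content_length, the constant IDENTITY_HEADERS, and the timeout value are illustrative assumptions and not part of the original question or answer; stream=True is used here only so that the body of a large log file is not downloaded just to read its headers.

```python
import requests

# Ask for the uncompressed representation, as the quoted answer suggests,
# so the server does not have to gzip the body on the fly.
IDENTITY_HEADERS = {'Accept-Encoding': 'identity'}


def remote_content_length(url, timeout=10):
    """Return the Content-Length reported for url as an int, or None.

    Hypothetical helper for illustration only.
    """
    # stream=True defers downloading the body until it is accessed,
    # so only the status line and headers are read here.
    resp = requests.get(url, headers=IDENTITY_HEADERS, stream=True,
                        timeout=timeout)
    try:
        resp.raise_for_status()
        length = resp.headers.get('Content-Length')
        # The server may still omit the header (for example if it chooses
        # chunked transfer encoding anyway), hence the None fallback.
        return int(length) if length is not None else None
    finally:
        # Release the connection without reading the body.
        resp.close()
```

Calling remote_content_length('http://example.com') would then return the reported size in bytes, provided the server includes Content-Length once compression is declined.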

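The first point in the quoted answer can also be checked from the urllib side. The sketch below assumes Python 3, where urllib2 became urllib.request; the helper names build_request and response_headers are hypothetical and used only for illustration.

```python
import urllib.request


def build_request(url, accept_encoding=None):
    """Build a GET request, optionally advertising accepted encodings.

    Without an explicit Accept-Encoding header, urllib sends
    'Accept-Encoding: identity', as the httpbin output above shows.
    """
    headers = {}
    if accept_encoding is not None:
        headers['Accept-Encoding'] = accept_encoding
    return urllib.request.Request(url, headers=headers)


def response_headers(request, timeout=10):
    """Send the request and return the response headers as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # Repeated header names collapse to the last value in a plain dict,
        # which is enough for comparing Content-Length and Content-Encoding.
        return dict(resp.headers.items())
```

Comparing response_headers(build_request(url)) with response_headers(build_request(url, 'gzip, deflate')) against the server in question should, if the quoted explanation holds, show Content-Length in the first result and Content-Encoding: gzip in the second.
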
Regarding Python requests and urllib2 getting different headers when connecting to the same host, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41647673/
