
python - Should I use .text or .content when parsing a Requests response?

Reposted. Author: 太空狗. Updated: 2023-10-29 21:36:23

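I occasionally use res.content or res.text to parse a response from Requests. In the use cases I have had so far, it did not seem to matter which one I used.

What is the main difference between parsing HTML with .content and with .text? For example: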

import requests
from lxml import html

res = requests.get(...)
# res.content is the raw response body as bytes
node = html.fromstring(res.content)

In the case above, should I use res.content or res.text? What is a good rule of thumb for when to use each?

Best answer

From the documentation:

When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, using the r.encoding property:

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

If you change the encoding, Requests will use the new value of r.encoding whenever you call r.text. You might want to do this in any situation where you can apply special logic to work out what the encoding of the content will be. For example, HTML and XML have the ability to specify their encoding in their body. In situations like this, you should use r.content to find the encoding, and then set r.encoding. This will let you use r.text with the correct encoding.
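The "use r.content to find the encoding, then set r.encoding" step from the quote can be sketched as follows. This is only a minimal illustration using the standard library: sniff_meta_charset is a hypothetical helper (not part of Requests), the regular expression is a rough approximation of how a charset is declared in a meta tag, and the sample bytes stand in for what r.content would hold.

```python
import re

# Hypothetical helper, not a Requests API: a rough pattern for
# <meta charset="..."> and <meta ... content="...; charset=...">.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset=["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_meta_charset(raw, default=None):
    """Return the charset declared in the first 2048 bytes, or default."""
    match = _META_CHARSET.search(raw[:2048])
    return match.group(1).decode("ascii") if match else default


# Stand-in for r.content: the undecoded bytes of an HTML page.
raw = '<html><head><meta charset="utf-8"></head><body>café</body></html>'.encode("utf-8")

declared = sniff_meta_charset(raw)
# Decode with the declared charset, falling back to ISO-8859-1 if none is found.
text = raw.decode(declared or "iso-8859-1")
```

With a real response r, the same idea would be r.encoding = sniff_meta_charset(r.content, r.encoding), after which r.text is decoded with the declared charset.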

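So r.content is what you use when the server returns binary data or a bogus encoding header: it gives you the raw bytes, from which you can try to find the correct encoding in a meta tag.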
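To see why a wrong encoding header matters, here is a small standard-library sketch. The body bytes are an assumed example of what r.content would contain; decoding them by hand mimics what r.text would do with a given r.encoding.

```python
# Assumed example: the server actually sent UTF-8 bytes (this is what
# r.content would hold).
body = "café".encode("utf-8")

# If the header (or a fallback) says ISO-8859-1, the decoded text is garbled.
wrong = body.decode("iso-8859-1")

# Decoding with the real encoding recovers the original text.
right = body.decode("utf-8")

print(repr(wrong))  # 'cafÃ©'
print(repr(right))  # 'café'
```

The bytes are identical in both cases; only the decoding step differs. That is why passing the bytes to the parser, or correcting r.encoding before reading r.text, avoids the garbled result.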

Regarding "python - Should I use .text or .content when parsing a Requests response?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40163323/
