
Python: Why is "~" now included in the set of reserved characters in urllib.parse.quote()?

Reposted. Author: 太空狗. Updated: 2023-10-29 18:09:16

The recent documentation for urllib states:

Changed in version 3.7: Moved from RFC 2396 to RFC 3986 for quoting URL strings. “~” is now included in the set of reserved characters.

Why is that? In RFC 3986, ~ is not a reserved character:

 reserved    = gen-delims / sub-delims

 gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

 sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
             / "*" / "+" / "," / ";" / "="

The next section explicitly lists it as an unreserved character:

2.3. Unreserved Characters

Characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde.

 unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
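
The grammar quoted above can be transcribed into Python sets to check which side the tilde falls on. This is only an illustrative sketch: the names GEN_DELIMS, SUB_DELIMS, RESERVED, and UNRESERVED are made up for this example and are not part of urllib.

```python
import string

# Character sets transcribed from the RFC 3986 grammar quoted above.
# These names are illustrative only; they do not exist in urllib.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

print("~" in RESERVED)    # False
print("~" in UNRESERVED)  # True
```

Under this transcription the two sets are disjoint, and the tilde belongs only to the unreserved one.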

Furthermore, later on, the RFC states (emphasis mine):

For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations;

So 3.7 looks inconsistent: it claims to support the newer RFC while regressing the handling of ~. (In fact, in the older RFC, ~ was neither reserved nor "unwise".)

Best answer

This bug was tracked in https://bugs.python.org/issue16285 and has since been closed.

In fact, the latest version of the code reflects these changes.

Quoting https://github.com/python/cpython/blob/master/Lib/urllib/parse.py:

_ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                         b'abcdefghijklmnopqrstuvwxyz'
                         b'0123456789'
                         b'_.-~')
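
A quick way to confirm the behavior on a given interpreter is to call quote() directly. The sketch below assumes Python 3.7 or later, where "~" is part of _ALWAYS_SAFE as quoted above; the sample path is hypothetical.

```python
from urllib.parse import quote, unquote

path = "/~user/my file.txt"  # hypothetical sample path

# With the default safe="/", the slash and the tilde are left alone,
# while the space is percent-encoded.
encoded = quote(path)
print(encoded)  # on Python 3.7+: /~user/my%20file.txt

# The older "%7E" spelling still decodes to the same character.
decoded = unquote("/%7Euser/my%20file.txt")
print(decoded == path)  # True
```

So in practice quote() treats the tilde as unreserved, which matches RFC 3986; only the wording in the documentation suggested otherwise.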

For more on "Python: Why is "~" now included in the set of reserved characters in urllib.parse.quote()?", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/51334226/
