
url - What character set should I assume for the encoded characters in a URL?

Reposted. Author: 行者123. Updated: 2023-12-03 10:29:23

RFC 1738 specifies the syntax for URLs, and mentions that

URLs are written only with the graphic printable characters of the
US-ASCII coded character set. The octets 80-FF hexadecimal are not
used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
control characters; these must be encoded.



However, it does not say which character set these encoded octets represent.

RFC 2396 seems to try to improve on the situation, but:

For original character sequences that contain non-ASCII characters, however, the situation is more difficult. Internet protocols that transmit octet sequences intended to represent character sequences are expected to provide some way of identifying the charset used, if there might be more than one [RFC2277]. However, there is currently no provision within the generic URI syntax to accomplish this identification. An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.

It is expected that a systematic treatment of character encoding within URI will be developed as a future modification of this specification.



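Is there any unambiguous way for a client to determine which character set the encoded octets should be interpreted in, or for a server to determine which one the client used to encode them?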

It seems to me that most servers default to UTF-8, but that looks like a de facto choice rather than a specified one.
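To make the ambiguity concrete, here is a minimal sketch using Python's standard `urllib.parse`. The sample string `café` and the choice of Latin-1 as the competing charset are assumptions made purely for illustration; nothing in the URL itself says which one was intended.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# The same percent-encoded octets, interpreted under two charsets.
encoded = "caf%C3%A9"
raw = unquote_to_bytes(encoded)       # b'caf\xc3\xa9': just octets, no charset attached

as_utf8 = raw.decode("utf-8")         # 'café'
as_latin1 = raw.decode("latin-1")     # 'cafÃ©' (two characters instead of one)

# A client that encodes with Latin-1 produces different octets for the same text.
latin1_encoded = quote("café", encoding="latin-1")   # 'caf%E9'

# unquote() assumes UTF-8 by default and replaces undecodable octets with U+FFFD.
lenient = unquote(latin1_encoded)     # 'caf\ufffd'

print(as_utf8, as_latin1, latin1_encoded, repr(lenient))
```

The receiver only sees the octets, so without an out-of-band convention it has to guess the charset.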

Best answer

Per your quote, URLs are ASCII. Period.

URIs, OTOH, allow a larger character set; usually UTF-8, as you said yourself.

One thing to keep in mind is that URLs are a subset of URIs. So the real question is: which of the two is it that you type into your browser?

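I guess you can type a URI, and the browser should do its best to convert it into a URL (which is what HTTP/1.1 supports, AFAICR). For non-ASCII characters, that means hex codes, usually encoding UTF-8.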
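The conversion described above can be sketched roughly as follows in Python. The helper name `to_ascii_url` is hypothetical, UTF-8 is assumed as the charset (the de facto choice, not one mandated by RFC 1738 or RFC 2396), and the host is assumed to be ASCII already, so internationalized domain names are out of scope here.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def to_ascii_url(url: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII characters as UTF-8 octets."""
    parts = urlsplit(url)
    # '%' is kept in the safe set so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%", encoding="utf-8")
    query = quote(parts.query, safe="=&%", encoding="utf-8")
    fragment = quote(parts.fragment, safe="%", encoding="utf-8")
    # Assumption: parts.netloc is already plain ASCII.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))

result = to_ascii_url("http://example.com/café?q=naïve")
print(result)  # http://example.com/caf%C3%A9?q=na%C3%AFve
```

Real browsers apply more detailed rules per URL component, so treat this only as an illustration of the "hex codes, usually encoding UTF-8" idea.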

Regarding "url - What character set should I assume for the encoded characters in a URL?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/140549/
