
.net - Does HttpUtility.UrlEncode conform to the 'x-www-form-urlencoded' specification?

Reposted. Author: 行者123. Updated: 2023-12-04 04:19:30

Per MSDN

URLEncode converts characters as follows:

  • Spaces ( ) are converted to plus signs (+).
  • Non-alphanumeric characters are escaped to their hexadecimal representation.


The W3C definition is similar, but not exactly the same:

application/x-www-form-urlencoded

This is the default content type. Forms submitted with this content type must be encoded as follows:

  1. Control names and values are escaped. Space characters are replaced by '+', and then reserved characters are escaped as described in RFC1738, section 2.2: Non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., '%0D%0A').

  2. The control names/values are listed in the order they appear in the document. The name is separated from the value by '=' and name/value pairs are separated from each other by '&'.



My question is: has anyone done the work to determine whether URLEncode produces valid x-www-form-urlencoded data?

Best answer

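Well, the documentation you linked to is for the IIS 6 Server.UrlEncode, but your title seems to ask about the .NET System.Web.HttpUtility.UrlEncode. Using a tool such as Reflector, we can look at the implementation of the latter and determine whether it meets the W3C spec.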

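Here is the encoding routine that is ultimately called. Note that it is defined for a byte array; the other overloads that take strings eventually convert those strings to byte arrays and call this method. You would call it once for each control name and each value, so that the reserved characters = and &, which are used as separators, are not escaped.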

protected internal virtual byte[] UrlEncode(byte[] bytes, int offset, int count)
{
    if (!ValidateUrlEncodingParameters(bytes, offset, count))
    {
        return null;
    }
    // First pass: count spaces (num) and bytes that need %HH escaping (num2).
    int num = 0;
    int num2 = 0;
    for (int i = 0; i < count; i++)
    {
        char ch = (char) bytes[offset + i];
        if (ch == ' ')
        {
            num++;
        }
        else if (!HttpEncoderUtility.IsUrlSafeChar(ch))
        {
            num2++;
        }
    }
    // Nothing to replace: return the input array unchanged.
    if ((num == 0) && (num2 == 0))
    {
        return bytes;
    }
    // Each escaped byte grows from one byte to three ('%' plus two hex digits).
    byte[] buffer = new byte[count + (num2 * 2)];
    int num4 = 0;
    // Second pass: copy safe bytes, map ' ' to '+', escape everything else.
    for (int j = 0; j < count; j++)
    {
        byte num6 = bytes[offset + j];
        char ch2 = (char) num6;
        if (HttpEncoderUtility.IsUrlSafeChar(ch2))
        {
            buffer[num4++] = num6;
        }
        else if (ch2 == ' ')
        {
            buffer[num4++] = 0x2b; // '+'
        }
        else
        {
            buffer[num4++] = 0x25; // '%'
            buffer[num4++] = (byte) HttpEncoderUtility.IntToHex((num6 >> 4) & 15);
            buffer[num4++] = (byte) HttpEncoderUtility.IntToHex(num6 & 15);
        }
    }
    return buffer;
}

public static bool IsUrlSafeChar(char ch)
{
    if ((((ch >= 'a') && (ch <= 'z')) || ((ch >= 'A') && (ch <= 'Z'))) || ((ch >= '0') && (ch <= '9')))
    {
        return true;
    }
    switch (ch)
    {
        case '(':
        case ')':
        case '*':
        case '-':
        case '.':
        case '_':
        case '!':
            return true;
    }
    return false;
}

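The first part of the routine counts the characters that need to be replaced (spaces and non-URL-safe characters). The second part allocates a new buffer and performs the replacements:
  • URL-safe characters are kept as is: a-z A-Z 0-9 ()*-._!
  • Spaces are converted to plus signs
  • All other characters are converted to %HH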
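
As a rough illustration of how these rules apply when building a request body, the sketch below encodes each control name and each value separately and only then joins them with '=' and '&'. FormBodySketch and Build are hypothetical names introduced here, not part of System.Web; the only real API used is HttpUtility.UrlEncode(string), which encodes the string as UTF-8 before applying the byte-level routine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;

// Hypothetical helper, for illustration only.
public static class FormBodySketch
{
    // Encodes every name and value on its own, then joins the pairs.
    // The separators '=' and '&' are added after encoding, so they are
    // never escaped, while any '=' or '&' inside a name or value is.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            HttpUtility.UrlEncode(f.Key) + "=" + HttpUtility.UrlEncode(f.Value)));
    }
}
```

For a field named "full name" with the value "A&B=C", the space in the name becomes '+', and the '&' and '=' inside the value are percent-encoded (%26 and %3d; hex digits are case-insensitive), so the body stays unambiguous to parse.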

RFC 1738 states (emphasis mine):

    Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
    reserved characters used for their reserved purposes may be used
    unencoded within a URL.

    On the other hand, characters that are not required to be encoded
    (including alphanumerics) may be encoded within the scheme-specific
    part of a URL, as long as they are not being used for a reserved
    purpose.


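The set of URL-safe characters allowed by UrlEncode is a subset of the special characters defined in RFC 1738. Namely, the characters $ and , are missing from it and will be encoded by UrlEncode even though the spec says they are safe. (Going by IsUrlSafeChar above, ' and + are not in the set either; + has to be encoded in any case, because an unescaped + stands for a space.) Since these characters may be used unencoded, rather than must, encoding them still meets the spec, and the second quoted paragraph states this explicitly.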
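
The subset claim can be checked mechanically. The following is a minimal sketch that compares the punctuation accepted by IsUrlSafeChar with the special characters quoted from RFC 1738; the class and member names are hypothetical, and the two strings are simply copied from the listing and the quotation above.

```csharp
using System.Linq;

// Hypothetical helper, for illustration only.
public static class SafeSetComparison
{
    // Special characters quoted from RFC 1738.
    public const string Rfc1738Specials = "$-_.+!*'(),";

    // Punctuation that IsUrlSafeChar returns true for.
    public const string UrlEncodeSafePunctuation = "()*-._!";

    // True when every punctuation character UrlEncode leaves alone
    // is also one of the RFC 1738 special characters.
    public static bool SafeSetIsSubset()
    {
        return UrlEncodeSafePunctuation.All(c => Rfc1738Specials.IndexOf(c) >= 0);
    }

    // RFC 1738 special characters that UrlEncode escapes anyway.
    public static string EncodedAnyway()
    {
        return new string(Rfc1738Specials
            .Where(c => UrlEncodeSafePunctuation.IndexOf(c) < 0)
            .ToArray());
    }
}
```

SafeSetIsSubset() returns true, and EncodedAnyway() returns the four characters $ + ' and the comma, which are the ones UrlEncode escapes even though RFC 1738 would allow them unencoded.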

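With respect to line breaks: if the input contains a CR LF sequence, it is escaped as %0D%0A. However, if the input contains only LF, it is escaped as %0A, so this routine does not normalize line breaks.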
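
Because the routine does not normalize line breaks, a caller that wants the "CR LF" form required by the W3C text has to normalize before encoding. One possible way to do that is sketched below; LineBreakSketch, NormalizeToCrLf and EncodeValue are hypothetical names, and the regular expression is just one way to rewrite lone CR or LF as CR LF.

```csharp
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper, for illustration only.
public static class LineBreakSketch
{
    // Rewrites CR LF, lone CR and lone LF as CR LF. The alternation tries
    // "\r\n" first, so an existing CR LF pair is matched as a whole and
    // is not doubled.
    public static string NormalizeToCrLf(string input)
    {
        return Regex.Replace(input, "\r\n|\r|\n", "\r\n");
    }

    // Normalizes line breaks, then applies HttpUtility.UrlEncode.
    public static string EncodeValue(string value)
    {
        return HttpUtility.UrlEncode(NormalizeToCrLf(value));
    }
}
```

With this in place, every line break in a value, whatever its original form, is emitted as the CR LF escape (hex digits in percent escapes are case-insensitive, so %0d%0a and %0D%0A are equivalent).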

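Bottom line: it meets the specification while additionally encoding $ and , (as well as ' and +), and the caller is responsible for supplying suitably normalized line breaks in the input.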

Original question on Stack Overflow: https://stackoverflow.com/questions/3208555/
