gpt4 book ai didi

sql - oracle SQL字符串中ASCII 255的含义

转载 作者:行者123 更新时间:2023-12-02 03:28:46 30 4
gpt4 key购买 nike

SELECT LENGTH('*'||CHR(255)||CHR(255)||'$')
FROM DUAL;

此查询给出的输出是 2 而不是 4 ?

但是

SELECT LENGTH(CHR(255)||CHR(255))
FROM DUAL;

此查询的输出为 null。这意味着 CHR(255)||CHR(255) 表示的字符串是长度为 0 的空字符串。

CHR(255) 如何影响长度?

最佳答案

连接字符串中的 chr(255) 被视为 null,它没有长度,因此只计算其他非空字符 - 因此它得到 2 而不是 4。

ASCII 并没有真正达到 255,而且您也没有真正处理 ASCII。您的数据库字符集(大概)是 AL32UTF8,这是一个多字节字符集。来自 FileFormat.Info's summary :

For any character equal to or below 127 (hex 0x7F), the UTF-8 representation is one byte. It is just the lowest 7 bits of the full unicode value. This is also the same as the ASCII value.

For characters equal to or below 2047 (hex 0x07FF), the UTF-8 representation is spread across two bytes. The first byte will have the two high bits set and the third bit clear (i.e. 0xC2 to 0xDF). The second byte will have the top bit set and the second bit clear (i.e. 0x80 to 0xBF).

来自文档 for chr() :

For multibyte character sets, n must resolve to one entire code point. Invalid code points are not validated, and the result of specifying invalid code points is indeterminate.

对于 UTF8,没有完整的代码点 255/FF,因此 chr(255) 无效。事实上,according to the spec ,没有带有 FF 八位字节的代码点。

可能期望它被渲染为“ÿ”;如果您使用有效的编码,如 AL16UTF16:

select chr(255 using nchar_cs), dump(chr(255 using nchar_cs), 1016) as chr_dump,
unistr('\00ff'), dump(unistr('\00ff'), 1016) as unistr_dump
from dual;

C CHR_DUMP U UNISTR_DUMP
- ---------------------------------------- - --------------------------------------------------
ÿ Typ=1 Len=2 CharacterSet=AL16UTF16: 0,ff ÿ Typ=1 Len=2 CharacterSet=AL16UTF16: 0,ff

但由于 UTF8 的编码方式(以及 127 以上的所有内容)实际上是多个字节,C3BF

更有趣的是 Oracle 如何处理该无效字符。就其本身而言,您可以看到它存在并且无效,但是当它与另一个(有效或无效)字符连接时,它基本上被忽略:

with t (descr, str) as (
select 'chr(255)', chr(255) from dual
union all select 'chr(255)||chr(255)', chr(255)||chr(255) from dual
union all select q'['*'||chr(255)]', '*'||chr(255) from dual
union all select q'[chr(255)||'$']', chr(255)||'$' from dual
union all select q'['*'||chr(255)||'$']', '*'||chr(255)||'$' from dual
union all select q'['*'||chr(255)||'$'||chr(255)]', '*'||chr(255)||'$'||chr(255) from dual
union all select q'[chr(255)||'*'||chr(255)||'$']', chr(255)||'*'||chr(255)||'$' from dual
union all select q'['*'||chr(255)||chr(255)||'$']', '*'||chr(255)||chr(255)||'$' from dual
union all select q'['ÿ']', 'ÿ' from dual
union all select 'chr(127)||chr(127)', chr(127)||chr(127) from dual
union all select 'chr(127)||chr(128)', chr(127)||chr(128) from dual
union all select 'chr(128)||chr(127)', chr(128)||chr(127) from dual
union all select 'chr(128)||chr(128)', chr(128)||chr(128) from dual
)
select descr, str, dump(str, 1016) as str_dump, length(str) as str_length
from t;

DESCR ST STR_DUMP STR_LENGTH
---------------------------- -- -------------------------------------------------- ----------
chr(255) ? Typ=1 Len=1 CharacterSet=AL32UTF8: ff 1
chr(255)||chr(255) NULL
'*'||chr(255) * Typ=1 Len=1 CharacterSet=AL32UTF8: 2a 1
chr(255)||'$' $ Typ=1 Len=1 CharacterSet=AL32UTF8: 24 1
'*'||chr(255)||'$' *$ Typ=1 Len=2 CharacterSet=AL32UTF8: 2a,24 2
'*'||chr(255)||'$'||chr(255) *$ Typ=1 Len=2 CharacterSet=AL32UTF8: 2a,24 2
chr(255)||'*'||chr(255)||'$' *$ Typ=1 Len=2 CharacterSet=AL32UTF8: 2a,24 2
'*'||chr(255)||chr(255)||'$' *$ Typ=1 Len=2 CharacterSet=AL32UTF8: 2a,24 2
'ÿ' ÿ Typ=1 Len=2 CharacterSet=AL32UTF8: c3,bf 1
chr(127)||chr(127) Typ=1 Len=2 CharacterSet=AL32UTF8: 7f,7f 2
chr(127)||chr(128) Typ=1 Len=1 CharacterSet=AL32UTF8: 7f 1
chr(128)||chr(127) Typ=1 Len=1 CharacterSet=AL32UTF8: 7f 1
chr(128)||chr(128) NULL

最后几个示例表明,这并非特定于 255,任何高于 127 的值都会出现问题,因为 UTF8 从 127/7F(仍然是一个字节)跳转到 128/C280(两个字节)。 (例如,您可以看到跳转 here 。)

这里有一个快速演示,连接使用 128-255 形成的任何无效字符将被视为 null,无论它与什么连接:

with t (n) as (
select level from dual connect by level <= 255
)
select count(*), min(t1.n), max(t1.n), min(t1.n), max(t2.n)
from t t1
cross join t t2
where chr(t1.n)||chr(t2.n) is null
order by t1.n, t2.n;

COUNT(*) MIN(T1.N) MAX(T1.N) MIN(T1.N) MAX(T2.N)
---------- ---------- ---------- ---------- ----------
16384 128 255 128 255

关于sql - oracle SQL字符串中ASCII 255的含义,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52439654/

30 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com