c++ - Why are unicode characters treated the same in C++ std::string?

Reposted. Author: 行者123. Updated: 2023-11-30 01:49:52

Here is an Ideone: http://ideone.com/vjByty

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "\u0001\u0001";
    cout << s.length() << endl;
    if (s[0] == s[1]) {
        cout << "equal\n";
    }
    return 0;
}

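I'm confused on a number of levels.

When I type an escaped Unicode string literal in a C++ program, what does it mean?

Shouldn't the 2 characters take up 4 bytes (assuming UTF-16)?

Why are the first two characters of s (the first two bytes) equal?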

Best answer

The draft C++11 standard says the following about universal characters in narrow string literals:

Escape sequences and universal-character-names in non-raw string literals have the same meaning as in character literals (2.14.3), except that the single quote [...] In a narrow string literal, a universal-character-name may map to more than one char element due to multibyte encoding.

and includes the following note:

The size of a narrow string literal is the total number of escape sequences and other characters, plus at least one for the multibyte encoding of each universal-character-name, plus one for the terminating '\0'.

Section 2.14.3, referenced above, says:

A universal-character-name is translated to the encoding, in the appropriate execution character set, of the character named. If there is no such encoding, the universal-character-name is translated to an implementation defined encoding.

If I try this example (see it live):

string s = "\u0F01\u0001";

the first universal character does indeed map to more than one char.

Regarding "c++ - Why are unicode characters treated the same in C++ std::string?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28351017/
