c++ - Can I safely use std::string for binary data in C++11? - 6ren

Reposted by: IT老高. Updated: 2023-10-28 21:57:25

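Several posts on the internet suggest that you should use std::vector<unsigned char> or something similar for binary data.

But I'd much rather use a std::basic_string variant for that, since it provides many convenient string manipulation functions. And AFAIK, since C++11 the standard guarantees what every known C++03 implementation already did: std::basic_string stores its contents contiguously in memory.

At first glance, then, std::basic_string<unsigned char> might be a good choice.

I don't want to use std::basic_string<unsigned char>, however, because almost all operating system functions only accept char*, making an explicit cast necessary. Also, string literals decay to const char*, so I would need an explicit cast to const unsigned char* every time I assigned a string literal to my binary string, which I'd also like to avoid. Likewise, functions for reading from and writing to files or network buffers accept char* and const char* pointers.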
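To illustrate the casting concern, here is a minimal sketch. The overloaded helper `write_bytes` is a made-up name for this example, and `std::ostream::write` stands in for any API that takes a `const char*` plus a length.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: a std::string buffer can be passed to a
// char*-based API as-is.
inline void write_bytes(std::ostream& out, const std::string& data)
{
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
}

// The same helper for std::vector<unsigned char> needs an explicit
// pointer cast before the char*-based API will accept the buffer.
inline void write_bytes(std::ostream& out, const std::vector<unsigned char>& data)
{
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
}
```

Both overloads write the same bytes; the only difference is where the cast has to be spelled out. A std::ostringstream is a convenient target for trying this without touching a file.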

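This leaves std::string, which is a typedef for std::basic_string<char>.

The only potential remaining issue (that I can see) with using std::string for binary data is that std::string uses char (which can be signed).

char, signed char, and unsigned char are three different types, and char can be either unsigned or signed.

So, when an actual byte value of 11111111b is returned from std::string::operator[] as a char and you want to check its value, that value can be either 255 (if char is unsigned) or something negative (if char is signed, depending on your number representation).

Similarly, if you want to explicitly append the actual byte value 11111111b to a std::string, simply appending (char) (255) might be implementation-defined (and might even raise a signal) if char is signed and the int to char conversion results in an overflow.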
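To make the signedness concern concrete, here is a minimal sketch. The helper names `byte_at` and `append_byte` are invented for this example and are not part of any library. Reading converts the stored char to unsigned char, which is well-defined (the value is reduced modulo 2^CHAR_BIT). Appending casts an unsigned char to char; on implementations where char is signed, C++11 formally leaves the result of that cast implementation-defined for values above CHAR_MAX, although mainstream two's complement implementations preserve the bit pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: read the byte at index i as a value in
// [0, UCHAR_MAX], whether or not plain char is signed.
inline unsigned char byte_at(const std::string& s, std::size_t i)
{
    assert(i < s.size());
    // Conversion to an unsigned type is always well-defined.
    return static_cast<unsigned char>(s[i]);
}

// Hypothetical helper: append one byte given as an unsigned value.
inline void append_byte(std::string& s, unsigned char b)
{
    // Assumption: the implementation keeps the bit pattern when
    // converting unsigned char to a possibly signed char.
    s.push_back(static_cast<char>(b));
}
```

By contrast, a direct comparison such as `s[0] == 255` can never be true where char is signed and 8 bits wide, because `s[0]` is promoted to an int that cannot exceed 127.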

So, is there a safe way around this that makes std::string binary-safe again?

§3.10/15 states:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

[...]

a type that is the signed or unsigned type corresponding to the dynamic type of the object,

[...]

a char or unsigned char type.

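If I understand it correctly, this seems to allow using an unsigned char* pointer to access and manipulate the contents of a std::string, and makes doing so well-defined. It just reinterprets the bit pattern as an unsigned char, without any change or loss of information, the latter because all bits in a char, signed char, and unsigned char must be used for the value representation.

I could then use this unsigned char* interpretation of the contents of the std::string as a means to access and change byte values in the [0, 255] range, in a well-defined and portable manner, regardless of the signedness of char itself.

This should solve any problems arising from a potentially signed char.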
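The idea can be sketched as follows. The names `bytes_of` and `xor_bytes` are hypothetical, and the sketch relies on the C++11 guarantee that std::string storage is contiguous, together with the aliasing rule quoted above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: view the string's storage as unsigned bytes.
// Returns a null pointer for an empty string.
inline unsigned char* bytes_of(std::string& s)
{
    return s.empty() ? nullptr
                     : reinterpret_cast<unsigned char*>(&s[0]);
}

// Example manipulation: XOR every byte with a key, working only
// with unsigned values in the range [0, UCHAR_MAX].
inline void xor_bytes(std::string& s, unsigned char key)
{
    unsigned char* p = bytes_of(s);
    for (std::size_t i = 0; i < s.size(); ++i)
        p[i] = static_cast<unsigned char>(p[i] ^ key);
}
```

Because every read and write goes through unsigned char, none of the arithmetic depends on whether plain char is signed.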

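Are my assumptions and conclusions right?

Also, is the unsigned char* interpretation of the same bit pattern (e.g. 11111111b or 10101010b) guaranteed to be the same on all implementations? Put differently, does the standard guarantee that, "looking through the eyes of an unsigned char", the same bit pattern always leads to the same numerical value (assuming the number of bits in a byte is the same)?

Can I thus safely (that is, without any undefined or implementation-defined behavior) use std::string for storing and manipulating binary data in C++11?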

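Best answer

The conversion static_cast<char>(uc), where uc is of type unsigned char, is always valid: according to 3.9.1 [basic.fundamental], char, signed char, and unsigned char have the same object representation, and char takes on the same values as one of the other two types: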

Objects declared as characters (char) shall be large enough to store any member of the implementation’s basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types, collectively called narrow character types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.11); that is, they have the same object representation. For narrow character types, all bits of the object representation participate in the value representation. For unsigned narrow character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

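Converting values outside the range of unsigned char to char will, of course, be problematic: when char is signed, the result of an out-of-range conversion is implementation-defined. That is, as long as you don't try to store funny values into the std::string, you'll be OK. With respect to bit patterns, you can rely on the nth bit contributing 2^n to the value. There shouldn't be a problem storing binary data in a std::string, provided it is processed carefully.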
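As a rough check of these claims, the sketch below stores every unsigned char value in a std::string and reads it back. The function names are made up for illustration. The round trip assumes, as the answer argues, that static_cast<char> preserves the byte; this holds on mainstream two's complement implementations.

```cpp
#include <climits>
#include <string>

// Build a string that holds each byte value 0..UCHAR_MAX exactly once.
inline std::string all_byte_values()
{
    const unsigned max_value = UCHAR_MAX;
    std::string s;
    for (unsigned v = 0; v <= max_value; ++v)
        s.push_back(static_cast<char>(static_cast<unsigned char>(v)));
    return s;
}

// True if the byte stored at position i reads back as the value i.
inline bool reads_back_unchanged(const std::string& s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const unsigned value = static_cast<unsigned char>(s[i]);
        if (value != i)
            return false;
    }
    return true;
}

// The byte with only bit n set has the value 2^n (for n < CHAR_BIT).
inline unsigned char single_bit(unsigned n)
{
    return static_cast<unsigned char>(1u << n);
}
```

Note that the string produced by `all_byte_values` contains an embedded zero byte at index 0, so its length has to come from `size()` rather than from any null-terminated interpretation.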

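That said, I don't buy into your premise: processing binary data mostly requires dealing with bytes, which are best manipulated using unsigned values. The few cases where you need to convert between char* and unsigned char* produce convenient compile-time errors when the conversion isn't made explicit, whereas accidentally messing up the use of char is silent! That is, dealing with unsigned char will prevent errors. I also don't buy into the premise that you get all those nice string functions: for one, you are generally better off using the algorithms anyway, and besides, binary data is not string data. In summary: the recommendation for std::vector<unsigned char> doesn't come out of thin air! It deliberately avoids building hard-to-find traps into the design!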
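As a small illustration of using the standard algorithms instead of string member functions, here is a sketch over std::vector<unsigned char>. The alias `bytes` and the helper names are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> bytes;  // hypothetical alias

// Count how often a byte value occurs (a std::count wrapper).
inline std::size_t count_byte(const bytes& data, unsigned char value)
{
    return static_cast<std::size_t>(
        std::count(data.begin(), data.end(), value));
}

// Find the first occurrence of a byte sequence, much like
// std::string::find. Returns data.size() if the sequence is absent.
inline std::size_t find_bytes(const bytes& data, const bytes& needle)
{
    bytes::const_iterator it = std::search(
        data.begin(), data.end(), needle.begin(), needle.end());
    return static_cast<std::size_t>(it - data.begin());
}
```

All values here are unsigned from the start, so comparisons against constants such as 0xAD behave the same on every implementation.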

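The only mildly reasonable argument in favor of using char might be the one about string literals, but even that doesn't hold up given the user-defined string literals introduced in C++11: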

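#include <cstddef>

// User-defined literal: reinterprets the characters of a string
// literal as unsigned bytes. The length argument is unused here.
unsigned char const* operator""_u(char const* s, std::size_t)
{
    return reinterpret_cast<unsigned char const*>(s);
}

unsigned char const* hello = "hello"_u;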
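The literal above discards the length, which matters for binary data containing zero bytes. A possible variant, sketched here with the made-up suffix `_bytes`, keeps the length by returning a std::vector<unsigned char>. This is an illustration, not something proposed in the answer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical literal suffix: copies all n characters of the literal,
// including embedded zero bytes, into a vector of unsigned bytes.
inline std::vector<unsigned char> operator""_bytes(const char* s, std::size_t n)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    return std::vector<unsigned char>(p, p + n);
}
```

With this, `"a\0b"_bytes` yields three bytes, whereas treating the same literal as a null-terminated string would stop after the first one.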

This article, "c++ - Can I safely use std::string for binary data in C++11?", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/19757653/
