
c++ - Why does non-intrusive serialization add a 5-byte zero prefix?

Reposted. Author: 搜寻专家. Updated: 2023-10-31 00:27:49

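I'm investigating a port from non-standard strings to standard strings in an application that uses boost::archive. The non-standard strings have their (de)serialization defined in the non-intrusive style, as shown in the example below. Serialization and deserialization work as expected, but when the ported application receives an old message it crashes with a bad allocation. This is caused by 5 bytes (all zero) being inserted before the size of the string.

What causes the insertion of these 5 extra bytes? Is this some kind of magic marker?

Example: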

#include <iostream>
#include <string>
#include <sstream>
#include <boost/serialization/split_free.hpp>
#include <boost/archive/binary_oarchive.hpp>

struct own_string { // simplified custom string class
    std::string content;
};

namespace boost
{
namespace serialization
{
template<class Archive>
inline void save(
    Archive & ar,
    const own_string & t,
    const unsigned int /* file_version */)
{
    size_t size = t.content.size();
    ar << size;
    ar.save_binary(&t.content[0], size);
}

template<class Archive>
inline void load(
    Archive & ar,
    own_string & t,
    const unsigned int /* file_version */)
{
    size_t size;
    ar >> size;
    t.content.resize(size);
    ar.load_binary(&t.content[0], size);
}

// split non-intrusive serialization function member into separate
// non intrusive save/load member functions
template<class Archive>
inline void serialize(
    Archive & ar,
    own_string & t,
    const unsigned int file_version)
{
    boost::serialization::split_free(ar, t, file_version);
}

} // namespace serialization
} // namespace boost

std::string string_to_hex(const std::string& input)
{
    static const char* const lut = "0123456789ABCDEF";
    size_t len = input.length();

    std::string output;
    output.reserve(2 * len);
    for (size_t i = 0; i < len; ++i)
    {
        const unsigned char c = input[i];
        output.push_back(lut[c >> 4]);
        output.push_back(lut[c & 15]);
    }
    return output;
}

void test_normal_string()
{
    std::stringstream ss;
    boost::archive::binary_oarchive ar{ss};

    std::string test = "";

    std::cout << string_to_hex(ss.str()) << std::endl;
    ar << test;

    // adds 00 00 00 00 00 00 00 00
    std::cout << string_to_hex(ss.str()) << std::endl;
}

void test_own_string()
{
    std::stringstream ss;
    boost::archive::binary_oarchive ar{ss};

    std::string test = "";

    own_string otest{test};
    std::cout << string_to_hex(ss.str()) << std::endl;
    ar << otest;

    // adds 00 00 00 00 00 00 00 00 00 00 00 00 00
    std::cout << string_to_hex(ss.str()) << std::endl;
}

int main()
{
    test_normal_string();
    test_own_string();
}

Best answer

So, you want to deserialize a previously serialized own_string as a std::string.

From the Boost (1.65.1) documentation:

By default, for each class serialized, class information is written to the archive. This information includes version number, implementation level and tracking behavior. This is necessary so that the archive can be correctly deserialized even if a subsequent version of the program changes some of the current trait values for a class. The space overhead for this data is minimal. There is a little bit of runtime overhead since each class has to be checked to see if it has already had its class information included in the archive. In some cases, even this might be considered too much. This extra overhead can be eliminated by setting the implementation level class trait to: boost::serialization::object_serializable.

Now, this is probably (*) the default for standard classes. In fact, adding

BOOST_CLASS_IMPLEMENTATION(own_string, boost::serialization::object_serializable)

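at global scope makes both test_X_string functions produce the same bytes. This should explain the observed difference in extra bytes.

That said, I could not find any specific guarantee about the serialization traits of standard classes (others may know better than I do).

(*) Actually, the section about portability of traits settings mentions: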

Another way to avoid this problem is to assign serialization traits to all specializations of the template my_wrapper for all primitive types so that class information is never saved. This is what has been done for our implementation of serializations for STL collections

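So this may give you enough confidence that standard collections (and hence std::string) will produce the same bytes in this case.

Regarding "c++ - Why does non-intrusive serialization add a 5-byte zero prefix?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47608776/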
