
c++ - Error using Boost serialization with a binary archive

Reposted. Author: 行者123. Updated: 2023-12-01 14:25:30

I get the following error when boost::archive::binary_iarchive reads into my variable:

test-serialization(9285,0x11c62fdc0) malloc: can't allocate region
*** mach_vm_map(size=18014398509486080) failed (error code=3)
test-serialization(9285,0x11c62fdc0) malloc: *** set a breakpoint in malloc_error_break to debug

My serialization and deserialization code is:

template<class Archive>
void save(Archive & archive, const helib::PubKey & pubkey, const unsigned int version){
BOOST_TEST_MESSAGE("inside save_construct_data");
archive << &(pubkey.context);
archive << pubkey.skBounds;
archive << pubkey.keySwitching;
archive << pubkey.keySwitchMap;
archive << pubkey.KS_strategy;
archive << pubkey.recryptKeyID;
}

template<class Archive>
void load_construct_data(Archive & archive, helib::PubKey * pubkey, const unsigned int version){
helib::Context * context = new helib::Context(2,3,1); //random numbers since there is no default constructor
BOOST_TEST_MESSAGE("deserializing context");
archive >> context;
std::vector<double> skBounds;
std::vector<helib::KeySwitch> keySwitching;
std::vector<std::vector<long>> keySwitchMap;
NTL::Vec<long> KS_strategy;
long recryptKeyID;
BOOST_TEST_MESSAGE("deserializing skbounds");
archive >> skBounds;
BOOST_TEST_MESSAGE("deserializing keyswitching");
archive >> keySwitching;
BOOST_TEST_MESSAGE("deserializing keyswitchmap");
archive >> keySwitchMap;
BOOST_TEST_MESSAGE("deserializing KS_strategy");
archive >> KS_strategy;
BOOST_TEST_MESSAGE("deserializing recryptKeyID");
archive >> recryptKeyID;
BOOST_TEST_MESSAGE("new pubkey");
::new(pubkey)helib::PubKey(*context);
//TODO: complete
}

template<class Archive>
void serialize(Archive & archive, helib::PubKey & pubkey, const unsigned int version){
split_free(archive, pubkey, version);
}

template<class Archive>
void load(Archive & archive, helib::PubKey & pubkey, const unsigned int version){
}


The test that calls this code is as follows:

BOOST_AUTO_TEST_CASE(serialization_pubkey)
{
auto context = helibTestContext();
helib::SecKey secret_key(context);
secret_key.GenSecKey();
// Compute key-switching matrices that we need
helib::addSome1DMatrices(secret_key);
// Set the secret key (upcast: SecKey is a subclass of PubKey)
const helib::PubKey& original_pubkey = secret_key;

std::string filename = "pubkey.serialized";

std::ofstream os(filename, std::ios::binary);
{
boost::archive::binary_oarchive oarchive(os);
oarchive << original_pubkey;
}

helib::PubKey * restored_pubkey = new helib::PubKey(helib::Context(2,3,1));
{
std::ifstream ifs(filename, std::ios::binary);
boost::archive::binary_iarchive iarchive(ifs);
BOOST_TEST_CHECKPOINT("calling deserialization");
iarchive >> restored_pubkey;
BOOST_TEST_CHECKPOINT("done with deserialization");

//tests omitted
}
}

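Notes:
  • Serialization works with both boost::archive::text_oarchive and boost::archive::binary_oarchive. They create files of 46M and 21M respectively (huge, I know).
  • Deserialization with boost::archive::text_iarchive basically stops while executing archive >> keySwitching;. The process gets killed automatically. This is in fact the largest part of the archive.
  • I decided to try boost::archive::binary_iarchive since its file is half the size, but I get the error shown at the beginning. The error occurs on the very first read from the archive, archive >> context;.
  • The asymmetry between output and input (save and load_construct_data) exists because I could not find another way to avoid implementing serialization for the derived classes of helib::PubKey. Using a pointer to helib::PubKey gave me compilation errors asking for serialization of the derived classes. If there is another way, I'm all ears.

Thank you for your help.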

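Update:

I am implementing (de)serialization for some classes of the cryptographic library HElib because I need to send ciphertexts over the network. One of these classes is helib::PubKey. I am using the boost serialization library for the implementation. I created a gist to provide a reprex, as suggested in the comments. There are 3 files:
  • serialization.hpp, which contains the serialization implementation. Unfortunately, helib::PubKey depends on many other classes, which makes the file rather long. All the other classes have passing unit tests. In addition, I had to make a tiny modification to the classes to serialize them: I made the private members public.
  • test-serialization.cpp, which contains the unit tests.
  • Makefile. Running make creates the executable test-serialization.

Best answer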

    vector<bool> strikes again

    It is actually allocating 0x1fffffffff20000 bits (that is 144 petabits) on my test box. This comes straight from IndexSet::resize().
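
To get a feel for that number, the arithmetic can be checked at compile time. This is only a sketch: the helper names `bits_to_bytes` and `bits_to_petabits` are made up for illustration, and the byte count from the question's `mach_vm_map` message comes from a different machine, so the two figures are only expected to be close, not equal.

```cpp
#include <cassert>
#include <cstdint>

// Bit count quoted above for the runaway IndexSet::resize() call.
constexpr std::uint64_t kRequestedBits = 0x1fffffffff20000ULL;

// Size reported by mach_vm_map in the question (bytes, other machine).
constexpr std::uint64_t kReportedBytes = 18014398509486080ULL;

// Bytes needed to hold the given number of bits, rounded up.
constexpr std::uint64_t bits_to_bytes(std::uint64_t bits) {
    return (bits + 7) / 8;
}

// Whole petabits (10^15 bits), truncated.
constexpr std::uint64_t bits_to_petabits(std::uint64_t bits) {
    return bits / (1000ULL * 1000 * 1000 * 1000 * 1000);
}

static_assert(bits_to_petabits(kRequestedBits) == 144, "about 144 petabits");
static_assert(bits_to_bytes(kRequestedBits) == 18014398509367296ULL,
              "roughly 2^54 bytes");
// Both machines asked for about 2^54 bytes; the figures differ by under 1 MiB.
static_assert(kReportedBytes - bits_to_bytes(kRequestedBits) < (1ULL << 20),
              "same order of magnitude as the failed mach_vm_map request");
```

Both requests land near 2^54 bytes, far beyond any real address space, which is why the allocation fails immediately instead of merely being slow.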

    Now, I have serious doubts about HElib using std::vector<bool> here (it seems they would be far better served by something like boost::icl::interval_set<>).

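    Well. That was a wild goose chase (the IndexSet serialization can be improved a lot). However, the real problem is that you have Undefined Behaviour, because you do not deserialize the same type that you serialized.

    You serialize a PubKey, but then try to deserialize it as a PubKey*. Oops.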
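
Why a type mismatch can end in an absurd allocation is easier to see with a toy format. The sketch below is not Boost's actual archive layout; it assumes a made-up encoding (an 8-byte element count followed by raw doubles) and a hypothetical reader that expects 2 bytes of metadata in front of the value, roughly the way a pointer load expects bookkeeping that a value save never wrote.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Toy "value" format: an 8-byte element count followed by the raw doubles.
Bytes write_value(const std::vector<double>& v) {
    const std::uint64_t count = v.size();
    Bytes out(sizeof count + v.size() * sizeof(double));
    std::memcpy(out.data(), &count, sizeof count);
    if (!v.empty())
        std::memcpy(out.data() + sizeof count, v.data(),
                    v.size() * sizeof(double));
    return out;
}

// Reads the element count after skipping `header_bytes` of metadata that
// the reader believes precede the value. With header_bytes == 0 the reader
// mirrors the writer; any other value misaligns the read.
std::uint64_t read_count(const Bytes& in, std::size_t header_bytes) {
    std::uint64_t count = 0;
    assert(in.size() >= header_bytes + sizeof count);
    std::memcpy(&count, in.data() + header_bytes, sizeof count);
    return count;
}
```

Reading with `read_count(bytes, 0)` recovers the real count. Reading with `read_count(bytes, 2)` mixes count bytes with payload bytes and yields an unrelated number, which a loader would then happily pass to `resize()`.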

    Beyond that, there are quite a few other problems:

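  • You had to modify the library to make private members public. That can easily violate the ODR (by making the class layouts incompatible).
  • You seem to treat the context as a "dynamic" resource, which engages Object Tracking. That could be a viable approach. BUT: you will have to think about ownership.

    It seems you have not done that yet. For example, this line in load_construct_data for DoubleCRT is a definite memory leak: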
    helib::Context * context = new helib::Context(2,3,1);

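    You never use it and never free it. In fact, you simply overwrite it with the deserialized instance, which may or may not be owned. Catch-22.

    Exactly the same goes for load_construct_data for PubKey.
  • Worse, in save_construct_data you completely gratuitously copy the context object for each DoubleCRT in each SecKey: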
     auto context = polynomial->getContext();
    archive << &context;

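    Because you disguise it as pointer serialization, (obviously useless) object tracking kicks in again, which just means that you serialize redundant Context copies, all of which will be leaked on deserialization.
  • I would be tempted to assume that the context instances in both are always the same? Why not serialize the context(s) separately anyway?
  • In fact, I went and analyzed the HElib source code to check these assumptions. It turns out I was right. Nothing ever constructs a Context outside of: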
    std::unique_ptr<Context> buildContextFromBinary(std::istream& str);
    std::unique_ptr<Context> buildContextFromAscii(std::istream& str);

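    As you can see, they return owning pointers. You should have been using them. Perhaps even together with the built-in serialization, which I practically stumbled over here.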
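
The ownership point can be illustrated without HElib. In the sketch below, `ToyContext` is a hypothetical stand-in for helib::Context that counts live instances, `load_leaky` mimics the placeholder-then-overwrite pattern from the question, and `load_owned` mimics a factory that returns a std::unique_ptr, in the spirit of the two functions listed above. None of these names exist in HElib.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>

// Hypothetical stand-in for helib::Context; counts live instances so that
// leaks become observable.
struct ToyContext {
    static int live;
    long m;
    explicit ToyContext(long m_) : m(m_) { ++live; }
    ToyContext(const ToyContext&) = delete;
    ToyContext& operator=(const ToyContext&) = delete;
    ~ToyContext() { --live; }
};
int ToyContext::live = 0;

// Leaky pattern: a placeholder is allocated, then the pointer is overwritten
// with the freshly loaded object. The placeholder can never be freed.
ToyContext* load_leaky(std::istream& in) {
    ToyContext* ctx = new ToyContext(2);  // placeholder, like Context(2,3,1)
    long m = 0;
    in >> m;
    ctx = new ToyContext(m);              // placeholder is now unreachable
    return ctx;                           // caller must remember to delete
}

// Owning pattern: exactly one object is built and ownership is returned.
std::unique_ptr<ToyContext> load_owned(std::istream& in) {
    long m = 0;
    in >> m;
    return std::make_unique<ToyContext>(m);
}
```

After `load_owned`, the instance count drops back to zero as soon as the returned pointer goes out of scope. After `load_leaky`, one instance stays alive even when the caller deletes the pointer it was given.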

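    Time to Regroup

    I would use the serialization code from HElib (because why reinvent the wheel and introduce tons of bugs doing so?). If you insist on integrating with Boost Serialization, you can have your cake and eat it: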
    template <class Archive> void save(Archive& archive, const helib::PubKey& pubkey, unsigned) {
    using V = std::vector<char>;
    using D = iostreams::back_insert_device<V>;
    V data;
    {
    D dev(data);
    iostreams::stream_buffer<D> sbuf(dev);
    std::ostream os(&sbuf); // expose as std::ostream
    helib::writePubKeyBinary(os, pubkey);
    }
    archive << data;
    }

    template <class Archive> void load(Archive& archive, helib::PubKey& pubkey, unsigned) {
    std::vector<char> data;
    archive >> data;
    using S = iostreams::array_source;
    S source(data.data(), data.size());
    iostreams::stream_buffer<S> sbuf(source);
    {
    std::istream is(&sbuf); // expose as std::istream
    helib::readPubKeyBinary(is, pubkey);
    }
    }

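    That's all. 24 lines of code, and they will be tested and maintained by the library authors. You can't beat that (clearly). I have modified the tests a bit so that we no longer abuse private details.

    Cleaning Up the Code

    By separating out a helper to deal with the blob writing, we can implement the different helib types in a very similar way: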
    namespace helib { // leverage ADL
    template <class A> void save(A& ar, const Context& o, unsigned) {
    Blob data = to_blob(o, writeContextBinary);
    ar << data;
    }
    template <class A> void load(A& ar, Context& o, unsigned) {
    Blob data;
    ar >> data;
    from_blob(data, o, readContextBinary);
    }
    template <class A> void save(A& ar, const PubKey& o, unsigned) {
    Blob data = to_blob(o, writePubKeyBinary);
    ar << data;
    }
    template <class A> void load(A& ar, PubKey& o, unsigned) {
    Blob data;
    ar >> data;
    from_blob(data, o, readPubKeyBinary);
    }
    }

    That is elegance to me.
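
To experiment with the blob idea without HElib or Boost installed, the same helper shape can be sketched on top of std::ostringstream and std::istringstream. This swaps standard string streams in for the boost::iostreams devices used in this answer, at the cost of an extra copy. `Point`, `writePointText` and `readPointText` are hypothetical stand-ins for a library type that ships its own stream I/O.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Blob = std::vector<char>;

// Hypothetical stand-in for a library type with its own stream I/O.
struct Point { int x; int y; };
void writePointText(std::ostream& os, const Point& p) { os << p.x << ' ' << p.y; }
void readPointText(std::istream& is, Point& p) { is >> p.x >> p.y; }

// Runs the library's writer against an in-memory stream and returns the bytes.
template <typename T, typename F>
Blob to_blob(const T& object, F writer) {
    std::ostringstream os;
    writer(os, object);
    const std::string s = os.str();
    return Blob(s.begin(), s.end());
}

// Feeds the stored bytes back to the library's reader.
template <typename T, typename F>
void from_blob(const Blob& data, T& object, F reader) {
    std::istringstream is(std::string(data.begin(), data.end()));
    reader(is, object);
}
```

A round trip is then `Blob b = to_blob(p, writePointText);` followed by `from_blob(b, q, readPointText);`. The archive only ever sees an opaque byte vector, so the library's own format stays the single source of truth.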

    Full Listing

    I have cloned a new gist, https://gist.github.com/sehe/ba82a0329e4ec586363eb82d3f3b9326, which includes the following change sets:
    0079c07 Make it compile locally
    b3b2cf1 Squelch the warnings
    011b589 Endof investigations, regroup time

    f4d79a6 Reimplemented using HElib binary IO
    a403e97 Bitwise reproducible outputs

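    Only the last two commits contain changes related to the actual fix.

    I will also list the full code here for posterity. The test code contains a number of subtle reorganizations, and comments to match. You would do well to read through them carefully to check that you understand them and that the implications suit your needs. I left comments describing why the test assertions are what they are.
  • File serialization.hpp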
    #ifndef EVOTING_SERIALIZATION_H
    #define EVOTING_SERIALIZATION_H

    #define BOOST_TEST_MODULE main
    #include <helib/helib.h>
    #include <boost/serialization/split_free.hpp>
    #include <boost/serialization/vector.hpp>
    #include <boost/iostreams/stream_buffer.hpp>
    #include <boost/iostreams/device/back_inserter.hpp>
    #include <boost/iostreams/device/array.hpp>

    namespace /* file-static */ {
    using Blob = std::vector<char>;

    template <typename T, typename F>
    Blob to_blob(const T& object, F writer) {
    using D = boost::iostreams::back_insert_device<Blob>;
    Blob data;
    {
    D dev(data);
    boost::iostreams::stream_buffer<D> sbuf(dev);
    std::ostream os(&sbuf); // expose as std::ostream
    writer(os, object);
    }
    return data;
    }

    template <typename T, typename F>
    void from_blob(Blob const& data, T& object, F reader) {
    boost::iostreams::stream_buffer<boost::iostreams::array_source>
    sbuf(data.data(), data.size());
    std::istream is(&sbuf); // expose as std::istream
    reader(is, object);
    }
    }

    namespace helib { // leverage ADL
    template <class A> void save(A& ar, const Context& o, unsigned) {
    Blob data = to_blob(o, writeContextBinary);
    ar << data;
    }
    template <class A> void load(A& ar, Context& o, unsigned) {
    Blob data;
    ar >> data;
    from_blob(data, o, readContextBinary);
    }
    template <class A> void save(A& ar, const PubKey& o, unsigned) {
    Blob data = to_blob(o, writePubKeyBinary);
    ar << data;
    }
    template <class A> void load(A& ar, PubKey& o, unsigned) {
    Blob data;
    ar >> data;
    from_blob(data, o, readPubKeyBinary);
    }
    }

    BOOST_SERIALIZATION_SPLIT_FREE(helib::Context)
    BOOST_SERIALIZATION_SPLIT_FREE(helib::PubKey)
    #endif //EVOTING_SERIALIZATION_H
  • File test-serialization.cpp
    #define BOOST_TEST_MODULE main
    #include <boost/test/included/unit_test.hpp>
    #include <helib/helib.h>
    #include <fstream>
    #include "serialization.hpp"
    #include <boost/archive/text_oarchive.hpp>
    #include <boost/archive/text_iarchive.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>

    helib::Context helibTestMinimalContext(){
    // Plaintext prime modulus
    unsigned long p = 4999;
    // Cyclotomic polynomial - defines phi(m)
    unsigned long m = 32109;
    // Hensel lifting (default = 1)
    unsigned long r = 1;
    return helib::Context(m, p, r);
    }

    helib::Context helibTestContext(){
    auto context = helibTestMinimalContext();

    // Number of bits of the modulus chain
    unsigned long bits = 300;
    // Number of columns of Key-Switching matrix (default = 2 or 3)
    unsigned long c = 2;

    // Modify the context, adding primes to the modulus chain
    buildModChain(context, bits, c);
    return context;
    }

    BOOST_AUTO_TEST_CASE(serialization_pubkey) {
    auto context = helibTestContext();
    helib::SecKey secret_key(context);
    secret_key.GenSecKey();
    // Compute key-switching matrices that we need
    helib::addSome1DMatrices(secret_key);
    // Set the secret key (upcast: SecKey is a subclass of PubKey)
    const helib::PubKey& original_pubkey = secret_key;

    std::string const filename = "pubkey.serialized";

    {
    std::ofstream os(filename, std::ios::binary);
    boost::archive::binary_oarchive oarchive(os);
    oarchive << context << original_pubkey;
    }
    {
    // just checking reproducible output
    std::ofstream os(filename + ".2", std::ios::binary);
    boost::archive::binary_oarchive oarchive(os);
    oarchive << context << original_pubkey;
    }

    // reading back to independent instances of Context/PubKey
    {
    // NOTE: if you start from something rogue, it will fail with PAlgebra mismatch.
    helib::Context surrogate = helibTestMinimalContext();

    std::ifstream ifs(filename, std::ios::binary);
    boost::archive::binary_iarchive iarchive(ifs);
    iarchive >> surrogate;

    // we CAN test that the contexts end up matching
    BOOST_TEST((context == surrogate));

    helib::SecKey independent(surrogate);
    helib::PubKey& indep_pk = independent;
    iarchive >> indep_pk;
    // private again, as it should be, but to understand the relation:
    // BOOST_TEST((&independent.context == &surrogate));

    // The library's operator== compares the reference, so it would say "not equal"
    BOOST_TEST((indep_pk != original_pubkey));
    {
    // just checking reproducible output
    std::ofstream os(filename + ".3", std::ios::binary);
    boost::archive::binary_oarchive oarchive(os);
    oarchive << surrogate << indep_pk;
    }
    }

    // doing it the other way (sharing the context):
    {
    helib::PubKey restored_pubkey(context);
    {
    std::ifstream ifs(filename, std::ios::binary);
    boost::archive::binary_iarchive iarchive(ifs);
    iarchive >> context >> restored_pubkey;
    }
    // now `operator==` confirms equality
    BOOST_TEST((restored_pubkey == original_pubkey));

    {
    // just checking reproducible output
    std::ofstream os(filename + ".4", std::ios::binary);
    boost::archive::binary_oarchive oarchive(os);
    oarchive << context << restored_pubkey;
    }
    }
    }

  • Test output
    time ./test-serialization -l all -r detailed
    Running 1 test case...
    Entering test module "main"
    test-serialization.cpp(34): Entering test case "serialization_pubkey"
    test-serialization.cpp(61): info: check (context == surrogate) has passed
    test-serialization.cpp(70): info: check (indep_pk != original_pubkey) has passed
    test-serialization.cpp(82): info: check (restored_pubkey == original_pubkey) has passed
    test-serialization.cpp(34): Leaving test case "serialization_pubkey"; testing time: 36385217us
    Leaving test module "main"; testing time: 36385273us

    Test module "main" has passed with:
    1 test case out of 1 passed
    3 assertions out of 3 passed

    Test case "serialization_pubkey" has passed with:
    3 assertions out of 3 passed

    real 0m36,698s
    user 0m35,558s
    sys 0m0,850s

    Bitwise Reproducible Outputs

    On repeated serialization, the output does appear to be bitwise identical, which may be an important property:
    sha256sum pubkey.serialized*
    66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized
    66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.2
    66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.3
    66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.4

    Note that it is (obviously) not identical across runs (because each run generates different key material).
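
A hash tool is the quickest way to check this by hand. If the comparison should live inside a test instead, a byte-for-byte check could look like the sketch below. `files_identical` is a hypothetical helper, not part of the gist, and it treats a file that cannot be opened as a mismatch.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Returns true when both files can be opened and hold exactly the same bytes.
bool files_identical(const std::string& path_a, const std::string& path_b) {
    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b)
        return false;  // a missing or unreadable file counts as a mismatch
    std::istreambuf_iterator<char> ia(a), ib(b), end;
    while (ia != end && ib != end) {
        if (*ia != *ib)
            return false;
        ++ia;
        ++ib;
    }
    // Equal only if both files ended at the same position.
    return ia == end && ib == end;
}
```

In the test above this would be called as `files_identical(filename, filename + ".2")` and so on for the other output files.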

    Side Quest (The Wild Goose Chase)

    One way to manually improve the IndexSet serialization code is to also use vector<bool>:
    template<class Archive>
    void save(Archive & archive, const helib::IndexSet & index_set, const unsigned int version){
    std::vector<bool> elements;
    elements.resize(index_set.last()-index_set.first()+1);
    for (auto n : index_set)
    elements[n-index_set.first()] = true;
    archive << index_set.first() << elements;
    }

    template<class Archive>
    void load(Archive & archive, helib::IndexSet & index_set, const unsigned int version){
    long first_ = 0;
    std::vector<bool> elements;
    archive >> first_ >> elements;
    index_set.clear();
    for (size_t n = 0; n < elements.size(); ++n) {
    if (elements[n])
    index_set.insert(n+first_);
    }
    }
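
The offset-plus-bitmap encoding used above can be tried without HElib. The sketch below uses std::set<long> as a stand-in for helib::IndexSet, and `encode` and `decode` are hypothetical names. Unlike the code above, it handles the empty set explicitly, since the behaviour of first() and last() on an empty IndexSet is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Encodes a set as (smallest element, presence bitmap relative to it).
std::pair<long, std::vector<bool>> encode(const std::set<long>& s) {
    if (s.empty())
        return std::make_pair(0L, std::vector<bool>{});
    const long first = *s.begin();
    const long last = *s.rbegin();
    std::vector<bool> bits(static_cast<std::size_t>(last - first + 1), false);
    for (long n : s)
        bits[static_cast<std::size_t>(n - first)] = true;
    return std::make_pair(first, bits);
}

// Rebuilds the set from the offset and the bitmap.
std::set<long> decode(long first, const std::vector<bool>& bits) {
    std::set<long> s;
    for (std::size_t i = 0; i < bits.size(); ++i)
        if (bits[i])
            s.insert(first + static_cast<long>(i));
    return s;
}
```

The bitmap length is the span `last - first + 1`, not the number of elements, so a sparse set with a wide span still costs one bit per position in that span.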

    A better idea would be to use dynamic_bitset (for which I happen to have contributed the serialization code, see How to serialize boost::dynamic_bitset?):
    template<class Archive>
    void save(Archive & archive, const helib::IndexSet & index_set, const unsigned int version){
    boost::dynamic_bitset<> elements;
    elements.resize(index_set.last()-index_set.first()+1);
    for (auto n : index_set)
    elements.set(n-index_set.first());
    archive << index_set.first() << elements;
    }

    template<class Archive>
    void load(Archive & archive, helib::IndexSet & index_set, const unsigned int version) {
    long first_ = 0;
    boost::dynamic_bitset<> elements;
    archive >> first_ >> elements;
    index_set.clear();
    for (size_t n = elements.find_first(); n != boost::dynamic_bitset<>::npos; n = elements.find_next(n))
    index_set.insert(n+first_);
    }

    Of course, you would likely have to do similar things for IndexMap.

    Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/61895626/
