
c++ boost MPI & threading - serialization error: Address not mapped

Reposted. Author: 行者123. Updated: 2023-11-30 04:27:47

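I'm stumped. all_gather works for primitives (e.g. int) but fails even for simple STL containers. valgrind claims the container is unallocated/uninitialized, but that does not seem right.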

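Summary:

  • I use OpenMP for some multithreaded work, then the threads rejoin.
  • In the serial region, I try to all_gather a simple std::map using boost::mpi::all_gather. The MPI ranks are not threads. (There are 2 MPI ranks, and each MPI rank has 4 threads.)
  • Then I intend to do some more (independent) multithreaded work.

This seems simple enough... what could be going on here?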

main.cpp

#include <openmpi/mpi.h>
#include <omp.h>
#include <boost/mpi.hpp>
#include "globals.h"

int main(int argc, char* argv[])
{

int provided_MPI;
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided_MPI );

boost::mpi::environment my_boost_mpi_env(argc, argv);
boost::mpi::communicator world_MPI_boost;
world_MPI_boost_ptr = &world_MPI_boost;
// ^^^ global variable of type boost::mpi::communicator *

perform_complete_variable_elimination_schedule();
//...

}

Conn_Comp.cpp

#include <boost/mpi.hpp>    
#include <boost/mpi/collectives.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/serialization/map.hpp>

#include "globals.h"

...

void perform_complete_variable_elimination_schedule()
{

// isolated work in parallel using OpenMP
#pragma omp parallel
{
//work
}

// SERIAL REGION (with respect to threading).

std::map<uint,uint> my_map;
std::vector< std::map<uint,uint> > vec_of_my_maps;

boost::mpi::all_gather< std::map<uint,uint> >
(*world_MPI_boost_ptr,
my_map,
vec_of_my_maps); // <--- line 293 (referenced by valgrind)


// more isolated work in parallel using OpenMP
#pragma omp parallel
{
//work
}

}

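valgrind reports that the vector of maps causes an invalid read. But this vector is created before the all_gather call, so it is clearly in scope and not inside a parallel threaded region. Selected valgrind error output: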

==12665== Use of uninitialised value of size 4
==12665== at 0x41C8D7A: boost::archive::detail::basic_iarchive::get_library_version() const (basic_iarchive.cpp:575)
==12665== by 0x41C92C6: boost::archive::detail::basic_iarchive::load_object(void*, boost::archive::detail::basic_iserializer const&) (basic_iarchive.cpp:399)
==12665== by 0x80F5696: void boost::mpi::all_gather<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > >(boost::mpi::communicator const&, std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > const&, std::vector<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >, std::allocator<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > > >&) (iserializer.hpp:387)
==12665== by 0x80DEC83: Conn_Comp::perform_complete_variable_elimination_schedule() (Conn_Comp.cpp:**293**)
==12665== by 0x80C840A: main (main.cpp:695)
==12665==
==12665== Invalid read of size 2
==12665== at 0x41C8D7A: boost::archive::detail::basic_iarchive::get_library_version() const (basic_iarchive.cpp:575)
==12665== by 0x41C92C6: boost::archive::detail::basic_iarchive::load_object(void*, boost::archive::detail::basic_iserializer const&) (basic_iarchive.cpp:399)
==12665== by 0x80F5696: void boost::mpi::all_gather<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > >(boost::mpi::communicator const&, std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > const&, std::vector<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >, std::allocator<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > > >&) (iserializer.hpp:387)
==12665== by 0x80DEC83: Conn_Comp::perform_complete_variable_elimination_schedule() (main.cpp:**293**)
==12665== by 0x80C840A: main (main.cpp:695)
==12665== Address 0x3580bece is not stack'd, malloc'd or (recently) free'd
==12665==
[drosphila:12665] *** Process received signal ***
[drosphila:12665] Signal: Segmentation fault (11)
[drosphila:12665] Signal code: Address not mapped (1)
[drosphila:12665] Failing at address: 0x3580bece
[drosphila:12665] [ 0] /lib/i686/cmov/libpthread.so.0(+0xe500) [0x44f8500]
[drosphila:12665] [ 1] /usr/lib/libboost_serialization.so.1.42.0(_ZN5boost7archive6detail14basic_iarchive11load_objectEPvRKNS1_17basic_iserializerE+0x1b7) [0x41c92c7]
[drosphila:12665] [ 2] ./detect_NAHR(_ZN5boost3mpi10all_gatherISt3mapIjjSt4lessIjESaISt4pairIKjjEEEEEvRKNS0_12communicatorERKT_RSt6vectorISD_SaISD_EE+0x587) [0x80f5697]
[drosphila:12665] [ 3] ./detect_NAHR(_ZN9Conn_Comp46perform_complete_variable_elimination_scheduleEv+0x534) [0x80dec84]
[drosphila:12665] [ 4] ./detect_NAHR(main+0xf5b) [0x80c840b]
[drosphila:12665] [ 5] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x4519ca6]
[drosphila:12665] [ 6] ./detect_NAHR() [0x80c73e1]
[drosphila:12665] *** End of error message ***

I use MPI_Init_thread, as recommended by the Boost help page.

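As I said at the top, if I use a primitive (i.e. uint) instead of a map, all_gather works fine. Why does it fail for a map? Boost.Serialization already provides serialization for STL containers, so that should not be the problem...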
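One way to picture why the two cases take different code paths: for a type with a native MPI datatype such as int, Boost.MPI can hand the value straight to MPI, whereas a std::map has to be serialized into a flat buffer, sent, and rebuilt on the receiving side (the basic_iarchive::load_object frame in the trace above is that rebuild step). The sketch below imitates this pack, exchange, unpack sequence with the standard library only. It is not Boost's implementation or wire format; UintMap, pack_map, unpack_map and simulate_all_gather are hypothetical names, and the "ranks" are just entries in a vector inside one process.

```cpp
#include <cstddef>
#include <map>
#include <vector>

typedef std::map<unsigned int, unsigned int> UintMap;

// Hypothetical helper: flatten a map into [count, key0, value0, key1, value1, ...].
std::vector<unsigned int> pack_map(const UintMap& m) {
    std::vector<unsigned int> buf;
    buf.reserve(1 + 2 * m.size());
    buf.push_back(static_cast<unsigned int>(m.size()));
    for (UintMap::const_iterator it = m.begin(); it != m.end(); ++it) {
        buf.push_back(it->first);
        buf.push_back(it->second);
    }
    return buf;
}

// Hypothetical helper: rebuild a map from a buffer produced by pack_map.
// Returns an empty map if the buffer length does not match its header.
UintMap unpack_map(const std::vector<unsigned int>& buf) {
    UintMap m;
    if (buf.empty()) return m;
    const std::size_t n = buf[0];
    if (buf.size() != 1 + 2 * n) return m;
    for (std::size_t i = 0; i < n; ++i) {
        m[buf[1 + 2 * i]] = buf[2 + 2 * i];
    }
    return m;
}

// Single-process stand-in for all_gather: every "rank" packs its map,
// and the result holds one unpacked map per rank, in rank order.
std::vector<UintMap> simulate_all_gather(const std::vector<UintMap>& per_rank) {
    std::vector<std::vector<unsigned int> > wire;
    for (std::size_t r = 0; r < per_rank.size(); ++r) {
        wire.push_back(pack_map(per_rank[r]));
    }
    std::vector<UintMap> gathered;
    for (std::size_t r = 0; r < wire.size(); ++r) {
        gathered.push_back(unpack_map(wire[r]));
    }
    return gathered;
}
```

Calling simulate_all_gather with one map per rank returns a vector with the same maps in the same order. The point of the sketch is only that the map path depends on extra pack/unpack machinery (in Boost's case, the compiled serialization library) that the int path never touches.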

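Also note that the vector that will hold all the values is resized automatically inside all_gather (I checked the implementation of all_gather) so that it is large enough to hold everything. In any case, even if I size it myself, it still fails.

Finally, even if I use a plain old array (properly allocated), e.g. std::map<uint,uint> *, I hit the same problem.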

Best answer

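Well, this is embarrassing. I'm going to leave this question up in case anyone else runs into the same strange error.

The problem with my code was actually in the makefile. I forgot to link against the Boost libraries for MPI.

Incorrect makefile flags: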

-I$(BOOST_INCLUDE)     -lboost_serialization   -lboost_mpi 

Apparently that line carries enough information for the program to compile and run, but it leads to a run-time error.

Correct makefile flags:

-L$(BOOST_LIB) -ldl -Wl,-rpath,$(BOOST_LIB) -lboost_serialization -lboost_mpi

(Note the added library-linking flags: -L$(BOOST_LIB), -ldl and -Wl,-rpath,$(BOOST_LIB).)
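As a rough illustration of how these flags divide up by build stage: -I only tells the compiler where to find headers, -L tells the linker where to find libraries, and -Wl,-rpath records a directory in the executable for the dynamic loader to search at run time. The fragment below is a sketch using GNU make's conventional CXXFLAGS, LDFLAGS and LDLIBS variables; BOOST_ROOT and its path are hypothetical placeholders, not values from the original post. The trace above shows /usr/lib/libboost_serialization.so.1.42.0 being loaded; the post does not say whether that copy matched the headers under $(BOOST_INCLUDE), so a header/library mismatch is one plausible reading, not a confirmed diagnosis.

```make
# Hypothetical install prefix; adjust to wherever your Boost build lives.
BOOST_ROOT    = /opt/boost
BOOST_INCLUDE = $(BOOST_ROOT)/include
BOOST_LIB     = $(BOOST_ROOT)/lib

# Compile time: where the headers are found.
CXXFLAGS += -I$(BOOST_INCLUDE)

# Link time (-L) and run time (-Wl,-rpath): where the shared libraries are found.
LDFLAGS  += -L$(BOOST_LIB) -Wl,-rpath,$(BOOST_LIB)

# Libraries to link, the same set as in the corrected flags above.
LDLIBS   += -lboost_serialization -lboost_mpi -ldl
```

After building, running ldd on the executable (here ./detect_NAHR) lists which libboost_*.so files the loader actually resolves, which is a quick way to confirm they come from $(BOOST_LIB) and not from a system directory.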

Regarding c++ boost MPI & threading - serialization error: Address not mapped, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/10643886/
