
c++ - Debugging a distributed OR operation

Reposted. Author: 太空宇宙. Updated: 2023-11-04 16:04:47

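I am trying to perform a bitwise OR on integer arrays stored on different nodes of a distributed system. After computing the result, I want to distribute it to every node. To do this, I am trying to use MPI_Allreduce, but the following code fails with a runtime error.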

#include <bits/stdc++.h>
#include <mpi.h>
#include <unistd.h>
using namespace std;

int main (int argc, char* argv[]) {
    int numtasks, taskid, n=200, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &taskid);

    int *arr;
    arr = new int[n];
    for(i=0; i<n; i++) arr[i] = 0;
    for(i=taskid; i<n; i+=numtasks) arr[i] = 1;

    MPI_Allreduce(arr, arr, n, MPI_INT, MPI_BOR, MPI_COMM_WORLD);

    if(taskid == 0){
        for(i=0; i<n; i++) printf("%d ", arr[i]);
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}

The program runs correctly when n is 1, but when n > 1 it fails at runtime with the following error.

[user:17026] An error occurred in MPI_Allreduce
[user:17026] on communicator MPI_COMM_WORLD
[user:17026] MPI_ERR_BUFFER: invalid buffer pointer
[user:17026] MPI_ERRORS_ARE_FATAL: your MPI job will now abort


mpirun has exited due to process rank 2 with PID 17028 on node user exiting improperly. There are two reasons this could occur:

  1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

  2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).


[user:17025] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[user:17025] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

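I would like to know whether MPI_Allreduce works for n > 1, since most examples available online only use n = 1. If my approach is completely wrong, please suggest a better solution to my problem.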

Best answer

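If you want to use the same buffer for sending and receiving, specify MPI_IN_PLACE as the send buffer. The MPI standard does not allow the send and receive buffers of a collective to alias, which is why passing arr for both arguments is rejected with MPI_ERR_BUFFER.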

MPI_Allreduce(MPI_IN_PLACE, arr, n, MPI_INT, MPI_BOR, MPI_COMM_WORLD);

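Note: this only works for intracommunicators. MPI_COMM_WORLD is an intracommunicator. If you do not know what an intracommunicator is, your communicator is probably one.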

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37406146/
