
c - MPI matrix multiplication, processes not cleaned up


I am trying to multiply two n x n matrices using MPI. The second matrix (bb) is broadcast to all of the "slaves", and then a row of the first matrix (aa) is sent to each one to compute its part of the product. Each slave then sends its answer back to the master process, where it is stored in the product matrix cc. For some reason I am getting this error:

=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

I believe the master process is receiving all of the messages sent by the slaves, and vice versa, so I am not sure what is going on here... any ideas?

The main program:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/times.h>
#define min(x, y) ((x)<(y)?(x):(y))
#define MASTER 0

double* gen_matrix(int n, int m);
int mmult(double *c, double *a, int aRows, int aCols, double *b, int bRows, int bCols);

int main(int argc, char* argv[]) {
int nrows, ncols;
double *aa; /* the A matrix */
double *bb; /* the B matrix */
double *cc1; /* A x B computed */
double *buffer; /* Row to send to slave for processing */
double *ans; /* Computed answer for master */
int myid, numprocs;
int i, j, numsent, sender;
int row, anstype;
double starttime, endtime;
MPI_Status status;

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
if (argc > 1) {
nrows = atoi(argv[1]);
ncols = nrows;
if (myid == 0) {
/* Master Code */
aa = gen_matrix(nrows, ncols);
bb = gen_matrix(ncols, nrows);
cc1 = malloc(sizeof(double) * nrows * nrows);
starttime = MPI_Wtime();
buffer = (double*)malloc(sizeof(double) * ncols);
numsent = 0;
MPI_Bcast(bb, ncols*nrows, MPI_DOUBLE, MASTER, MPI_COMM_WORLD); /*broadcast bb to all slaves*/
for (i = 0; i < min(numprocs-1, nrows); i++) { /*for each process or row*/
for (j = 0; j < ncols; j++) { /*for each column*/
buffer[j] = aa[i * ncols + j]; /*get row of aa*/
}
MPI_Send(buffer, ncols, MPI_DOUBLE, i+1, i+1, MPI_COMM_WORLD); /*send row to slave*/
numsent++; /*increment number of rows sent*/
}
ans = (double*)malloc(sizeof(double) * ncols);
for (i = 0; i < nrows; i++) {
MPI_Recv(ans, ncols, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
MPI_COMM_WORLD, &status);
sender = status.MPI_SOURCE;
anstype = status.MPI_TAG;

for (i = 0; i < ncols; i++){
cc1[(anstype-1) * ncols + i] = ans[i];
}

if (numsent < nrows) {
for (j = 0; j < ncols; j++) {
buffer[j] = aa[numsent*ncols + j];
}
MPI_Send(buffer, ncols, MPI_DOUBLE, sender, numsent+1,
MPI_COMM_WORLD);
numsent++;
} else {
MPI_Send(MPI_BOTTOM, 0, MPI_DOUBLE, sender, 0, MPI_COMM_WORLD);
}
}

endtime = MPI_Wtime();
printf("%f\n",(endtime - starttime));
} else {
/* Slave Code */
buffer = (double*)malloc(sizeof(double) * ncols);
bb = (double*)malloc(sizeof(double) * ncols*nrows);
MPI_Bcast(bb, ncols*nrows, MPI_DOUBLE, MASTER, MPI_COMM_WORLD); /*get bb*/
if (myid <= nrows) {
while(1) {
MPI_Recv(buffer, ncols, MPI_DOUBLE, MASTER, MPI_ANY_TAG, MPI_COMM_WORLD, &status); /*recieve a row of aa*/
if (status.MPI_TAG == 0){
break;
}

row = status.MPI_TAG; /*get row number*/
ans = (double*)malloc(sizeof(double) * ncols);
for (i = 0; i < ncols; i++){
ans[i]=0.0;
}
for (i=0; i<nrows; i++){
for (j = 0; j < ncols; j++) { /*for each column*/
ans[i] += buffer[j] * bb[j * ncols + i];
}
}
MPI_Send(ans, ncols, MPI_DOUBLE, MASTER, row, MPI_COMM_WORLD);
}
}
} /*end slave code*/
} else {
fprintf(stderr, "Usage matrix_times_vector <size>\n");
}
MPI_Finalize();
return 0;
}

Best Answer

This error message usually means that at least one MPI process crashed and the whole MPI job was subsequently aborted. It can be caused by any kind of error, but most of the time it is a segmentation fault caused by an invalid memory access.
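As an aside, the exit code 11 reported above is just the numeric value of SIGSEGV on Linux, which is one way to see why a segfault is the usual suspect; a minimal check (assuming a Linux system):

#include <signal.h>
#include <stdio.h>

/* Sanity check only: on Linux, SIGSEGV has the value 11, matching the
 * "EXIT CODE: 11" reported by the MPI launcher above. */
int main(void) {
    printf("SIGSEGV = %d\n", SIGSEGV);  /* prints 11 */
    return 0;
}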

I haven't looked at the code closely, so I don't know whether the logic is valid, but what I can say is that this line is problematic:

MPI_Recv(&ans, nrows, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
MPI_COMM_WORLD, &status);

Indeed, there are two problems here:

  1. &ans 是一个**double,这不是你想要的,我猜你想要的是ans
  2. ans 还没有分配所以不能作为接收缓冲区

Try fixing that first and see what happens.
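As a rough sketch (reusing your variable names, and using ncols as the row length since the problem is square; adapt as needed), the receive could look like this:

/* Sketch only: allocate the buffer before receiving into it, and pass
 * the pointer itself (ans), not its address (&ans). */
double *ans = (double*)malloc(sizeof(double) * ncols);
if (ans == NULL) {
    MPI_Abort(MPI_COMM_WORLD, 1);   /* give up cleanly if malloc fails */
}
MPI_Recv(ans, ncols, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);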

EDIT: In your new code, you allocate ans as follows:

ans = (double*)malloc(sizeof(double) * ncols);

Then you initialize it like this:

for (i = 0; i < nrows; i++) {
ans[i]=0.0;
}

And then you use it like this:

MPI_Send(ans, nrows, MPI_DOUBLE, MASTER, row, MPI_COMM_WORLD);

MPI_Recv(ans, nrows, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);

This is not consistent: is the size of ans ncols or nrows?
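One way to keep it consistent is to pick a single length and use it in the allocation, the initialization, the send and the receive; a sketch (anslen is just an illustrative name, and for this square case nrows == ncols anyway):

/* Sketch only: one length, used everywhere the ans buffer is touched. */
int anslen = ncols;                         /* one row of the product */
double *ans = (double*)malloc(sizeof(double) * anslen);
for (i = 0; i < anslen; i++) {
    ans[i] = 0.0;
}
/* ... fill ans with the computed row ... */
MPI_Send(ans, anslen, MPI_DOUBLE, MASTER, row, MPI_COMM_WORLD);   /* slave side  */
MPI_Recv(ans, anslen, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);                                /* master side */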

What is your new error message?

Regarding "c - MPI matrix multiplication, processes not cleaned up", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32730995/
