Problem description
I am getting started with parallel computing and have begun using MPI with C. I understand how to do this kind of thing with p2p (send/receive), but I get confused when I try to use collective communication with bcast and reduce.
My code is as follows:
int collective(int val, int rank, int n, int *toSum){
    int *globalBuf = malloc(n * sizeof(int));
    int globalSum = 0;
    int localSum = 0;
    struct timespec before;
    if (rank == 0) {
        // only rank 0 will start the timer
        clock_gettime(CLOCK_MONOTONIC, &before);
    }
    int numInts = (val * 100000) / n;
    int *mySum = malloc(numInts * sizeof(int));
    int j;
    for (j = rank * numInts; j < rank * numInts + numInts; j++) {
        localSum = localSum + toSum[j];
    }
    MPI_Bcast(&localSum, 1, MPI_INT, rank, MPI_COMM_WORLD);
    MPI_Reduce(&localSum, &globalSum, n, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        printf("Communicative sum = %d\n", globalSum);
        // only rank 0 will end the timer
        // and display
        struct timespec after;
        clock_gettime(CLOCK_MONOTONIC, &after);
        printf("Time to complete = %ld\n", after.tv_nsec - before.tv_nsec);
    }
}
The parameters passed in can be described as:
val = the number of total ints that need to be summed, divided by 100000
rank = the rank of this process
n = the total number of processes
toSum = the ints that are going to be added together
Where I start getting errors is when I try to broadcast this process's localSum so that it can be handled by rank 0.
I will explain what I put into each function call so you can see where my confusion comes from.
For MPI_Bcast:
&localSum - the address of this process's sum
1 - there is one value that I want to broadcast, the int held by localSum
MPI_INT - meaning implied
rank - the rank of this process that is broadcasting
MPI_COMM_WORLD - meaning implied
For MPI_Reduce:
&localSum - the address of the variable that it will be "reducing"
&globalSum - the address of the variable that I want to hold the reduced values of localSum
n - the number of "localSum"s that this process will reduce (n is number of processes)
MPI_INT - meaning implied
MPI_SUM - meaning implied
0 - I want rank 0 to be the process that will reduce so it can print
MPI_COMM_WORLD - meaning implied
Reading through the code, it makes sense to me logically, and it compiles; however, when I run the program with m processors, I get the following error message:
Assertion Failed in file src/mpi/coll/helper_fns.c at line 84: FALSE
memcpy argument memory ranges overlap, dst_=0x7fffffffd2ac src_=0x7fffffffd2a8 len_=16
internal ABORT - process 0
Can anyone help me find a solution? Sorry if this is second nature to you; this is only my third parallel program and my first time using bcast/reduce!
Solution
I see two problems in the collective operation calls (MPI_Bcast, MPI_Reduce) in your code. First, in MPI_Reduce, you are reducing one integer localSum from each process into one integer globalSum. That is a single integer. But in your MPI_Reduce call, you are trying to reduce n values, when in fact you only need to reduce 1 value from each of the n processes. This is likely what causes the error.
If you want to reduce a single value, the reduce call should ideally look like this:
MPI_Reduce(&localSum, &globalSum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
As for the broadcast,
MPI_Bcast(&localSum, 1, MPI_INT, rank, MPI_COMM_WORLD);
every rank is broadcasting in your call. By the general idea of a broadcast, there should be one root process that broadcasts a value to all processes. So the call should look like this:
int rootProcess = 0;
MPI_Bcast(&localSum, 1, MPI_INT, rootProcess, MPI_COMM_WORLD);
Here, rootProcess will send the value contained in localSum to all processes. Meanwhile, all processes calling this broadcast will receive the value from rootProcess and store it in their local variable localSum.