c - Why does this program hang when it is started with multiple processes?

The program estimates Pi by throwing random "darts" (sample points) at a circle of radius 1 inscribed in a square board of side length 2. Using the relation

Area of circle / Area of Square = Pi/4

we can estimate Pi using the same relation expressed as

Darts Inside Circle / Total Darts Thrown = Pi/4

The program runs fine when I specify NDARTS in a #define. However, when NDARTS is a value that's read via scanf and then broadcast, the program mysteriously hangs as soon as more than one process is allocated via mpirun:

mpirun -np 1 ./pi_montecarlo.x

   Monte Carlo Method to estimate Pi 

Introduce Number of Darts 
10000
  Number of processes: 1 
  Number of darts: 10000 
Known value of PI  : 3.1415926535 
Estimated Value of PI  : 3.1484000000
Error Percentage   : 0.21668457
Time    : 0.00060296



mpirun -np 2 ./pi_montecarlo.x

Monte Carlo Method to estimate Pi 

Introduce Number of Darts 
10000
Number of processes: 2 
Number of darts: 10000

^ Hangs here.

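Why? Is this a problem specific to the MPI implementation? Should I try another MPI implementation (I think I am running LAM)? Can you run it on your own box with at least 2 processes?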

/*
mpicc -g -Wall -lm pi_montecarlo3.c -o pi_montecarlo.x 

mpirun -np 4 ./pi_montecarlo.x
*/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <mpi.h>

#define MASTER 0
#define PI 3.1415926535

double pseudo_random (double a, double b) {
    double r; 
    r = ((b-a) * ((double) rand() / (double) RAND_MAX)) +a;
    return r; 
}

int main(int argc, char*argv[]){
    long long int NDARTS;

    int proc_id, 
        n_procs, 
        llimit,  
        ulimit,  
        n_circle, 
        i;      


    double pi_current, 
           pi_sum,     
           x,         
           y,         
           z,          
           error,      
           start_time, 
           end_time;   

    struct timeval stime;

    llimit = -1;
    ulimit = 1;
    n_circle =0; 

    MPI_Init(&argc, &argv); 

    MPI_Comm_rank (MPI_COMM_WORLD, &proc_id);
    MPI_Comm_size (MPI_COMM_WORLD, &n_procs);

    if (proc_id == MASTER){
        printf("\nMonte Carlo Method to estimate Pi \n\n");

        printf("Introduce Number of Darts \n");

        scanf("%lld",&NDARTS); 

        printf("  Number of processes: %d \n", n_procs);
        printf("  Number of darts: %lld \n", NDARTS);

        MPI_Bcast(&NDARTS, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);

        start_time = MPI_Wtime();
    }

    gettimeofday(&stime, NULL); 
    srand(stime.tv_usec * stime.tv_usec * stime.tv_usec * stime.tv_usec);

    for (i=1; i<=NDARTS;i++){
        x = pseudo_random(llimit, ulimit);
        y = pseudo_random(llimit, ulimit);

        z = pow(x,2) + pow(y,2);

        if (z<=1.0){
            n_circle++;
        }
    }

    pi_current = 4.0 * (double)n_circle / (double) NDARTS; 

    MPI_Reduce (&pi_current, &pi_sum, 1, MPI_DOUBLE, MPI_SUM, MASTER, MPI_COMM_WORLD);

    if (proc_id == MASTER) {
        pi_sum = pi_sum / n_procs;

        error = fabs ((pi_sum -PI) / PI) *100;

        end_time = MPI_Wtime();

        printf("Known value of PI  : %11.10f \n", PI);
        printf("Estimated Value of PI  : %11.10f\n", pi_sum);
        printf("Error Percentage   : %10.8f\n", error);
        printf("Time    : %10.8f\n\n", end_time - start_time);

    }

    MPI_Finalize();

    return 0;
}

Best answer

A broadcast does not "push" data to the other processors.

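Almost all MPI communication requires the active participation of every processor involved. For example, to send a message between two processors, the sender has to call something like MPI_Send() and the receiver has to call something like MPI_Recv().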

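The same is true of collective communication; for example, you already have everyone call MPI_Reduce(). Likewise, you have to have everyone call MPI_Bcast(), not just the process that holds the original data but the "receivers" as well. In the posted code only the master calls it, so the other ranks never take part in the broadcast and NDARTS stays uninitialized on them. Move the call out of the if block: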

if (proc_id == MASTER){
    /* ... */
    scanf("%lld",&NDARTS); 
}

MPI_Bcast(&NDARTS, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);

if (proc_id == MASTER) {
    start_time = MPI_Wtime();
}

/* ... */

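By the way, when you seed the random number generator, you probably want to make sure the seed differs on every processor by putting proc_id in there somewhere, rather than just counting on the clocks being different enough to throw the seeds off...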
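One possible way to fold proc_id into the seed is sketched below. The helper name make_seed and the multiplier are assumptions made for illustration; they are not part of the original program or of MPI. The sketch assumes a 32-bit unsigned int, in which case multiplying by an odd constant guarantees that two different ranks with the same clock reading get different seeds.

```c
/* Hypothetical helper (not in the original post): mix the MPI rank into
   a clock-based seed so that ranks started in the same microsecond do
   not end up with the same random sequence. */
unsigned int make_seed(long usec, int proc_id)
{
    /* Unsigned arithmetic wraps around instead of overflowing, unlike
       the signed product of four tv_usec values in the question.
       The odd multiplier spreads neighbouring ranks far apart. */
    return (unsigned int)usec + 2654435761u * (unsigned int)(proc_id + 1);
}
```

In the program from the question this would be used as srand(make_seed(stime.tv_usec, proc_id)); right after the gettimeofday() call. Distinct seeds only make the rand() sequences start differently; they do not by themselves guarantee statistically independent streams.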

Regarding "c - Why does this program hang when it is started with multiple processes?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8460714/
