c - MPI 将矩阵划分为 block

我想将矩阵分成 block (不是条纹)，然后使用 MPI_Scatter 分布这些 block 。

我提出了可行的解决方案，但我认为它远非“最佳实践”。我有一个 8x8 矩阵，填充了从 0 到 63 的数字。然后我将它分成 4 个 4x4 block ，使用 MPI_Type_vector 并通过 MPI_Send 分发它，但这需要一些额外的计算，因为我必须计算大矩阵中每个 block 的偏移量。

如果我使用散点图，第一个(左上角) block 传输正常，但其他 block 则没有( block 开始的偏移量错误)。

那么是否可以使用 MPI_Scatter 传输矩阵 block ，或者进行所需分解的最佳方法是什么？

这是我的代码:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define SIZE 8


int main(void) {

        MPI_Init(NULL, NULL);
        int p, rank;
        MPI_Comm_size(MPI_COMM_WORLD, &p);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        char i;

        char a[SIZE*SIZE];
        char b[(SIZE/2)*(SIZE/2)];

        MPI_Datatype columntype;
        MPI_Datatype columntype2;

        MPI_Type_vector(4, 4, SIZE, MPI_CHAR, &columntype2);
        MPI_Type_create_resized( columntype2, 0, sizeof(MPI_CHAR), &columntype );
        MPI_Type_commit(&columntype);

        if(rank == 0) {
                for( i = 0; i < SIZE*SIZE; i++) {
                        a[i] = i;
                }

                for(int rec=0; rec < p; rec++) {
                        int offset = (rec%2)*4 + (rec/2)*32;
                      MPI_Send (a+offset, 1, columntype, rec, 0, MPI_COMM_WORLD);
                }
        }
        MPI_Recv (b, 16, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        //MPI_Scatter(&a, 1, boki, &b, 16, MPI_CHAR , 0, MPI_COMM_WORLD);

        printf("rank= %d  b= \n%d %d %d %d\n%d %d %d %d\n%d %d %d %d\n%d %d %d %d\n", rank, b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7], b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);

        MPI_Finalize();

        return 0;
}

最佳答案

您得到的几乎是“最佳实践”；在您习惯之前，它只是有点令人困惑。

不过有两件事:

首先，请注意这一点:sizeof(MPI_CHAR) 是，我假设，4 个字节，而不是 1 个。MPI_CHAR 是一个(整数)常量，它描述(到MPI 库)一个字符。您可能需要 sizeof(char) 或 SIZE/2*sizeof(char) 或任何其他方便的方式。但是调整大小的基本思路是正确的。

其次，我认为您无法使用 MPI_Scatterv，因为没有简单的方法可以使每个 block 之间的偏移量相同。也就是说，第一个 block 中的第一个元素位于 a[0]，第二个元素位于 a[SIZE/2](跳转 size/2)，接下来是 a[SIZE*(SIZE/2)]((SIZE-1)*(SIZE/2) 的跳跃)。因此，您需要能够手动生成偏移量。

以下似乎对我有用(当“大小”表示“行数”与“列数”等时，我对其进行了一些概括以使其更清楚):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define COLS  12
#define ROWS  8

int main(int argc, char **argv) {

    MPI_Init(&argc, &argv);
    int p, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    char i;

    char a[ROWS*COLS];
    const int NPROWS=2;  /* number of rows in _decomposition_ */
    const int NPCOLS=3;  /* number of cols in _decomposition_ */
    const int BLOCKROWS = ROWS/NPROWS;  /* number of rows in _block_ */
    const int BLOCKCOLS = COLS/NPCOLS; /* number of cols in _block_ */

    if (rank == 0) {
        for (int ii=0; ii<ROWS*COLS; ii++) {
            a[ii] = (char)ii;
        }
    }

    if (p != NPROWS*NPCOLS) {
        fprintf(stderr,"Error: number of PEs %d != %d x %d\n", p, NPROWS, NPCOLS);
        MPI_Finalize();
        exit(-1);
    }
    char b[BLOCKROWS*BLOCKCOLS];
    for (int ii=0; ii<BLOCKROWS*BLOCKCOLS; ii++) b[ii] = 0;

    MPI_Datatype blocktype;
    MPI_Datatype blocktype2;

    MPI_Type_vector(BLOCKROWS, BLOCKCOLS, COLS, MPI_CHAR, &blocktype2);
    MPI_Type_create_resized( blocktype2, 0, sizeof(char), &blocktype);
    MPI_Type_commit(&blocktype);

    int disps[NPROWS*NPCOLS];
    int counts[NPROWS*NPCOLS];
    for (int ii=0; ii<NPROWS; ii++) {
        for (int jj=0; jj<NPCOLS; jj++) {
            disps[ii*NPCOLS+jj] = ii*COLS*BLOCKROWS+jj*BLOCKCOLS;
            counts [ii*NPCOLS+jj] = 1;
        }
    }

    MPI_Scatterv(a, counts, disps, blocktype, b, BLOCKROWS*BLOCKCOLS, MPI_CHAR, 0, MPI_COMM_WORLD);
    /* each proc prints it's "b" out, in order */
    for (int proc=0; proc<p; proc++) {
        if (proc == rank) {
            printf("Rank = %d\n", rank);
            if (rank == 0) {
                printf("Global matrix: \n");
                for (int ii=0; ii<ROWS; ii++) {
                    for (int jj=0; jj<COLS; jj++) {
                        printf("%3d ",(int)a[ii*COLS+jj]);
                    }
                    printf("\n");
                }
            }
            printf("Local Matrix:\n");
            for (int ii=0; ii<BLOCKROWS; ii++) {
                for (int jj=0; jj<BLOCKCOLS; jj++) {
                    printf("%3d ",(int)b[ii*BLOCKCOLS+jj]);
                }
                printf("\n");
            }
            printf("\n");
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Finalize();

    return 0;
}

运行:

$ mpirun -np 6 ./matrix

Rank = 0
Global matrix: 
  0   1   2   3   4   5   6   7   8   9  10  11 
 12  13  14  15  16  17  18  19  20  21  22  23 
 24  25  26  27  28  29  30  31  32  33  34  35 
 36  37  38  39  40  41  42  43  44  45  46  47 
 48  49  50  51  52  53  54  55  56  57  58  59 
 60  61  62  63  64  65  66  67  68  69  70  71 
 72  73  74  75  76  77  78  79  80  81  82  83 
 84  85  86  87  88  89  90  91  92  93  94  95 
Local Matrix:
  0   1   2   3 
 12  13  14  15 
 24  25  26  27 
 36  37  38  39 

Rank = 1
Local Matrix:
  4   5   6   7 
 16  17  18  19 
 28  29  30  31 
 40  41  42  43 

Rank = 2
Local Matrix:
  8   9  10  11 
 20  21  22  23 
 32  33  34  35 
 44  45  46  47 

Rank = 3
Local Matrix:
 48  49  50  51 
 60  61  62  63 
 72  73  74  75 
 84  85  86  87 

Rank = 4
Local Matrix:
 52  53  54  55 
 64  65  66  67 
 76  77  78  79 
 88  89  90  91 

Rank = 5
Local Matrix:
 56  57  58  59 
 68  69  70  71 
 80  81  82  83 
 92  93  94  95

关于c - MPI 将矩阵划分为 block ，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/7549316/

c - MPI 将矩阵划分为 block

上一篇：c - 旧的 C 编译器在 #ifndef#define 上阻塞

下一篇：c - 在堆栈中间查找元素