c++ - Adding the numbers from 1 to 100 with OpenMP

Tags: c++ c multithreading parallel-processing openmp

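I'm trying to get the sum of the numbers from 1 to 100 using only 5 threads, even though I have 12 available.

This is my approach. Please tell me where I went wrong.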

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
int nthreads = 5, tid;
int sum = 0;
int a[100];

for(int i = 1; i < 101; i++){
    a[i] = i;
}

  /* Fork a team of threads giving them their own copies of variables */
#pragma omp parallel private(nthreads, tid)
nthreads = omp_get_num_threads();

#pragma omp parallel for reduction (+:sum)

    for(int i = 0; i < 100; i++){
        sum = sum + a[i + 1];
    }

    tid = omp_get_thread_num();
    printf("Current thread is %d\n", tid);
    printf("Number of threads = %d\n", nthreads);
    printf("The total sum is %d\n\n\n", sum);

  }

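Edit: This is the output I get: (image: Code Output)

The output I want is the following:

  • The total sum is 5050 instead of 4950
  • Is there a way to output each thread's local sum?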

Best answer

There are several problems in your code, namely:

First:

for(int i = 1; i < 101; i++){
    a[i] = i;
}

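You declared the array a as int a[100];, so its valid indices are 0 to 99 and this loop writes past the end of the array at a[100]. Change it to: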

for(int i = 0; i < 100; i++){
    a[i] = i + 1;
}

Second:

int nthreads = 5
...
/* Fork a team of threads giving them their own copies of variables */
#pragma omp parallel private(nthreads, tid)
nthreads = omp_get_num_threads();

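This does not make sense. You create a parallel region in which every thread gets its own copy of nthreads and sets it to the number of threads, but those copies are gone once the parallel region ends. The only reason the later call

printf("Number of threads = %d\n", nthreads);

prints 5 is that nthreads was initially set to 5.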
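The difference between a private and a shared variable can be sketched as follows. This is an illustrative sketch, not part of the original answer: the function names after_private_write and after_shared_write are hypothetical, the #ifdef _OPENMP guards are an added assumption so the file also builds without OpenMP, and the first function relies on the behavior of current OpenMP compilers, where the outer variable keeps its value when only the private copies are written.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Each thread writes only its own private copy of nthreads,
   so the outer variable keeps the value it had before the region. */
int after_private_write(int initial)
{
    int nthreads = initial;
    #pragma omp parallel private(nthreads)
    {
#ifdef _OPENMP
        nthreads = omp_get_num_threads();  /* private copy, discarded at region end */
        (void)nthreads;
#endif
    }
    return nthreads;
}

/* nthreads is shared and written by exactly one thread,
   so the team size is still available after the region. */
int after_shared_write(int requested)
{
    assert(requested > 0);
    int nthreads = 1;                      /* value used when built without OpenMP */
    #pragma omp parallel num_threads(requested) shared(nthreads)
    {
        #pragma omp single
        {
#ifdef _OPENMP
            nthreads = omp_get_num_threads();
#endif
        }
    }
    return nthreads;
}
```

Calling after_private_write(5) should give back 5 regardless of how many threads ran, which mirrors what the question's printf shows. after_shared_write(5) returns the team size that was actually used, which may be smaller than 5 if the runtime grants fewer threads.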

Third:

 for(int i = 0; i < 100; i++){
        sum = sum + a[i + 1];
    }

Again you access a position outside the array a (a[100] when i is 99). Change it to:

 for(int i = 0; i < 100; i++){
        sum = sum + a[i];
    }
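Both indexing fixes (the first and third points) use the same valid range, 0 to n-1. A small serial sketch makes this easy to check against the closed form 100 * 101 / 2 = 5050 before any OpenMP is involved. The helper names fill_one_to_n and sum_serial are hypothetical and only serve as illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fill a[0..n-1] with the values 1..n; the last write is a[n-1] = n. */
void fill_one_to_n(int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)(i + 1);
    }
}

/* Serial reference sum over the same index range 0..n-1. */
int sum_serial(const int *a, size_t n)
{
    assert(a != NULL);
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

With int a[100], calling fill_one_to_n(a, 100) followed by sum_serial(a, 100) gives a reference value that the parallel version should reproduce.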

Fourth:

I'm trying to get the sum of numbers from 1 to 100 using only 5 threads even though I have 12 available.

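Because you did not specify the number of threads for the parallel region:

#pragma omp parallel for reduction (+:sum)

your code runs with the default number of threads, which is 12 on your machine.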

The Output that I want is the following:

The total sum is 5050 instead of 4950. Is there a way to output each thread's local sum?

What you want to do is the following:

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
   int sum = 0;
   int a[100];

   for(int i = 0; i < 100; i++){
       a[i] = i + 1;
   }
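   int total_threads_used = 0;
   // Create a parallel region with 5 threads; each thread gets a private sum
   // (initialized to 0) that is added into the shared sum when the region ends
   #pragma omp parallel num_threads(5) reduction (+:sum)
   {
        // Let a single thread record the team size, so the shared variable
        // is not written by all threads at once
        #pragma omp single
        total_threads_used = omp_get_num_threads();
        // Split the 100 iterations among the threads of the team
        #pragma omp for
        for(int i = 0; i < 100; i++){
           sum = sum + a[i];
        }
        // Here sum is still this thread's partial sum
        printf("Current thread is %d and SUM %d\n", omp_get_thread_num(), sum);
    }
    printf("Number of threads = %d\n", total_threads_used);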
    printf("The total sum is %d\n\n\n", sum);
}

Output:

Current thread is 0 and SUM 210
Current thread is 2 and SUM 1010
Current thread is 1 and SUM 610
Current thread is 3 and SUM 1410
Current thread is 4 and SUM 1810
Number of threads = 5
The total sum is 5050

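The order of these lines varies from run to run because the threads print in parallel, but every run should produce five "Current thread is" lines (for threads 0 to 4), one "Number of threads = 5" line, and one "The total sum is 5050" line.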
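If the local sums should appear in a fixed order, one option is to store each thread's partial sum in an array slot indexed by its thread number and print the slots after the parallel region. This is a sketch of that idea, not the answer's method: MAX_THREADS, partial_sums and print_partial_sums are hypothetical names, and the #ifdef _OPENMP guards are an added assumption so the code also builds without OpenMP.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 5   /* assumption: the 5 threads requested in the answer */

/* Sums a[0..n-1]; partial[t] receives the local sum of thread t.
   Returns the number of threads that actually ran. */
int partial_sums(const int *a, int n, int partial[MAX_THREADS])
{
    assert(a != NULL && partial != NULL && n >= 0);
    int team = 1;                          /* value used when built without OpenMP */
    for (int t = 0; t < MAX_THREADS; t++) {
        partial[t] = 0;
    }
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        int local = 0;                     /* declared inside the region, so private */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            local += a[i];
        }
        partial[tid] = local;              /* each thread writes only its own slot */
        #pragma omp single
        {
#ifdef _OPENMP
            team = omp_get_num_threads();
#endif
        }
    }
    return team;
}

/* Prints the local sums in thread order and returns their total. */
int print_partial_sums(const int partial[MAX_THREADS], int team)
{
    int total = 0;
    for (int t = 0; t < team; t++) {
        printf("Thread %d local sum %d\n", t, partial[t]);
        total += partial[t];
    }
    return total;
}
```

Because the printing happens after the region, the lines always appear in thread order, and the returned total can be compared with the expected 5050.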

This page, c++ - Adding the numbers from 1 to 100 with OpenMP, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/67039119/
