c++ - How does OpenMP use atomic instructions inside the reduction clause?

Tags: c++ c multithreading parallel-processing openmp

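How does OpenMP use atomic instructions inside a reduction?
Does it not rely on atomic instructions at all?
For instance, is the variable sum in the code below accumulated with an atomic + operator?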

#include <omp.h>
#include <vector>

using namespace std;
int main()
{
  int m = 1000000; 
  vector<int> v(m);
  for (int i = 0; i < m; i++)
    v[i] = i;

  long long sum = 0;  // the total (499999500000) does not fit in a 32-bit int
  #pragma omp parallel for reduction(+:sum)
  for (int i = 0; i < m; i++)
    sum += v[i];
}

Best answer

How does OpenMP use atomic instructions inside a reduction? Doesn't it rely on atomics at all?


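The OpenMP standard does not specify how the reduction clause should (or should not) be implemented (e.g., whether or not it is based on atomic operations), so the implementation may vary from one concrete implementation of the OpenMP standard to another.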

For instance, is the variable sum in the code below accumulated with atomic + operator?


Nonetheless, the OpenMP standard does state the following:

The reduction clause can be used to perform some forms of recurrence calculations (...) in parallel. For parallel and work-sharing constructs, a private copy of each list item is created, one for each implicit task, as if the private clause had been used. (...) The private copy is then initialized as specified above. At the end of the region for which the reduction clause was specified, the original list item is updated by combining its original value with the final value of each of the private copies, using the combiner of the specified reduction-identifier.


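Therefore, based on that, one can infer that the variables used in the reduction clause will be private and, consequently, will not be updated atomically. Even if that were not the case, a concrete implementation of the OpenMP standard would be unlikely to rely on atomic operations for the instruction sum += v[i];, since (in this case) that would not be the most efficient strategy. For more information on why that is the case, see the following SO threads:
  • Why my parallel code using openMP atomic takes a longer time than serial code?
  • Why should I use a reduction rather than an atomic variable?

Very informally, a more efficient approach than using atomic is for each thread to have its own copy of the variable sum and, at the end of the parallel region, to save that copy into a resource shared among the threads. Depending on how the reduction is implemented, atomic operations might be used to update that shared resource. The resource is then picked up by the master thread, which reduces its contents and updates the original sum variable accordingly.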
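To make the contrast concrete, here is a minimal hand-written sketch of the two strategies. It is not what any particular OpenMP runtime actually does. The function names atomic_per_element_sum and manual_sum are hypothetical, and long long is used only so that the total cannot overflow.

```cpp
#include <vector>

// Variant 1: every element update is atomic. Correct, but all threads
// contend on the single shared variable once per iteration.
long long atomic_per_element_sum(const std::vector<int>& v)
{
  long long sum = 0;                         // shared among all threads
  const int m = static_cast<int>(v.size());

  #pragma omp parallel for
  for (int i = 0; i < m; i++)
  {
    #pragma omp atomic
    sum += v[i];
  }
  return sum;
}

// Variant 2: hand-written stand-in for reduction(+:sum), illustrative only.
// Each thread accumulates privately and touches the shared sum exactly once.
long long manual_sum(const std::vector<int>& v)
{
  long long sum = 0;                         // shared among all threads
  const int m = static_cast<int>(v.size());

  #pragma omp parallel
  {
    long long local = 0;                     // private copy, one per thread

    #pragma omp for
    for (int i = 0; i < m; i++)
      local += v[i];                         // no synchronization needed here

    #pragma omp atomic
    sum += local;                            // one atomic update per thread
  }
  return sum;
}
```

Both functions return the same value for the same input. The difference is the number of synchronized updates: one per element in the first variant, one per thread in the second.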
More formally, from OpenMP Reductions Under the Hood:

    After having revisited parallel reductions in detail you might still have some open questions about how OpenMP actually transforms your sequential code into parallel code. In particular, you might wonder how OpenMP detects the portion in the body of the loop that performs the reduction. As an example, this or a similar code fragment can often be found in code samples:

     #pragma omp parallel for reduction(+:x)
     for (int i = 0; i < n; i++)
         x -= some_value;
    

    You could also use - as reduction operator (which is actually redundant to +). But how does OpenMP isolate the update step x -= some_value? The discomforting answer is that OpenMP does not detect the update at all! The compiler treats the body of the for-loop like this:

     #pragma omp parallel for reduction(+:x)
     for (int i = 0; i < n; i++)
         x = some_expression_involving_x_or_not(x);
    

    As a result, the modification of x could also be hidden behind an opaque function call. This is a comprehensible decision from the point of view of a compiler developer. Unfortunately, this means that you have to ensure that all updates of x are compatible with the operation defined in the reduction clause.

    The overall execution flow of a reduction can be summarized as follows:

    1. Spawn a team of threads and determine the set of iterations that each thread j has to perform.
    2. Each thread declares a privatized variant of the reduction variable x initialized with the neutral element e of the corresponding monoid.
    3. All threads perform their iterations no matter whether or how they involve an update of the privatized variable.
    4. The result is computed as sequential reduction over the (local) partial results and the global variable x. Finally, the result is written back to x.
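The four steps above can be sketched by hand, applied to the x -= some_value loop from the quote. This is only an illustration under assumptions, not how any specific compiler lowers the clause: the function names with_reduction and by_hand, the long long types, and the std::vector guarded by a critical section to collect the partial results are all hypothetical choices.

```cpp
#include <numeric>
#include <vector>

// Uses the reduction clause; the body subtracts although the operator is +.
long long with_reduction(long long x, int n, long long some_value)
{
  #pragma omp parallel for reduction(+:x)
  for (int i = 0; i < n; i++)
    x -= some_value;
  return x;
}

// Hand-written sketch of steps 1 to 4; illustrative only.
long long by_hand(long long x, int n, long long some_value)
{
  // Shared container that receives one partial result per thread.
  std::vector<long long> partials;

  // Step 1: spawn a team of threads.
  #pragma omp parallel
  {
    // Step 2: privatized variable, initialized with the neutral element of +.
    long long x_private = 0;

    // Steps 1 and 3: split the iterations and update only the private copy.
    #pragma omp for
    for (int i = 0; i < n; i++)
      x_private -= some_value;

    // Publish the partial result; one synchronized write per thread.
    #pragma omp critical
    partials.push_back(x_private);
  }

  // Step 4: sequential reduction over the partial results and the original x.
  return std::accumulate(partials.begin(), partials.end(), x);
}
```

For example, with_reduction(100, 10, 3) and by_hand(100, 10, 3) both give 70: the private copies start at 0, accumulate a total of -30 between them, and are then combined with the original value 100 using +.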

Regarding c++ - How does OpenMP use atomic instructions inside the reduction clause?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65406478/
