
c++ - How does OpenMP use atomic instructions in a reduction clause?

Reposted. Author: 行者123. Last updated: 2023-12-03 13:16:33

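How does OpenMP use atomic instructions inside a reduction?
Does it not rely on atomic instructions at all?
For instance, is the variable sum in the code below accumulated with an atomic + operator?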

#include <omp.h>
#include <vector>

using namespace std;
int main()
{
    int m = 1000000;
    vector<int> v(m);
    for (int i = 0; i < m; i++)
        v[i] = i;

    // long long: the total (499999500000) does not fit in a 32-bit int
    long long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < m; i++)
        sum += v[i];
}

Best answer

How does OpenMP uses atomic instruction inside reduction? Doesn't it rely on atomic at all?


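Since the OpenMP standard does not specify how the reduction clause should (or should not) be implemented (for example, whether or not it is based on atomic operations), the implementation may differ from one concrete implementation of the standard to another.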

For instance, is the variable sum in the code below accumulated with atomic + operator?


Nonetheless, the OpenMP standard does say the following:

The reduction clause can be used to perform some forms of recurrence calculations (...) in parallel. For parallel and work-sharing constructs, a private copy of each list item is created, one for each implicit task, as if the private clause had been used. (...) The private copy is then initialized as specified above. At the end of the region for which the reduction clause was specified, the original list item is updated by combining its original value with the final value of each of the private copies, using the combiner of the specified reduction-identifier.


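Based on this, one can infer that the variables used in the reduction clause are private and are therefore not updated atomically. Even if that were not the case, it is unlikely that a concrete implementation of the OpenMP standard would rely on an atomic operation for the statement sum += v[i];, because in this case it would not be the most efficient strategy. For more information on why, see the following SO threads: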
  • Why my parallel code using openMP atomic takes a longer time than serial code?
  • Why should I use a reduction rather than an atomic variable?
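
To make the comparison in those threads concrete, here is a minimal sketch of the two strategies side by side. The function names are made up for illustration, and the code only shows what each variant asks of the runtime; it says nothing about how a particular OpenMP implementation compiles either one. It uses pragmas only, so it also builds (and runs serially) when OpenMP is disabled.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Variant A (hypothetical name): a single shared accumulator.
// Every one of the n updates is an atomic read-modify-write on the same
// memory location, so all threads contend for it on each iteration.
long long sum_with_atomic(const std::vector<int>& v) {
    long long sum = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        sum += v[i];
    }
    return sum;
}

// Variant B (hypothetical name): the reduction clause from the question.
// Each thread accumulates into its own private copy; the copies are
// combined once, at the end of the region.
long long sum_with_reduction(const std::vector<int>& v) {
    long long sum = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

// Self-check: both variants must agree with a plain serial sum.
void check_variants_agree(const std::vector<int>& v) {
    const long long expected = std::accumulate(v.begin(), v.end(), 0LL);
    assert(sum_with_atomic(v) == expected);
    assert(sum_with_reduction(v) == expected);
}
```

Both variants return the same value; they differ only in how often threads have to synchronize, which is the cost the linked threads discuss.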

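Very informally, a more efficient approach than using atomic is for each thread to keep its own copy of the variable sum and, at the end of the parallel region, to save that copy into a resource shared among the threads. Depending on how the reduction is implemented, atomic operations might be used to update that shared resource. The resource is then picked up by the master thread, which reduces its contents and updates the original sum variable accordingly.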
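
The informal description above can be written out by hand. This is a simplified sketch, not what any specific OpenMP runtime generates: here the "shared resource" is the result variable itself, each thread adds its private copy to it with one atomic update, and there is no separate final pass by the master thread. The function name sum_by_hand is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: sums v the way a reduction could be written by hand.
long long sum_by_hand(const std::vector<int>& v) {
    // The loop index is an int, as in the question's code.
    assert(v.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int n = static_cast<int>(v.size());
    long long sum = 0;  // shared between all threads

    #pragma omp parallel
    {
        long long local = 0;  // each thread's own copy, starts at 0

        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            local += v[i];  // private update, no synchronization needed

        #pragma omp atomic
        sum += local;  // one atomic update per thread, not per element
    }  // implicit barrier: every thread has contributed before we return

    return sum;
}
```

The atomic update still exists, but it runs once per thread instead of once per loop iteration.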
More formally, from OpenMP Reductions Under the Hood:

    After having revisited parallel reductions in detail you might still have some open questions about how OpenMP actually transforms your sequential code into parallel code. In particular, you might wonder how OpenMP detects the portion in the body of the loop that performs the reduction. As an example, this or a similar code fragment can often be found in code samples:

    #pragma omp parallel for reduction(+:x)
    for (int i = 0; i < n; i++)
        x -= some_value;

    You could also use - as reduction operator (which is actually redundant to +). But how does OpenMP isolate the update step x -= some_value? The discomforting answer is that OpenMP does not detect the update at all! The compiler treats the body of the for-loop like this:

    #pragma omp parallel for reduction(+:x)
    for (int i = 0; i < n; i++)
        x = some_expression_involving_x_or_not(x);

    As a result, the modification of x could also be hidden behind an opaque function call. This is a comprehensible decision from the point of view of a compiler developer. Unfortunately, this means that you have to ensure that all updates of x are compatible with the operation defined in the reduction clause.
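
A small sketch of an update hidden behind a function call, as the passage describes. The names opaque_update and subtract_all are invented for this example. The update is a subtraction, which is compatible with reduction(+:x): with OpenMP enabled, each private copy starts at 0 and ends at minus its partial sum, and adding those copies to the original value gives the same result as the serial loop.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical opaque update. OpenMP does not analyze what happens in here;
// the call simply runs on whichever private copy of x the thread holds.
long long opaque_update(long long x, int value) {
    return x - value;
}

// Hypothetical helper: subtracts every element of v from start.
long long subtract_all(long long start, const std::vector<int>& v) {
    // The loop index is an int, as in the quoted fragment.
    assert(v.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int n = static_cast<int>(v.size());
    long long x = start;

    #pragma omp parallel for reduction(+:x)
    for (int i = 0; i < n; i++)
        x = opaque_update(x, v[i]);  // compatible with +: private copies sum up

    return x;
}
```

An update such as x *= 2 would not be compatible: under reduction(+:x) the private copies start at 0, so multiplying them leaves 0 and the combined result no longer matches the serial loop.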

    The overall execution flow of a reduction can be summarized as follows:

    1. Spawn a team of threads and determine the set of iterations that each thread j has to perform.
    2. Each thread declares a privatized variant of the reduction variable x initialized with the neutral element e of the corresponding monoid.
    3. All threads perform their iterations no matter whether or how they involve an update of the privatized variable.
    4. The result is computed as sequential reduction over the (local) partial results and the global variable x. Finally, the result is written back to x.
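
The four steps above can be mimicked by hand for the + operator. This is an illustrative sketch under stated assumptions, not the code an OpenMP compiler emits: the helper names (max_team_size, thread_id, reduce_by_hand) are hypothetical, and the partial results are stored in a plain vector with one slot per thread. The #ifdef _OPENMP guards let the same code build and run serially without OpenMP.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helpers that fall back to a single thread without OpenMP.
static int max_team_size() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

static int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Hypothetical helper: returns x combined with all elements of v under +.
long long reduce_by_hand(long long x, const std::vector<int>& v) {
    assert(v.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int n = static_cast<int>(v.size());
    std::vector<long long> partial(max_team_size(), 0);  // one slot per thread

    #pragma omp parallel                 // step 1: spawn the team
    {
        long long local = 0;             // step 2: private copy, neutral element of +

        #pragma omp for                  // step 1: iterations are split among threads
        for (int i = 0; i < n; i++)
            local += v[i];               // step 3: each thread runs its iterations

        partial[thread_id()] = local;    // publish this thread's partial result
    }

    for (long long p : partial)          // step 4: sequential reduction over the
        x += p;                          //         partial results and the original x
    return x;
}
```

Note that the original value of x takes part in the final combination, which matches the wording quoted from the standard earlier in the answer.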

Regarding "c++ - How does OpenMP use atomic instructions in a reduction clause?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65406478/
