
c++ - Simplest example of running in parallel with OpenMP

Reposted. Author: 行者123. Updated: 2023-12-03 12:58:18

Consider the following code construct:

    int n = 0;

    #pragma omp parallel for collapse(2)
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            n++;

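The above is the simplest demonstration of something similar I am trying to do in code that takes a long time to run. The main goal is to parallelize the loops and so reduce the run time.
I am new to OpenMP and only know a few directives. In the code above, the final result comes out wrong (n = 9 is the right answer). I guess the loops are trying to access the same memory location simultaneously.
Can someone provide the simplest possible solution for this? Please note that I know very little about the topic. Any related reading material would also help. Thank you.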

Best answer

I guess, the loops are trying to access the same memory location simultaneously.


TL;DR: Yes, you have a race condition when updating the variable n. One way to fix it is to use the OpenMP reduction clause.

I am new to OpenMP, just know some commands and that's all. Now in the code I have written above, the final result comes wrong (n = 9 is the right answer).


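Longer answer: #pragma omp parallel for creates a parallel region, and the iterations of the loop it encloses are divided among the threads of that region using the default chunk size and the default schedule, which is typically static. Keep in mind, however, that the default schedule can differ between concrete implementations of the OpenMP standard.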
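
To make the distribution of iterations visible, here is a minimal sketch that records which thread ran each iteration. The helper name iteration_owners is hypothetical, and schedule(static) is requested explicitly so the sketch does not depend on an implementation's default schedule. The thread numbers you observe depend on how many threads the runtime starts.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: returns, for each iteration, the number of the
// thread that executed it.
std::vector<int> iteration_owners(int iterations) {
    std::vector<int> owner(iterations, -1);

    // Each iteration writes only its own element, so there is no race here.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < iterations; i++) {
#ifdef _OPENMP
        owner[i] = omp_get_thread_num();
#else
        owner[i] = 0;  // built without OpenMP: one thread runs everything
#endif
    }
    return owner;
}
```

With a static schedule and no chunk size given, each thread typically receives one contiguous block of iterations, so the returned vector tends to look like runs of equal thread numbers.
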
The OpenMP 5.1 specification gives a more formal description:

The worksharing-loop construct specifies that the iterations of one or more associated loops will be executed in parallel by threads in the team in the context of their implicit tasks. The iterations are distributed across threads that already exist in the team that is executing the parallel region to which the worksharing-loop region binds.


Moreover,

The parallel loop construct is a shortcut for specifying a parallel construct containing a loop construct with one or more associated loops and no other statements.


Or, informally, #pragma omp parallel for is a combination of the construct #pragma omp parallel with #pragma omp for.
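
As an illustration of that equivalence, the two functions in this sketch are intended to behave the same way. The names squares_combined and squares_split are made up for the example, and the loop body is chosen so that every iteration writes to its own element and no race can occur.

```cpp
#include <vector>

// Combined form: a single directive creates the team of threads and
// shares the loop iterations among them.
std::vector<int> squares_combined(int count) {
    std::vector<int> out(count);
    #pragma omp parallel for
    for (int i = 0; i < count; i++)
        out[i] = i * i;
    return out;
}

// Split form: `parallel` creates the team, and the `for` inside it
// shares the loop iterations among the threads of that team.
std::vector<int> squares_split(int count) {
    std::vector<int> out(count);
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < count; i++)
            out[i] = i * i;
    }
    return out;
}
```

The split form becomes useful when the parallel region needs other statements besides the loop, such as per-thread setup before it.
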
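So what happens in your code is that multiple threads modify the value of n at the same time. To fix this, use the OpenMP reduction clause, which the OpenMP standard describes as follows: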

The reduction clause can be used to perform some forms of recurrence calculations (...) in parallel. For parallel and work-sharing constructs, a private copy of each list item is created, one for each implicit task, as if the private clause had been used. (...) The private copy is then initialized as specified above. At the end of the region for which the reduction clause was specified, the original list item is updated by combining its original value with the final value of each of the private copies, using the combiner of the specified reduction-identifier.
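
The quoted behavior can be approximated by hand, which may help to see what the clause does. The following is a sketch only, not how any particular OpenMP runtime implements reductions; the function name and the local_n variable are hypothetical.

```cpp
// Hypothetical helper: counts the 3 x 3 iterations without the reduction
// clause, mimicking its steps manually.
int count_with_manual_reduction() {
    int n = 0;

    #pragma omp parallel
    {
        // Private copy, one per thread, initialized to 0 (the identity of +).
        int local_n = 0;

        #pragma omp for collapse(2)
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                local_n++;  // each thread touches only its own copy

        // Combine step: one thread at a time adds its private copy to n.
        #pragma omp critical
        n += local_n;
    }
    return n;
}
```

Because the threads only share n during the short combine step, the loop itself runs without any synchronization.
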


For a more detailed explanation of how the reduction clause works, see this SO thread.
So, to fix the race condition in your code, just change it to:
    int n = 0;

    #pragma omp parallel for collapse(2) reduction(+:n)
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            n++;
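
For experimenting with other loop bounds, the fixed loop can be wrapped in a function. This is a minimal sketch; count_iterations, rows and cols are illustrative names that do not appear in the original question.

```cpp
// Hypothetical wrapper around the fixed loop: counts rows * cols iterations.
int count_iterations(int rows, int cols) {
    int n = 0;

    // collapse(2) merges both loops into a single iteration space, and
    // reduction(+:n) gives each thread a private n that is summed at the end.
    #pragma omp parallel for collapse(2) reduction(+:n)
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            n++;

    return n;
}
```

Calling count_iterations(3, 3) corresponds to the loop in the question. With GCC or Clang, OpenMP is enabled with the -fopenmp flag; without it the pragma is ignored and the loop simply runs serially, which gives the same count.
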

The original question, "c++ - Simplest example of running in parallel with OpenMP", can be found on Stack Overflow: https://stackoverflow.com/questions/65833061/
