openmp - Must ordered be at the end?

Reposted. Author: 行者123. Updated: 2023-12-03 21:31:28

#pragma omp parallel for ordered
for (int i = 0; i < n; ++i) {
    /* ... code happens nicely in parallel here ... */
    #pragma omp ordered
    {
        /* ... one at a time in order of i, as expected, good ... */
    }
    /* ... single threaded here but I expected parallel ... */
}

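I expected the next thread to enter the ordered section as soon as the current thread leaves it. Instead, the next thread only enters the ordered section once the current thread reaches the end of the for loop body. As a result, the code after the ordered section runs serially.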

The OpenMP 4.0 manual contains:

The ordered construct specifies a structured block in a loop region that will be executed in the order of the loop iterations. This sequentializes and orders the code within an ordered region while allowing code outside the region to run in parallel.

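The part I want to emphasize is the final clause, about code outside the region running in parallel. I read "outside" as including the code after the ordered section ends.

Is this expected? Does the ordered section really have to be at the end?

I searched for an answer and did find one other place where someone observed something similar almost 2 years ago: https://stackoverflow.com/a/32078625/403310 :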

Testing with gfortran 5.2, it appears everything after the ordered region is executed in order for each loop iteration, so having the ordered block at the beginning of the loop leads to serial performance while having the ordered block at the end of the loop does not have this implication as the code before the block is parallelized. Testing with ifort 15 is not as dramatic but I would still recommend structuring your code so your ordered block occurs after any code than needs parallelization in a loop iteration rather than before.

I'm using gcc 5.4.0 on Ubuntu 16.04.

Many thanks.

Best answer

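The ordered region does not need to be placed at the end. The behavior you observe is implementation-dependent and is a known flaw in libgomp (the OpenMP runtime library from gcc). I suppose the standard tolerates this behavior, but it is clearly not optimal.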
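One way to check how a particular compiler and runtime behave is to count how many threads are inside the code after the ordered region at the same time. The sketch below is only an illustration: `spin`, `max_overlap_after_ordered`, and the `schedule(static, 1)` clause are assumptions introduced here, not part of the question. It relies on pragmas only, so it also builds (and runs serially) without -fopenmp.

```c
#include <assert.h>

/* Hypothetical helper: burns CPU time so overlapping iterations can be seen. */
static void spin(long iters)
{
    volatile long sink = 0;
    for (long k = 0; k < iters; ++k)
        sink += k;
}

/* Returns the largest number of threads observed at the same time in the
   code that follows the ordered region. */
int max_overlap_after_ordered(int n, long work)
{
    assert(n >= 0 && work >= 0);
    int active = 0;   /* threads currently in the trailing section (shared) */
    int peak = 0;     /* highest value of 'active' seen so far (shared) */

    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; ++i) {
        spin(work);                 /* work before the ordered region */

        #pragma omp ordered
        {
            spin(1);                /* one at a time, in order of i */
        }

        int now;                    /* private: declared inside the loop body */
        #pragma omp atomic capture
        now = ++active;

        #pragma omp critical
        {
            if (now > peak)
                peak = now;
        }

        spin(work);                 /* work after the ordered region */

        #pragma omp atomic
        --active;
    }
    return peak;
}
```

With several threads available, a result of 1 would be consistent with the trailing code being serialized as described in the question, while a larger value shows iterations overlapping there. A result of 1 can also come from a single-core machine or from too little work per iteration, so treat it as a hint rather than proof.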

Technically, the compiler generates the following code from the annotations:

#pragma omp parallel for ordered
for (int i = 0; i < n; ++i) {
    /* ... code happens nicely in parallel here ... */
    GOMP_ordered_start();
    {
        /* ... one at a time in order of i, as expected, good ... */
    }
    GOMP_ordered_end();
    /* ... single threaded here but I expected parallel ... */
    GOMP_loop_ordered_static_next();
}

Unfortunately, GOMP_ordered_end is implemented as follows:

/* This function is called by user code when encountering the end of an
ORDERED block. With the current ORDERED implementation there's nothing
for us to do.

However, the current implementation has a flaw in that it does not allow
the next thread into the ORDERED section immediately after the current
thread exits the ORDERED section in its last iteration. The existance
of this function allows the implementation to change. */

void
GOMP_ordered_end (void)
{
}

I speculate that this was never a significant use case, since ordered is probably typically used along these lines:

#pragma omp parallel for ordered
for (...) {
    result = expensive_computation();
    #pragma omp ordered
    {
        append(results, result);
    }
}
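A fuller version of that pattern might look like the sketch below. The body of `expensive_computation`, the `collect_in_order` wrapper, the plain array standing in for `append`, and the `schedule(static, 1)` clause are all assumptions made for illustration. The chunk size of 1 is chosen so that consecutive iterations go to different threads and less time is spent waiting at the ordered region.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the answer's expensive_computation(). */
static long expensive_computation(int i)
{
    long acc = 0;
    for (int k = 0; k <= i; ++k)
        acc += (long)k * k;
    return acc;
}

/* Appends one result per iteration to results[], in iteration order, and
   returns the number of elements appended. results must hold n elements. */
size_t collect_in_order(long *results, int n)
{
    assert(results != NULL || n <= 0);
    size_t count = 0;   /* shared, but only touched inside the ordered region */

    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; ++i) {
        long result = expensive_computation(i);   /* runs in parallel */

        #pragma omp ordered
        {
            results[count++] = result;            /* serialized, in order of i */
        }
    }
    return count;
}
```

Because the ordered region is the last statement in the loop body, nothing is left after it that the libgomp behavior could serialize.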

The OpenMP runtime from the Intel compiler does not have this flaw.
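With libgomp, one possible way to follow the advice quoted in the question (keep the ordered block after the code that needs parallelization) is to move the trailing work into a second loop. This is a sketch under assumptions, not something the answer prescribes: `before_work`, `after_work`, `process`, and the running-sum ordered step are hypothetical, and it only applies when the trailing work can wait until the first loop has finished.

```c
#include <assert.h>

/* Hypothetical per-iteration work done before the ordered step. */
static long before_work(int x) { return (long)x * x; }

/* Hypothetical per-iteration work done after the ordered step. */
static long after_work(long v) { return 2 * v; }

/* ordered_out[i] receives the running sum of before_work(in[0..i]);
   post_out[i] receives after_work(ordered_out[i]). */
void process(const int *in, long *ordered_out, long *post_out, int n)
{
    assert(n >= 0);
    long running = 0;   /* shared, only touched inside the ordered region */

    /* Loop 1: parallel work first, ordered region as the last statement. */
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; ++i) {
        long pre = before_work(in[i]);

        #pragma omp ordered
        {
            running += pre;
            ordered_out[i] = running;
        }
    }

    /* Loop 2: the work that used to follow the ordered region, now in a
       plain parallel loop with no ordering constraint. */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        post_out[i] = after_work(ordered_out[i]);
}
```

The trade-off is an extra buffer for the intermediate values and an extra barrier between the two loops.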

Regarding "openmp - Must ordered be at the end?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43540605/
