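I understand that OpenMP uses a thread pool to reuse physical threads. My question is: is the thread number returned by omp_get_thread_num bound to a physical thread?
In other words, is the mapping from omp_get_thread_num to gettid (gettid man page) always the same across all parallel regions?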
Section 3.2.4 of the OpenMP specification (link)
Binding
The binding thread set for an omp_get_thread_num region is the current team. The binding region for an omp_get_thread_num region is the innermost enclosing parallel region.
Effect
The omp_get_thread_num routine returns the thread number of the calling thread, within the team executing the parallel region to which the routine region binds. The thread number is an integer between 0 and one less than the value returned by omp_get_num_threads, inclusive. The thread number of the master thread of the team is 0. The routine returns 0 if it is called from the sequential part of a program.
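The guarantees quoted above concern only the thread number within a team; they say nothing about which OS thread carries that number. As a minimal sketch of what the text does promise, the function below checks that the routine returns 0 in the sequential part and a value in [0, omp_get_num_threads() - 1] inside a parallel region. The function name thread_numbers_follow_spec is hypothetical, and the stub definitions are an assumption added only so the file still builds when compiled without -fopenmp.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: fallback stubs so the sketch also compiles without -fopenmp.
static int omp_get_thread_num() { return 0; }
static int omp_get_num_threads() { return 1; }
#endif

// Hypothetical helper: returns true if the observed thread numbers match
// the Effect paragraph of the specification. Call it from sequential code.
bool thread_numbers_follow_spec() {
    // In the sequential part of the program the routine returns 0.
    bool ok = (omp_get_thread_num() == 0);

    #pragma omp parallel
    {
        int num = omp_get_thread_num();
        int team_size = omp_get_num_threads();
        // Every thread number lies between 0 and team_size - 1, inclusive.
        bool in_range = (num >= 0 && num < team_size);

        #pragma omp critical
        ok = ok && in_range;
    }
    return ok;
}
```

Calling thread_numbers_follow_spec() from main should return true on a conforming implementation, whichever OS threads end up executing the region.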
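A simple test using the gettid system call
On CentOS 7 with GCC, the code below gives me the same mapping in both parallel regions. But I am not sure whether this is just a special case.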
#include <unistd.h>
#include <sys/syscall.h>
#include <iostream>
#include <omp.h>

int main(int argc, char *argv[]) {
    std::cout << "Entering region 1:" << std::endl;
    #pragma omp parallel
    {
        #pragma omp critical
        std::cout << "num: " << omp_get_thread_num() << " => tid: " << syscall(__NR_gettid) << std::endl;
    }
    std::cout << "------------------------------------------------------------" << std::endl;
    std::cout << "Entering region 2:" << std::endl;
    #pragma omp parallel
    {
        #pragma omp critical
        std::cout << "num: " << omp_get_thread_num() << " => tid: " << syscall(__NR_gettid) << std::endl;
    }
    return 0;
}
This is the output I get on CentOS 7 with GCC (5.2).
Entering region 1:
num: 0 => tid: 625
num: 5 => tid: 630
num: 7 => tid: 632
num: 11 => tid: 636
num: 3 => tid: 628
num: 13 => tid: 638
num: 1 => tid: 626
num: 9 => tid: 634
num: 6 => tid: 631
num: 10 => tid: 635
num: 12 => tid: 637
num: 2 => tid: 627
num: 4 => tid: 629
num: 8 => tid: 633
num: 14 => tid: 639
num: 15 => tid: 640
------------------------------------------------------------
Entering region 2:
num: 4 => tid: 629
num: 12 => tid: 637
num: 15 => tid: 640
num: 5 => tid: 630
num: 8 => tid: 633
num: 13 => tid: 638
num: 0 => tid: 625
num: 9 => tid: 634
num: 1 => tid: 626
num: 6 => tid: 631
num: 3 => tid: 628
num: 7 => tid: 632
num: 10 => tid: 635
num: 11 => tid: 636
num: 2 => tid: 627
num: 14 => tid: 639
Compiled with: g++ toy.cpp -fopenmp
Best answer
No, this is not guaranteed across multiple parallel regions. Here is a slightly modified example:
#include <unistd.h>
#include <sys/syscall.h>
#include <iostream>
#include <omp.h>

int main(int argc, char *argv[]) {
    std::cout << "Entering region 1:" << std::endl;
    #pragma omp parallel
    {
        #pragma omp critical
        std::cout << "num: " << omp_get_thread_num() << " => tid: " << syscall(__NR_gettid) << std::endl;
    }
    std::cout << "------------------------------------------------------------" << std::endl;
    std::cout << "Entering region 2:" << std::endl;
    // shrinks the threadpool for libgomp
    #pragma omp parallel num_threads(2)
    {
        #pragma omp critical
        std::cout << "num: " << omp_get_thread_num() << " => tid: " << syscall(__NR_gettid) << std::endl;
    }
    std::cout << "------------------------------------------------------------" << std::endl;
    std::cout << "Entering region 3:" << std::endl;
    #pragma omp parallel
    {
        #pragma omp critical
        std::cout << "num: " << omp_get_thread_num() << " => tid: " << syscall(__NR_gettid) << std::endl;
    }
    return 0;
}
Here is the output (gcc 8.2.1):
Entering region 1:
num: 0 => tid: 11845
num: 6 => tid: 11851
num: 3 => tid: 11848
num: 5 => tid: 11850
num: 7 => tid: 11852
num: 4 => tid: 11849
num: 2 => tid: 11847
num: 1 => tid: 11846
------------------------------------------------------------
Entering region 2:
num: 1 => tid: 11846
num: 0 => tid: 11845
------------------------------------------------------------
Entering region 3:
num: 2 => tid: 11853
num: 7 => tid: 11858
num: 5 => tid: 11856
num: 4 => tid: 11855
num: 1 => tid: 11846
num: 3 => tid: 11854
num: 0 => tid: 11845
num: 6 => tid: 11857
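In region 3, thread numbers 2 through 7 run on newly created threads (tids 11853 to 11858), while thread numbers 0 and 1 keep tids 11845 and 11846. The OpenMP standard does not specify a thread pool across parallel regions, so whether a thread number maps to the same OS thread from one region to the next depends on the implementation.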
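To check the mapping programmatically instead of reading printed output, one could record the num-to-tid table of each region and compare the tables. The following is a minimal sketch, assuming Linux (gettid is reached through syscall, as in the question). The helper names current_tid, snapshot_mapping and mapping_survives_shrink are hypothetical, and the omp_get_thread_num stub is an assumption added so the file still builds without -fopenmp.

```cpp
#include <sys/syscall.h>
#include <unistd.h>
#include <map>
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: fallback stub so the sketch also compiles without -fopenmp.
static int omp_get_thread_num() { return 0; }
#endif

// Kernel thread ID of the calling thread (Linux-specific).
static long current_tid() {
    return static_cast<long>(syscall(SYS_gettid));
}

// Hypothetical helper: runs one parallel region and records, for each
// OpenMP thread number, the kernel thread ID that executed it.
std::map<int, long> snapshot_mapping() {
    std::map<int, long> mapping;
    #pragma omp parallel
    {
        int num = omp_get_thread_num();
        long tid = current_tid();

        // std::map is not thread-safe, so serialize the insertions.
        #pragma omp critical
        mapping[num] = tid;
    }
    return mapping;
}

// Hypothetical helper: repeats the three-region experiment (full team,
// team of two, full team) and reports whether the mapping was preserved.
bool mapping_survives_shrink() {
    std::map<int, long> before = snapshot_mapping();

    #pragma omp parallel num_threads(2)
    {
        // Do a little real work so the region is not trivially empty.
        (void)current_tid();
    }

    std::map<int, long> after = snapshot_mapping();
    return before == after;
}
```

The return value of mapping_survives_shrink() is deliberately not assumed here: it depends on the OpenMP runtime and its version, which is the point of the answer. The only entry that can be relied on in a non-nested region is thread number 0, which is the thread that encountered the parallel construct.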
Source: a similar question on Stack Overflow, "c++ - In OpenMP, is omp_get_thread_num bound to a physical thread?": https://stackoverflow.com/questions/52907823/