c - Unable to set OpenMP thread affinity in a forked process

Repost. Author: 塔克拉玛干. Updated: 2023-11-02 23:57:20
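I'm trying to run two processes on different CPUs using OpenMP. In this case each CPU has 6 cores with hyper-threading (so 12 hardware threads). The processes need to do some synchronisation, which seems easier if they know each other's PID. So I launch one process, sigC, from sigS using fork() and execve(), passing it a different value of the GOMP_CPU_AFFINITY environment variable. After the fork()/execve() call, sigS still has the right affinity, but sigC prints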

libgomp: no cpus left for affinity setting

and all of its threads run on the same core.

Code for sigS:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>   // exit()
#include <unistd.h>
#include <errno.h>
#include <omp.h>
#include <sched.h>

int main( void )
{
    omp_set_num_threads(12); //12 hardware threads per CPU
    //this loop runs as expected
    #pragma omp parallel for
    for( int i = 0; i<12; i++ ) {
        #pragma omp critical
        {
            printf("TEST PRE-FORK: I am thread %2d running on core %d\n",
                   omp_get_thread_num(), sched_getcpu());
        }
    }

    pid_t childpid = fork();

    if( childpid < 0 ) {
        perror("Fork failed");
    } else {
        if( childpid == 0 ) { //<------ attempt to set affinity for child
            //change the affinity for the other process so it runs
            //on the other cpu
            char ompEnv[] = "GOMP_CPU_AFFINITY=6-11 18-23";
            char * const args[] = { "./sigC", (char*)0 };
            char * const envArgs[] = { ompEnv, (char*)0 };
            execve(args[0], args, envArgs);
            perror("Returned from execve");
            exit(1);
        } else {
            omp_set_num_threads(12);
            printf("PARENT: my pid = %d\n", getpid());
            printf("PARENT: child pid = %d\n", childpid);
            sleep(5); //sleep for a bit so child process prints first

            //This loop gives the same thread core/pairings as above
            //this is expected
            #pragma omp parallel for
            for( int i = 0; i < 12; i++ ) {
                #pragma omp critical
                {
                    printf("PARENT: I'm thread %2d, on core %d.\n",
                           omp_get_thread_num(), sched_getcpu());
                }
            }
        }
    }
    return 0;
}

The code for sigC contains just an omp parallel for loop, but for completeness:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <omp.h>
#include <sched.h>

int main( void )
{
    omp_set_num_threads(12);
    printf("CHILD: my pid = %d\n", getpid());
    printf("CHILD: parent pid = %d\n", getppid());
    //I expect this loop to have the core pairings as I specified in execve
    //i.e thread 0 -> core 6, 1 -> 7, ... 6 -> 18, 7 -> 19 ... 11 -> 23
    #pragma omp parallel for
    for( int i = 0; i < 12; i++ ) {
        #pragma omp critical
        {
            printf("CHILD: I'm thread %2d, on core %d.\n",
                   omp_get_thread_num(), sched_getcpu());
        }
    }
    return 0;
}

Output:

$ env GOMP_CPU_AFFINITY="0-5 12-17" ./sigS

This part is as expected:

TEST PRE-FORK: I'm thread  0, on core 0.
TEST PRE-FORK: I'm thread 11, on core 17.
TEST PRE-FORK: I'm thread 5, on core 5.
TEST PRE-FORK: I'm thread 6, on core 12.
TEST PRE-FORK: I'm thread 3, on core 3.
TEST PRE-FORK: I'm thread 1, on core 1.
TEST PRE-FORK: I'm thread 8, on core 14.
TEST PRE-FORK: I'm thread 10, on core 16.
TEST PRE-FORK: I'm thread 7, on core 13.
TEST PRE-FORK: I'm thread 2, on core 2.
TEST PRE-FORK: I'm thread 4, on core 4.
TEST PRE-FORK: I'm thread 9, on core 15.
PARENT: my pid = 11009
PARENT: child pid = 11021

This is the problem: all threads in the child process run on core 0.

libgomp: no CPUs left for affinity setting
CHILD: my pid = 11021
CHILD: parent pid = 11009
CHILD: I'm thread 1, on core 0.
CHILD: I'm thread 0, on core 0.
CHILD: I'm thread 4, on core 0.
CHILD: I'm thread 5, on core 0.
CHILD: I'm thread 6, on core 0.
CHILD: I'm thread 7, on core 0.
CHILD: I'm thread 8, on core 0.
CHILD: I'm thread 9, on core 0.
CHILD: I'm thread 10, on core 0.
CHILD: I'm thread 11, on core 0.
CHILD: I'm thread 3, on core 0.

(I've omitted the parent's thread output because it is the same as the pre-fork output.)

Any ideas on how to fix this, or on whether this is even the right approach?

Best answer

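The fork()-ed child process inherits the affinity mask of its parent. libgomp intersects this affinity mask with the set from GOMP_CPU_AFFINITY and ends up with an empty set, because the two sets are complementary and share no CPU. This behaviour is not documented, but a look at the libgomp source code confirms that this is indeed the case.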
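The intersection described above can be illustrated with glibc's cpu_set_t macros. This is only a sketch of the set arithmetic, not libgomp's actual code: the helper names (add_cpu_range, count_common_cpus, usable_cpus_in_child) are hypothetical, and it assumes the inherited mask equals the parent's GOMP_CPU_AFFINITY list "0-5 12-17". The mask the child really inherits may be narrower (for example, only the CPU of the thread that called fork()), but the result is the same as long as it shares no CPU with "6-11 18-23".

```c
#define _GNU_SOURCE
#include <sched.h>

/* Add CPUs lo..hi (inclusive) to the set. */
static void add_cpu_range(cpu_set_t *set, int lo, int hi)
{
    for (int cpu = lo; cpu <= hi; cpu++)
        CPU_SET(cpu, set);
}

/* Return how many CPUs are present in both sets. */
static int count_common_cpus(const cpu_set_t *inherited,
                             const cpu_set_t *requested)
{
    cpu_set_t common;
    CPU_AND(&common, inherited, requested);
    return CPU_COUNT(&common);
}

/*
 * Hypothetical model of the question's scenario on a 24-CPU machine.
 * mask_was_reset == 0: the child keeps the (assumed) parent mask 0-5,12-17.
 * mask_was_reset != 0: the child's mask was widened to CPUs 0-23 first.
 * Returns the number of CPUs left for the list "6-11 18-23".
 */
int usable_cpus_in_child(int mask_was_reset)
{
    cpu_set_t inherited, requested;

    CPU_ZERO(&inherited);
    CPU_ZERO(&requested);

    if (mask_was_reset) {
        add_cpu_range(&inherited, 0, 23);
    } else {
        add_cpu_range(&inherited, 0, 5);
        add_cpu_range(&inherited, 12, 17);
    }

    add_cpu_range(&requested, 6, 11);
    add_cpu_range(&requested, 18, 23);

    return count_common_cpus(&inherited, &requested);
}
```

With the inherited mask left alone, no CPU from the requested list survives the intersection, which matches the "no CPUs left" message. Once the mask covers all CPUs, all 12 requested CPUs remain usable.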

The solution is to reset the child's affinity mask before calling execve():

if (childpid == 0) { //<------ attempt to set affinity for child
    cpu_set_t *mask;
    size_t size;
    int nrcpus = 256; // 256 CPUs should be more than enough

    // Reset the CPU affinity mask
    mask = CPU_ALLOC(nrcpus);
    size = CPU_ALLOC_SIZE(nrcpus);
    for (int i = 0; i < nrcpus; i++)
        CPU_SET_S(i, size, mask);
    if (sched_setaffinity(0, size, mask) == -1) {
        perror("sched_setaffinity"); // handle the error as appropriate
    }
    CPU_FREE(mask);

    //change the affinity for the other process so it runs
    //on the other cpu
    char ompEnv[] = "GOMP_CPU_AFFINITY=6-11 18-23";
    char * const args[] = {"./sigC", (char*)0};
    char * const envArgs[] = {ompEnv, (char*)0};
    execve(args[0], args, envArgs);
    perror("Returned from execve");
    exit(1);
} else {
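To check that the reset takes effect, the mask can be counted before and after it. The sketch below is a variant of the fix, not the answer's code: it uses a fixed-size cpu_set_t (up to CPU_SETSIZE CPUs) instead of CPU_ALLOC, and the function names count_allowed_cpus and reset_affinity_to_all_cpus are hypothetical. It assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Number of CPUs the calling thread may run on, or -1 on error. */
int count_allowed_cpus(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return -1;
    return CPU_COUNT(&mask);
}

/*
 * Request every CPU that a fixed-size cpu_set_t can name.
 * Bits for CPUs that do not exist are harmless here; the kernel
 * only keeps the CPUs that are actually available to the process.
 * Returns 0 on success, -1 on error (errno is set).
 */
int reset_affinity_to_all_cpus(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

In the child branch, reset_affinity_to_all_cpus() would be called between fork() and execve(). Printing count_allowed_cpus() before and after the call shows whether the inherited mask was actually narrower than the full machine.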

Regarding "c - Unable to set OpenMP thread affinity in a forked process", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15256826/
