
c - How to avoid divergent branches in a simple if statement in CUDA

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:46:57

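I would like to know how to avoid branching in a kernel when threads have to compare and store values from local, shared, or global variables. For example, the code below checks a shared variable and sets a bool flag to true accordingly: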

if (shared_variable < local_value) {
    shared_bool_var = true;
}
__syncthreads();

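The problem here is that every thread accesses the same variable, and all of them overwrite it with true. So I would use a threadIdx.x check to let only one thread access that variable, but this would cause branch divergence.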

if (threadIdx.x == 0 && shared_variable < local_value) {
    shared_bool_var = true;
}
__syncthreads();

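The question is: which one should I prefer? Both cases seem safe, because __syncthreads() prevents hazards (read-after-write and so on). I prefer the second solution, but real code is usually not this simple.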

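In the aforementioned case, is it safe to allow all threads to access 1 shared memory location, or would this cause a bank conflict or serialization of memory access? Thanks.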

Best answer

One important thing to note: semantically and functionally, the two code sections are not equivalent:

// set var to true if ANY thread in the block satisfies the predicate
if (shared_variable < local_value) {
    shared_bool_var = true;
}

// set var to true only if THE FIRST thread in the block (threadIdx.x == 0) satisfies the predicate
if (threadIdx.x == 0 && shared_variable < local_value) {
    shared_bool_var = true;
}
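To make the difference concrete, here is a minimal sketch that runs both variants in one block and reads both flags back. The kernel name `compare_flags`, the host helper `run_compare_flags`, and the `threshold` parameter that stands in for `shared_variable` are hypothetical names introduced for illustration; they are not from the original question.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical demo kernel: out[0] gets the "any thread" flag,
// out[1] gets the "thread 0 only" flag.
__global__ void compare_flags(const int *local_values, int threshold, int *out)
{
    __shared__ int  shared_variable;
    __shared__ bool any_flag;
    __shared__ bool first_flag;

    if (threadIdx.x == 0) {
        shared_variable = threshold;
        any_flag = false;
        first_flag = false;
    }
    __syncthreads();  // make the initial values visible to the whole block

    int local_value = local_values[threadIdx.x];

    // Variant 1: every thread whose predicate holds stores the same value.
    if (shared_variable < local_value) {
        any_flag = true;
    }
    // Variant 2: only the predicate of thread 0 is taken into account.
    if (threadIdx.x == 0 && shared_variable < local_value) {
        first_flag = true;
    }
    __syncthreads();  // all stores are done before the flags are read back

    if (threadIdx.x == 0) {
        out[0] = any_flag;
        out[1] = first_flag;
    }
}

// Hypothetical host helper: launches one block of n threads and returns
// any_flag in bit 0 and first_flag in bit 1, or -1 on a CUDA error.
int run_compare_flags(const int *host_values, int n, int threshold)
{
    assert(n > 0 && n <= 1024);  // one block only
    int *d_values = nullptr, *d_out = nullptr;
    int h_out[2] = {0, 0};
    if (cudaMalloc(&d_values, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, 2 * sizeof(int)) != cudaSuccess) {
        cudaFree(d_values);
        return -1;
    }
    cudaMemcpy(d_values, host_values, n * sizeof(int), cudaMemcpyHostToDevice);
    compare_flags<<<1, n>>>(d_values, threshold, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_values);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;
    return h_out[0] | (h_out[1] << 1);
}
```

With a threshold of 5 and a single value above it held by a thread other than thread 0, only the first flag should be set; if thread 0 also holds a value above the threshold, both flags should be set.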

But back to your question:

In the aforementioned case, is it safe to allow all threads to access 1 shared memory location, or would this cause a bank conflict or serialization of memory access?

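After checking the CUDA programming guide, it seems there is some kind of write-collapsing mechanism that prevents write accesses to the same address from being serialized: instead, only one thread writes its value (which thread does so is undefined).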

CC 1.x:

If a non-atomic instruction executed by a warp writes to the same location in shared memory for more than one of the threads of the warp, only one thread per half-warp performs a write and which thread performs the final write is undefined.

CC 2.x and above:

A shared memory request for a warp does not generate a bank conflict between two threads that access any address within the same 32-bit word (even though the two addresses fall in the same bank): In that case, [...] for write accesses, each address is written by only one of the threads (which thread performs the write is undefined).

Also:

So I would use a threadIdx.x check to only let one thread access that variable but this would cause branch divergence.

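This is no "more divergent" than the first code. The first section diverges whenever the threads of a warp do not all evaluate the predicate the same way. The second section diverges only in the first warp of each block. In both cases, these branches should have no impact on performance: there is no else body, and the if body is a single instruction.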
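If the goal is the "any thread" semantics of the first section, one alternative the answer does not mention is the `__syncthreads_or()` intrinsic (compute capability 2.x and above), which combines the barrier with a block-wide OR of a per-thread predicate, so no branch or shared flag is written in the source. The following is a minimal sketch under that assumption; the kernel name `any_below`, the host helper `run_any_below`, and the `threshold` parameter standing in for `shared_variable` are hypothetical.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernel: block-wide "any" without a shared flag.
__global__ void any_below(const int *local_values, int threshold, int *out)
{
    int local_value = local_values[threadIdx.x];

    // Barrier plus reduction: returns non-zero to every thread of the block
    // if the predicate is non-zero for at least one thread.
    // Must be reached by all threads of the block.
    int any = __syncthreads_or(threshold < local_value);

    if (threadIdx.x == 0) {
        *out = (any != 0);
    }
}

// Hypothetical host helper: launches one block of n threads and returns
// 1 if any value exceeds the threshold, 0 if none does, -1 on a CUDA error.
int run_any_below(const int *host_values, int n, int threshold)
{
    assert(n > 0 && n <= 1024);  // one block only
    int *d_values = nullptr, *d_out = nullptr;
    int h_out = 0;
    if (cudaMalloc(&d_values, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        cudaFree(d_values);
        return -1;
    }
    cudaMemcpy(d_values, host_values, n * sizeof(int), cudaMemcpyHostToDevice);
    any_below<<<1, n>>>(d_values, threshold, d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_values);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}
```

Whether this is faster than the plain `if` is not something the source claims; per the answer above, the original branch is already cheap, so this is mainly a matter of expressing the intent directly.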

Regarding "c - How to avoid divergent branches in a simple if statement in CUDA", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26316317/
