c - How to avoid divergent branches in a simple if statement in CUDA

Reposted by: 太空宇宙. Updated: 2023-11-04 02:46:57


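I would like to know how to avoid branching in a kernel when threads have to compare and store values from local, shared, or global variables. For example, the code below checks a shared variable and sets a bool flag to true accordingly: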

if ( shared_variable < local_value ){
    shared_bool_var = true;
}
__syncthreads();

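The problem here is that all threads access the same variable, and every thread whose comparison succeeds overwrites it with true. So I would use a threadIdx.x check to let only one thread access the variable, but this would cause branch divergence.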

if ( threadIdx.x == 0 && shared_variable < local_value ){
    shared_bool_var = true;
}
__syncthreads();

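The question is: which one should I prefer? Both cases seem safe, since __syncthreads() will prevent hazards (read-after-write, etc.). I prefer the second solution, but real code is usually not this simple.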

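In the case above, is it safe to allow all threads to access one shared memory location, or would this cause a bank conflict or serialization of memory accesses? Thanks.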

Best answer

One important thing to note: semantically and functionally, the two code sections are not equivalent:

// set var to true if ANY thread in the block verifies the predicate
if (shared_variable < local_value) {
    shared_bool_var = true;
}

// set var to true if THE FIRST thread in the block verifies the predicate
if (threadIdx.x == 0 && shared_variable < local_value) {
    shared_bool_var = true;
}
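To make the difference between the two sections concrete, here is a minimal sketch of a test kernel that evaluates both variants in one block. The names `compare_flags` and `run_compare`, the block size of 64, and the choice of `local_value = threadIdx.x` are assumptions made for illustration; they do not come from the original question.

```cuda
#include <cuda_runtime.h>

// Hypothetical test kernel: runs both variants from the answer in a single block.
__global__ void compare_flags(int shared_init, int *out_any, int *out_first)
{
    __shared__ int  shared_variable;
    __shared__ bool any_flag;    // variant 1: set if ANY thread passes
    __shared__ bool first_flag;  // variant 2: set only if thread 0 passes

    if (threadIdx.x == 0) {
        shared_variable = shared_init;
        any_flag   = false;
        first_flag = false;
    }
    __syncthreads();  // initial values must be visible before the comparisons

    // Assumption for illustration: each thread's local value is its index.
    int local_value = threadIdx.x;

    if (shared_variable < local_value) {
        any_flag = true;  // many threads may store the same value here
    }
    if (threadIdx.x == 0 && shared_variable < local_value) {
        first_flag = true;
    }
    __syncthreads();  // all stores are done before the flags are read

    if (threadIdx.x == 0) {
        *out_any   = any_flag;
        *out_first = first_flag;
    }
}

// Hypothetical host helper: launches one block of 64 threads and copies
// both flags back as 0 or 1.
void run_compare(int shared_init, int *any_flag, int *first_flag)
{
    int *d_out = nullptr;
    cudaMalloc(&d_out, 2 * sizeof(int));
    compare_flags<<<1, 64>>>(shared_init, d_out, d_out + 1);
    int h_out[2] = {0, 0};
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    *any_flag   = h_out[0];
    *first_flag = h_out[1];
}
```

With `shared_init = 10`, threads 11 to 63 satisfy the predicate but thread 0 does not, so the first variant sets its flag and the second does not. The two variants agree only when thread 0's comparison happens to match the block-wide result.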

But back to your question:

In the aforementioned case, is it safe to allow all threads to access 1 shared memory location or this would cause a bank conflict or serialization of memory access?

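After checking the CUDA programming guide, it appears there is some kind of write-collapsing mechanism that prevents write accesses to the same address from being serialized: instead, only one thread writes its value (which thread does so is undefined).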

CC 1.x:

If a non-atomic instruction executed by a warp writes to the same location in shared memory for more than one of the threads of the warp, only one thread per half-warp performs a write and which thread performs the final write is undefined.

CC 2.x and above:

A shared memory request for a warp does not generate a bank conflict between two threads that access any address within the same 32-bit word (even though the two addresses fall in the same bank): In that case, [...] for write accesses, each address is written by only one of the threads (which thread performs the write is undefined).

Also:

So i would use a threadId.x check to only let one thread access that variable but this would cause branch divergence.

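This is no "more divergent" than the first code. The first section exhibits divergence whenever the threads of a warp evaluate the predicate differently. The second section exhibits divergence only in the first warp of each block. In both cases, these branches will not affect performance: there is no else body, and the if body is a single instruction.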
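If the goal is simply "is the predicate true for any thread in the block", one alternative that is not part of the original answer is the `__syncthreads_or()` intrinsic (available on compute capability 2.x and above), which combines the barrier with a block-wide OR of a per-thread predicate and so needs neither the shared flag nor the `if`. The sketch below is only an illustration under assumptions: `any_greater` and `block_any_greater` are hypothetical names, `threshold` stands in for `shared_variable`, and the input is assumed to fit in a single block (at most 1024 values).

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: block-wide "any" test without a shared flag.
__global__ void any_greater(const int *values, int threshold, int *out)
{
    int local_value = values[threadIdx.x];

    // Barrier plus OR-reduction: every thread receives a non-zero result
    // if the predicate is non-zero for at least one thread of the block.
    int block_flag = __syncthreads_or(threshold < local_value);

    // Only needed to export the result to the host; inside a real kernel
    // each thread already holds block_flag in a register.
    if (threadIdx.x == 0) {
        *out = (block_flag != 0);
    }
}

// Hypothetical host helper: n must be between 1 and 1024 (one block).
bool block_any_greater(const int *h_values, int n, int threshold)
{
    int *d_values = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_values, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_values, h_values, n * sizeof(int), cudaMemcpyHostToDevice);

    any_greater<<<1, n>>>(d_values, threshold, d_out);

    int h_out = 0;
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_values);
    cudaFree(d_out);
    return h_out != 0;
}
```

Because `__syncthreads_or()` is itself a barrier, it must be reached by all threads of the block, just like the `__syncthreads()` call in the question.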

Regarding "c - How to avoid divergent branches in a simple if statement in CUDA", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26316317/
