
CUDA box filter indexing error

Reposted. Author: 行者123. Updated: 2023-12-04 21:20:28

I wrote a simple CUDA kernel for box filtering an image.

texture<unsigned char,2> tex8u;

#define FILTER_SIZE 7
#define FILTER_OFFSET (FILTER_SIZE/2)

__global__ void box_filter_8u_c1(unsigned char* out, int width, int height, int pitch)
{
    unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;

    if(x>=width || y>=height) return;

    float val = 0.0f;

    for(int i = -FILTER_OFFSET; i<= FILTER_OFFSET; i++)
        for(int j= -FILTER_OFFSET; j<= FILTER_OFFSET; j++)
            val += tex2D(tex8u,x + i, y + j);

    out[y * pitch + x] = static_cast<unsigned char>(val/(FILTER_SIZE * FILTER_SIZE));
}

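The problem with the code above is that the top and left borders of the image are filtered incorrectly: they contain values from the bottom and right borders, respectively. The width of the faulty border equals FILTER_OFFSET.

But when I changed the x and y indices to int instead of unsigned int, the output was perfect.

Question: why does this happen?

P.S. The texture addressing mode is set to cudaAddressModeClamp for both the x and y directions.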

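Best answer

The root cause has nothing to do with CUDA; basic C type conversion rules produce the result you are seeing. The C99 standard says the following about how conversions are performed: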

6.3.1.8 Usual arithmetic conversions

  1. If both operands have the same type, then no further conversion is needed.
  2. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
  3. Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  4. Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
  5. Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.


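The third point means that the signed integers (i and j in this case) are first converted to unsigned integers and only then added to the unsigned integers (x and y). Converting a negative signed integer to an unsigned integer is well defined in C: the value is reduced modulo 2^N, where N is the width of the unsigned type (C99 6.3.1.3), so a small negative integer becomes a very large unsigned integer. For threads whose x or y is smaller than FILTER_OFFSET, the sum x + i or y + j therefore wraps around to a huge value instead of going negative. The texture's clamp addressing mode limits this out-of-range coordinate to the largest one allowed in the texture, and the kernel ends up reading from the wrong side of the texture.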
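The conversion chain described above can be reproduced on the host in plain C++. This is only a minimal sketch: `clamp_coord`, `coord_unsigned`, `coord_signed`, and `check_coords` are hypothetical helpers written for illustration, the width of 640 is an arbitrary example value, and `clamp_coord` merely approximates what cudaAddressModeClamp does for unnormalized coordinates with point sampling.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for clamp addressing on one axis: coordinates outside
// [0, size - 1] are pinned to the nearest edge. Not a CUDA API.
inline int clamp_coord(float c, int size) {
    if (c < 0.0f) return 0;
    if (c > static_cast<float>(size - 1)) return size - 1;
    return static_cast<int>(c);
}

// Rule 3 of 6.3.1.8: unsigned int + int is evaluated as unsigned int.
static_assert(std::is_same<decltype(0u + (-3)), unsigned int>::value,
              "mixed signed/unsigned addition yields unsigned int");

// The coordinate expression from the kernel, with unsigned and with signed x.
// tex2D takes float coordinates, so the sum is converted to float afterwards.
inline float coord_unsigned(unsigned int x, int i) { return static_cast<float>(x + i); }
inline float coord_signed(int x, int i) { return static_cast<float>(x + i); }

inline void check_coords() {
    const int width = 640;  // illustrative image width (assumption)
    // Left border, unsigned x: 0u + (-3) wraps to a huge value, clamped to the right edge.
    assert(clamp_coord(coord_unsigned(0u, -3), width) == width - 1);
    // Left border, signed x: -3 stays negative, clamped to the left edge.
    assert(clamp_coord(coord_signed(0, -3), width) == 0);
    // Interior pixel: the wrap-around cancels out, so unsigned x is harmless here.
    assert(clamp_coord(coord_unsigned(5u, -3), width) == 2);
}
```

The third check shows why only a band of FILTER_OFFSET pixels is affected: as soon as x is at least as large as the magnitude of the offset, the unsigned sum wraps back to the intended coordinate.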

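If you use signed integers, no conversion takes place and the whole problem disappears. The moral of the story is probably "know your programming language".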
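To see the effect on the filtered values themselves, here is a rough CPU emulation of one image row, templated on the index type so that the same loop can be run with unsigned int and with int. It is an illustrative sketch, not the original kernel: `box_mean_row` and `check_row` are hypothetical names, the test row is made up, and the inline clamp is only an approximation of cudaAddressModeClamp.

```cpp
#include <cassert>
#include <vector>

// One-dimensional box filter at column x over a single row, with clamp
// addressing emulated on the host. Index is the type used for x, mirroring
// the choice between unsigned int and int in the kernel.
template <typename Index>
float box_mean_row(const std::vector<unsigned char>& row, Index x, int radius) {
    const int width = static_cast<int>(row.size());
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        // Same expression as in the kernel: evaluated as unsigned if Index is unsigned.
        const float c = static_cast<float>(x + i);
        const int idx = c < 0.0f ? 0 : (c > width - 1 ? width - 1 : static_cast<int>(c));
        sum += row[idx];
    }
    return sum / (2 * radius + 1);
}

inline void check_row() {
    // Made-up row: dark everywhere except one bright pixel at the right edge.
    const std::vector<unsigned char> row = {10, 10, 10, 10, 10, 10, 10, 200};
    const int radius = 3;  // corresponds to FILTER_OFFSET for FILTER_SIZE 7

    // Signed index: the left border only sees left-edge values.
    assert(box_mean_row<int>(row, 0, radius) == 10.0f);
    // Unsigned index: the right-edge pixel leaks into the left border.
    assert(box_mean_row<unsigned int>(row, 0u, radius) > 90.0f);
    // Away from the border both index types agree.
    assert(box_mean_row<int>(row, 4, radius) == box_mean_row<unsigned int>(row, 4u, radius));
}
```

In the kernel, the equivalent change is the one the question already describes: declaring x and y as int, so that x + i and y + j are computed in signed arithmetic before being passed to tex2D.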

For the CUDA box filter indexing error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12854538/
