CUDA extern texture declaration

Reposted by: 太空宇宙 · Updated: 2023-11-04 02:12:03

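I want to declare my texture once and use it in all my kernels and files. So I declare it as extern in a header and include that header in all the other files (following the SO question "How do I use extern to share variables between source files?").

I have a header file, cudaHeader.cuh, that contains my texture: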

extern texture<uchar4, 2, cudaReadModeElementType> texImage;

In my file1.cu, I allocate my CUDA array and bind it to the texture:

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc< uchar4 >( );
cudaStatus=cudaMallocArray( &cu_array_image, &channelDesc, width, height ); 
if (cudaStatus != cudaSuccess) {
    fprintf(stderr, "cudaMallocArray failed! cu_array_image couldn't be created.\n");
    return cudaStatus;
}

cudaStatus=cudaMemcpyToArray( cu_array_image, 0, 0, image, size_image, cudaMemcpyHostToDevice);
if (cudaStatus != cudaSuccess) {
    fprintf(stderr, "cudaMemcpyToArray failed! Copy from the host memory to the device texture memory failed.\n");
    return cudaStatus;
}


// set texture parameters
texImage.addressMode[0] = cudaAddressModeWrap;
texImage.addressMode[1] = cudaAddressModeWrap;
texImage.filterMode = cudaFilterModePoint;
texImage.normalized = false;    // access with unnormalized (texel) coordinates

// Bind the array to the texture
cudaStatus=cudaBindTextureToArray( texImage, cu_array_image, channelDesc);
if (cudaStatus != cudaSuccess) {
    fprintf(stderr, "cudaBindTextureToArray failed! cu_array couldn't be bound to texImage.\n");
    return cudaStatus;
}

In file2.cu, I use the texture in the kernel function as follows:

__global__ void kernel(int width, int height, unsigned char *dev_image) {
    int x = blockIdx.x*blockDim.x + threadIdx.x;
    int y = blockIdx.y*blockDim.y + threadIdx.y;
    if (x < width && y < height) {   // guard both dimensions before writing
        uchar4 tempcolor=tex2D(texImage, x, y);

        //if(tempcolor.x==0)
        //  printf("tempcolor.x %d \n", tempcolor.x);

        dev_image[y*width*3+x*3]= tempcolor.x;
        dev_image[y*width*3+x*3+1]= tempcolor.y;
        dev_image[y*width*3+x*3+2]= tempcolor.z;
    }
}

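The problem is that when I use it in file2.cu, my texture contains nothing or corrupted values. Even if I use the function kernel directly in file1.cu, the data is incorrect.

If I add texture<uchar4, 2, cudaReadModeElementType> texImage; to both file1.cu and file2.cu, the compiler reports a redefinition.

EDIT:

I tried the same thing with CUDA version 5.0, but the same problem appears. If I print the address of texImage in file1.cu and in file2.cu, I do not get the same address. There must be something wrong with the declaration of the variable texImage.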

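Best answer

This is a very old question, and the answer was given in the comments by talonmies and Tom. Before CUDA 5.0, extern textures were not feasible because there was no true device linker, so extern linkage was not possible. Consequently, as Tom mentioned,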

you can have different compilation units, but they cannot reference each other

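From CUDA 5.0 onward, extern textures are possible. I want to provide a simple example below that shows this, in the hope that it is useful to other users.

kernel.cu compilation unit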

#include <stdio.h>
#include <stdlib.h>   // malloc, free, exit

texture<int, 1, cudaReadModeElementType> texture_test;

/********************/
/* CUDA ERROR CHECK */
/********************/
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
   if (code != cudaSuccess) 
   {
      fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
      if (abort) exit(code);
   }
}

/*************************/
/* LOCAL KERNEL FUNCTION */
/*************************/
__global__ void kernel1() {

    printf("ThreadID = %i; Texture value = %i\n", threadIdx.x, tex1Dfetch(texture_test, threadIdx.x));

}

__global__ void kernel2();

/********/
/* MAIN */
/********/
int main() {

    const int N = 16;

    // --- Host data allocation and initialization
    int *h_data = (int*)malloc(N * sizeof(int));
    for (int i=0; i<N; i++) h_data[i] = i;

    // --- Device data allocation and host->device memory transfer
    int *d_data; gpuErrchk(cudaMalloc((void**)&d_data, N * sizeof(int)));
    gpuErrchk(cudaMemcpy(d_data, h_data, N * sizeof(int), cudaMemcpyHostToDevice));

    gpuErrchk(cudaBindTexture(NULL, texture_test, d_data, N * sizeof(int)));

    kernel1<<<1, 16>>>();
    gpuErrchk(cudaPeekAtLastError());
    gpuErrchk(cudaDeviceSynchronize());

    kernel2<<<1, 16>>>();
    gpuErrchk(cudaPeekAtLastError());
    gpuErrchk(cudaDeviceSynchronize());

    gpuErrchk(cudaUnbindTexture(texture_test));

    // --- Release device and host memory
    gpuErrchk(cudaFree(d_data));
    free(h_data);

    return 0;
}

kernel2.cu compilation unit

#include <stdio.h>

extern texture<int, 1, cudaReadModeElementType> texture_test;

/**********************************************/
/* DIFFERENT COMPILATION UNIT KERNEL FUNCTION */
/**********************************************/
__global__ void kernel2() {

    printf("Texture value = %i\n", tex1Dfetch(texture_test, threadIdx.x));

}

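Remember to compile with relocatable device code generation enabled, i.e. -rdc=true, so that external linkage works. For example: nvcc -rdc=true kernel.cu kernel2.cu -o extern_texture (the output name is arbitrary).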
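
The example above relies on the texture reference API (the texture<...> type), which was deprecated in later toolkits and is no longer available in CUDA 12. One possible alternative, sketched below, is a texture object: it is an ordinary value passed to the kernel as an argument, so no extern declaration or relocatable device code is needed to use it from another compilation unit. The names copy_from_texture and count_texture_mismatches are hypothetical and do not come from the original answer. Treat this as a minimal sketch assuming a linear int buffer, not a drop-in replacement for the code above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// The texture arrives as a kernel argument, so this kernel could be defined
// in any compilation unit without an extern declaration.
__global__ void copy_from_texture(cudaTextureObject_t tex, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<int>(tex, i);
}

// Hypothetical helper (not from the original answer): fills a buffer with
// 0..n-1, reads it back through a texture object, and returns the number of
// mismatching elements, or -1 if any CUDA call fails.
int count_texture_mismatches(int n) {
    int *h_in  = (int*)malloc(n * sizeof(int));
    int *h_out = (int*)malloc(n * sizeof(int));
    for (int i = 0; i < n; i++) { h_in[i] = i; h_out[i] = -1; }

    int *d_in = NULL, *d_out = NULL;
    cudaTextureObject_t tex = 0;
    bool tex_created = false;
    int result = -1;

    cudaError_t err = cudaMalloc((void**)&d_in, n * sizeof(int));
    if (err == cudaSuccess) err = cudaMalloc((void**)&d_out, n * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Describe the linear device buffer that backs the texture.
        cudaResourceDesc resDesc = {};
        resDesc.resType = cudaResourceTypeLinear;
        resDesc.res.linear.devPtr = d_in;
        resDesc.res.linear.desc = cudaCreateChannelDesc<int>();
        resDesc.res.linear.sizeInBytes = n * sizeof(int);

        // Return raw element values, as cudaReadModeElementType did above.
        cudaTextureDesc texDesc = {};
        texDesc.readMode = cudaReadModeElementType;

        err = cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);
        tex_created = (err == cudaSuccess);
    }

    if (err == cudaSuccess) {
        const int threads = 128;
        copy_from_texture<<<(n + threads - 1) / threads, threads>>>(tex, d_out, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    if (err == cudaSuccess) {
        result = 0;
        for (int i = 0; i < n; i++) if (h_out[i] != h_in[i]) result++;
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    // Clean up; cudaFree(NULL) is a no-op.
    if (tex_created) cudaDestroyTextureObject(tex);
    cudaFree(d_in);
    cudaFree(d_out);
    free(h_in);
    free(h_out);
    return result;
}
```

Calling count_texture_mismatches(16) from main mirrors the N = 16 case above. The helper returns 0 only when every value fetched through the texture matches the host data. Because the kernel receives the texture as a parameter, it could be moved to a second .cu file with only a forward declaration of the kernel in the first.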

This Q&A on CUDA extern texture declaration is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/12852416/
