cuda - OpenCL equivalent of finding consecutive indices in CUDA

Reposted by: 行者123  Updated: 2023-12-02 22:43:36

In CUDA, to cover multiple blocks and thereby extend the range of array indices, we do something like this:

Host-side code:

 dim3 dimgrid(9,1);   // 9 blocks will be launched in total
 dim3 dimBlock(16,1); // each block has 16 threads, so the total number of
                      // threads in the grid is 16 x 9 = 144

Device-side code:

 ...
 ...     
 idx=blockIdx.x*blockDim.x+threadIdx.x;// idx will range from 0 to 143 
 a[idx]=a[idx]*a[idx];
 ...
 ...

What is the equivalent of the above in OpenCL?

Best answer

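On the host, when you enqueue a kernel with clEnqueueNDRangeKernel, you must specify the global and local work sizes. Note that the global work size counts work-items, not work-groups. For example: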

size_t global_work_size[1] = { 144 }; // 16 * 9 == 144
size_t local_work_size[1] = { 16 };
clEnqueueNDRangeKernel(cmd_queue, kernel, 1, NULL,
                       global_work_size, local_work_size,
                       0, NULL, NULL);
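
The mapping from a CUDA launch configuration to these two sizes can be sketched as a small helper. This is only an illustration: `cuda_launch_to_ndrange` is a hypothetical name, not part of OpenCL or CUDA, and it assumes a one-dimensional launch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): translate a 1-D CUDA launch
 * configuration into the two sizes that clEnqueueNDRangeKernel expects. */
static void cuda_launch_to_ndrange(size_t num_blocks, size_t threads_per_block,
                                   size_t global_work_size[1],
                                   size_t local_work_size[1])
{
    assert(num_blocks > 0 && threads_per_block > 0);

    /* The local work size plays the role of CUDA's blockDim.x. */
    local_work_size[0] = threads_per_block;

    /* The global work size is the total number of work-items,
     * i.e. gridDim.x * blockDim.x in CUDA terms. */
    global_work_size[0] = num_blocks * threads_per_block;
}
```

With 9 blocks of 16 threads, this yields a global work size of 144 and a local work size of 16, matching the values passed to clEnqueueNDRangeKernel above.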

In your kernel, use:

size_t get_global_size(uint dim);
size_t get_global_id(uint dim);
size_t get_local_size(uint dim);
size_t get_local_id(uint dim);

to retrieve the global and local work sizes and indices, respectively, where dim is 0 for x, 1 for y, and 2 for z.

The equivalent of your idx is therefore simply size_t idx = get_global_id(0);
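
To see why get_global_id(0) matches the CUDA expression, the index arithmetic can be emulated on the CPU in plain C. The sketch below is not OpenCL code: `work_item`, `emu_global_id`, and `emu_square` are hypothetical stand-ins for what the runtime provides, and it assumes a one-dimensional range with no global offset.

```c
#include <assert.h>
#include <stddef.h>

/* CPU stand-in for the state of one work-item (1-D, no global offset).
 * This is an emulation aid, not a real OpenCL type. */
typedef struct {
    size_t group_id;   /* CUDA: blockIdx.x  */
    size_t local_id;   /* CUDA: threadIdx.x */
    size_t local_size; /* CUDA: blockDim.x  */
} work_item;

/* What get_global_id(0) evaluates to under the assumptions above. */
static size_t emu_global_id(const work_item *wi)
{
    return wi->group_id * wi->local_size + wi->local_id;
}

/* Apply the kernel body a[idx] = a[idx] * a[idx] to every emulated work-item. */
static void emu_square(float *a, size_t global_size, size_t local_size)
{
    /* This sketch only handles global sizes that are a multiple of the local size. */
    assert(local_size > 0 && global_size % local_size == 0);

    size_t num_groups = global_size / local_size;
    for (size_t g = 0; g < num_groups; ++g) {
        for (size_t l = 0; l < local_size; ++l) {
            work_item wi = { g, l, local_size };
            size_t idx = emu_global_id(&wi);
            a[idx] = a[idx] * a[idx];
        }
    }
}
```

Calling `emu_square(a, 144, 16)` visits every index from 0 to 143 exactly once, which is the same coverage as the 9 x 16 CUDA grid in the question.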

See the OpenCL Reference Pages.

Regarding cuda - OpenCL equivalent of finding consecutive indices in CUDA, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10393390/
