c++ - Using "cuFFT Device Callbacks"

Reposted. Author: 太空狗. Updated: 2023-10-29 21:19:55

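This is my first question, so I'll try to be as detailed as possible. I'm working on implementing a noise reduction algorithm in CUDA 6.5. My code is based on this Matlab implementation: http://pastebin.com/HLVq48C1 .
I'd love to use the new cuFFT device callbacks feature, but I'm stuck on cufftXtSetCallback. Every time, my cufftResult is CUFFT_NOT_IMPLEMENTED (14). Even the example provided by nVidia fails the same way... My device callback test code: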

    __device__ void noiseStampCallback(void *dataOut,
                                       size_t offset,
                                       cufftComplex element,
                                       void *callerInfo,
                                       void *sharedPointer) {
        element.x = offset;
        element.y = 2;
        ((cufftComplex*)dataOut)[offset] = element;
    }
    __device__ cufftCallbackStoreC noiseStampCallbackPtr = noiseStampCallback;

The CUDA part of my code:

    cufftHandle forwardFFTPlan; // real-to-complex (R2C)
    // find how many windows there are
    int batch = targetFile->getNbrOfNoiseWindows();
    size_t worksize;

    cufftCreate(&forwardFFTPlan);
    cufftMakePlan1d(forwardFFTPlan, WINDOW, CUFFT_R2C, batch, &worksize); // WINDOW = 2048

    // host memory: allocate
    float *h_wave;
    cufftComplex *h_complex_waveSpec;
    unsigned int m_num_real_elems = batch*WINDOW*2;
    h_wave = (float*)malloc(m_num_real_elems * sizeof(float));
    h_complex_waveSpec = (cufftComplex*)malloc((m_num_real_elems/2+1)*sizeof(cufftComplex));

    // init
    memset(h_wave, 0, sizeof(float) * m_num_real_elems); // the last window probably won't be full of file data, so zero-fill the memory
    memset(h_complex_waveSpec, 0, sizeof(cufftComplex) * (m_num_real_elems/2+1));
    targetFile->getNoiseFile(h_wave); // fill h_wave with samples from the sound file

    // device memory: allocate, then copy from host
    float *d_wave;
    cufftComplex *d_complex_waveSpec;

    cudaMalloc((void**)&d_wave, m_num_real_elems * sizeof(float));
    cudaMalloc((void**)&d_complex_waveSpec, (m_num_real_elems/2+1) * sizeof(cufftComplex));

    cudaMemcpy(d_wave, h_wave, m_num_real_elems * sizeof(float), cudaMemcpyHostToDevice);

    // prepare callback: copy the device function pointer to the host
    cufftCallbackStoreC hostNoiseStampCallbackPtr;

    cudaMemcpyFromSymbol(&hostNoiseStampCallbackPtr,
                         noiseStampCallbackPtr,
                         sizeof(hostNoiseStampCallbackPtr));

    cufftResult status = cufftXtSetCallback(forwardFFTPlan,
                                            (void **)&hostNoiseStampCallbackPtr,
                                            CUFFT_CB_ST_COMPLEX,
                                            NULL);
    // always returns status 14 - CUFFT_NOT_IMPLEMENTED

    // run forward plan
    cufftResult result = cufftExecR2C(forwardFFTPlan, d_wave, d_complex_waveSpec);
    // result seems to be okay without cufftXtSetCallback
I know I'm just a beginner with CUDA. My question is:
How do I call cufftXtSetCallback correctly, or what is causing this error?

Best answer

Quoting the documentation:

The callback API is available in the statically linked cuFFT library only, and only on 64 bit LINUX operating systems. Use of this API requires a current license. Free evaluation licenses are available for registered developers until 6/30/2015. To learn more please visit the cuFFT developer page.

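I think you are getting the not-implemented error either because you are not on a 64-bit Linux platform or because you are not explicitly linking against the CUFFT static library. The Makefile in the cufft callback sample shows the correct way to link.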

Even after you fix that issue, you will likely run into CUFFT_LICENSE_ERROR unless you have obtained one of the evaluation licenses.
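
To make the two failure modes easier to tell apart at runtime, the status returned by cufftXtSetCallback can be decoded before going further. Below is a minimal host-only sketch: the integer values follow the cufftResult enum in cufft.h but are repeated as plain integers so the snippet compiles without the CUDA toolkit, and the helper names cufftStatusName and callbackHint are hypothetical, not part of cuFFT. The `-lcufft_static -lculibos` flags in the hint are the usual way to link the static library; treat the sample Makefile as the authoritative reference.

```cpp
#include <string>

// Numeric values follow the cufftResult enum in cufft.h. Only the codes
// discussed in this thread are named; everything else is shown numerically.
std::string cufftStatusName(int code) {
    switch (code) {
        case 0:  return "CUFFT_SUCCESS";
        case 14: return "CUFFT_NOT_IMPLEMENTED";
        case 15: return "CUFFT_LICENSE_ERROR";
        default: return "cufftResult(" + std::to_string(code) + ")";
    }
}

// Hypothetical helper: maps a status from cufftXtSetCallback to the check
// suggested in the answer above.
std::string callbackHint(int code) {
    switch (code) {
        case 0:
            return "callback registered";
        case 14:
            return "check for 64-bit Linux and static linking "
                   "(-lcufft_static -lculibos)";
        case 15:
            return "check for a valid cuFFT callback license";
        default:
            return "see the cuFFT documentation for " + cufftStatusName(code);
    }
}
```

In the question's code this could be used right after the call, for example by printing callbackHint(static_cast<int>(status)) and skipping cufftExecR2C when status is not CUFFT_SUCCESS.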

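Note that there are various device limitations as well for linking to the cufft static library. It should be possible to build a statically linked CUFFT application that will run on cc 2.0 and greater devices.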

Source: a similar question on Stack Overflow about c++ - Using "cuFFT Device Callbacks": https://stackoverflow.com/questions/25822128/
