
CUDA: Copying an array to the GPU

Repost. Author: 行者123. Updated: 2023-11-30 20:11:16

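I am trying to write a function that allocates an array on the device and copies values into it. For some reason the code works fine inline, but not when I move it into a function, as shown below. Am I passing the parameters correctly?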

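My CUDA program still runs (with no errors), but when I read back the values I wrote, they appear to be undefined, i.e. they are not the original values from HostArray.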

void CopyArrayToGPU(double *DeviceArray, double *HostArray, int NumElements)
{
    int bytes = sizeof(double) * NumElements;

    // Allocate memory on the GPU for array
    if (cudaMalloc((void**)&DeviceArray, bytes) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't allocate mem for array on GPU.");
    }

    // Copy the contents of the host array to the GPU
    if (cudaMemcpy(DeviceArray, HostArray, bytes, cudaMemcpyHostToDevice) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't copy host array to GPU.");
    }
}

// Declare device side array
static double *d_Inputs;

// Allocate host side array and plug in some test values
double *testArray = (double*)malloc(ImageSize);
testArray[0] = 0.6455696203; testArray[1] = 0.7954545455; testArray[2] = 0.2028985507; testArray[3] = 0.08;

CopyArrayToGPU(d_Inputs, testArray, NUM_INPUTS * NUM_ROWS);

Best answer

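Your variable DeviceArray is local to your function. The pointer is passed by value, so cudaMalloc stores the new device address in the function's own copy, and the caller's d_Inputs is never updated.

You must either take a double **DeviceArray as a parameter or return the pointer allocated by cudaMalloc.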
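The same pass-by-value behavior can be reproduced on the host without a GPU. The following is only a sketch in plain C, assuming malloc as a stand-in for cudaMalloc; the function names alloc_by_value and alloc_by_pointer are hypothetical and do not appear in the original question.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the broken function: p is a copy of the
   caller's pointer, so assigning to it changes only that copy. */
void alloc_by_value(double *p, size_t n)
{
    p = (double *)malloc(n * sizeof(double));
    free(p); /* freed here because the caller never learns this address */
}

/* Hypothetical stand-in for the fixed function: p holds the address of
   the caller's pointer, so writing through *p updates the caller. */
int alloc_by_pointer(double **p, size_t n)
{
    *p = (double *)malloc(n * sizeof(double));
    return *p == NULL; /* 0 on success, 1 on failure */
}
```

A pointer initialized to NULL is still NULL after being passed to alloc_by_value, whereas passing its address to alloc_by_pointer leaves it pointing at the new allocation, which the caller must later release with free.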

Return-value version:

double *CopyArrayToGPU(double *HostArray, int NumElements)
{
    size_t bytes = sizeof(double) * NumElements;
    // Declared as double* so it can be returned without a cast when compiled as C++
    double *DeviceArray = NULL;

    // Allocate memory on the GPU for array
    if (cudaMalloc((void **)&DeviceArray, bytes) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't allocate mem for array on GPU.");
        return NULL;
    }

    // Copy the contents of the host array to the GPU
    if (cudaMemcpy(DeviceArray, HostArray, bytes, cudaMemcpyHostToDevice) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't copy host array to GPU.");
        cudaFree(DeviceArray);
        return NULL;
    }

    return DeviceArray;
}

So you have to call it like this:

d_Inputs = CopyArrayToGPU(testArray, NUM_INPUTS * NUM_ROWS);

Version taking double **DeviceArray as a parameter:

int CopyArrayToGPU(double **DeviceArray, double *HostArray, int NumElements)
{
    size_t bytes = sizeof(double) * NumElements;

    // Allocate memory on the GPU for array; the address is written to the caller's pointer
    if (cudaMalloc((void **)DeviceArray, bytes) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't allocate mem for array on GPU.");
        return 1;
    }

    // Copy the contents of the host array to the GPU
    if (cudaMemcpy(*DeviceArray, HostArray, bytes, cudaMemcpyHostToDevice) != cudaSuccess)
    {
        printf("CopyArrayToGPU(): Couldn't copy host array to GPU.");
        cudaFree(*DeviceArray);
        *DeviceArray = NULL; // do not leave the caller holding a freed pointer
        return 1;
    }

    return 0;
}

So you have to call it like this:

CopyArrayToGPU(&d_Inputs, testArray, NUM_INPUTS * NUM_ROWS);

Regarding "CUDA: Copying an array to the GPU", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41353787/
