
go - Undefined reference to a CUDA kernel wrapper in a shared library

Reposted. Author: IT王子. Updated: 2023-10-29 02:02:13

So I'm trying to use the CUDA Runtime API with Go's cgo on Windows. I've been at this for a few days now and I'm stuck: I get an undefined reference to my kernel wrapper.

I've isolated my kernel and wrapped it as shown below.

File: cGo.cuh

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Prototypes
*/

extern "C" void kernel_kValid(int, int, ktype *, glob *);

__global__ void kValid(ktype *, glob *);

File: cGo.cu

#include "cGo.cuh"
#include "device_launch_parameters.h"
#include "cuda.h"
#include "cuda_runtime.h"

//function Definitions

/*
kernel_kValid is a wrapper function for the CUDA Kernel to be called from Go
*/
extern "C" void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    kValid<<<blocks, threads>>>(kInfo, values); // execute the kernel
}


/*
kValid is the CUDA Kernel which is to be executed
*/
__global__ void kValid(ktype *kInfo, glob *values) {
    // lots of code
}

I compile my CUDA source into a shared library:

nvcc -shared -o myLib.so cGo.cu

Then I created a header file to include from cgo.

File: cGo.h

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Declarations
*/

void kernel_kValid(int , int , ktype *, glob *);

Then, from a Go package, I use cgo to call my kernel wrapper:

package cuda

/*
// CUDA runtime library
#cgo LDFLAGS: -LC:/Storage/Cuda/lib/x64 -lcudart
// my shared library
#cgo LDFLAGS: -L${SRCDIR}/lib -lmyLib
// CUDA headers
#cgo CPPFLAGS: -IC:/Storage/Cuda/include
// directory containing cGo.h
#cgo CPPFLAGS: -I${SRCDIR}/include

#include <cuda_runtime.h>
#include <stdlib.h>
#include "cGo.h"
*/
import "C"

import "unsafe"

func useKernel() {
	// other code
	// cgo needs typed pointers, so cast the device pointers explicitly
	C.kernel_kValid(C.int(B), C.int(T), (*C.ktype)(unsafe.Pointer(storageDevice)), (*C.glob)(unsafe.Pointer(globDevice)))
	cudaErr, err = C.cudaDeviceSynchronize()
	// rest of the code
}

None of the calls into the CUDA runtime API produce errors; only my kernel wrapper does. This is the output when I build the cuda package with go.

C:\Users\user\Documents\Repos\go\cuda_wrapper>go build cuda_wrapper\cuda
# cuda_wrapper/cuda
In file included from C:/Storage/Cuda/include/host_defines.h:50:0,
from C:/Storage/Cuda/include/device_types.h:53,
from C:/Storage/Cuda/include/builtin_types.h:56,
from C:/Storage/Cuda/include/cuda_runtime.h:86,
from C:\Go\workspace\src\cuda_wrapper\cuda\cuda.go:12:
C:/Storage/Cuda/include/crt/host_defines.h:84:0: warning: "__cdecl" redefined
#define __cdecl

<built-in>: note: this is the location of the previous definition
# cuda_wrapper/cuda
C:\Users\user\AppData\Local\Temp\go-build038297194\cuda_wrapper\cuda\_obj\cuda.cgo2.o: In function `_cgo_440ebb0a3e25_Cfunc_kernel_kValid':
/tmp/go-build\cuda_wrapper\cuda\_obj/cgo-gcc-prolog:306: undefined reference to `kernel_kValid'
collect2.exe: error: ld returned 1 exit status

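At this point I'm not sure what is wrong. I've been reading questions about undefined references with cgo, but none of them solved my problem. I've also been looking into whether the fact that the CUDA runtime API is written in C++ affects how cgo compiles this, but I haven't found anything conclusive there either. Right now I'm more confused than anything else, so I'm hoping someone more knowledgeable can point me in the right direction.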

Best answer

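This is a name-mangling problem. nvcc compiles .cu files as C++, so a function without C linkage is exported under a mangled symbol name, and the linker cannot match it to the plain C name that cgo asks for.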

Here is the solution we use for gorgonia:

#include <math.h>

#ifdef __cplusplus
extern "C" {
#endif


__global__ void sigmoid32(float* A, int size)
{
    int blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z;
    int idx = blockId * (blockDim.x * blockDim.y * blockDim.z) + (threadIdx.z * (blockDim.x * blockDim.y)) + (threadIdx.y * blockDim.x) + threadIdx.x;
    if (idx >= size) {
        return;
    }
    A[idx] = 1 / (1 + powf((float)(M_E), (-1 * A[idx])));
}

#ifdef __cplusplus
}
#endif
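
As an aside, the index arithmetic in sigmoid32 flattens a 3D grid of 3D blocks into one linear thread index. A rough host-side sketch of the same formula can be checked on the CPU. The `Dim3` struct and the `flatIndex` function are hypothetical names introduced here for illustration; they are not part of CUDA or gorgonia.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3/uint3 built-ins; not a CUDA type.
struct Dim3 {
    unsigned x, y, z;
};

// Same arithmetic as in sigmoid32: linearize the block position within the
// grid, then add the linearized thread position within the block.
unsigned flatIndex(Dim3 gridDim, Dim3 blockDim, Dim3 blockIdx, Dim3 threadIdx) {
    // A thread index must lie inside its block.
    assert(threadIdx.x < blockDim.x && threadIdx.y < blockDim.y && threadIdx.z < blockDim.z);

    unsigned blockId = blockIdx.x
                     + blockIdx.y * gridDim.x
                     + gridDim.x * gridDim.y * blockIdx.z;
    unsigned threadsPerBlock = blockDim.x * blockDim.y * blockDim.z;

    return blockId * threadsPerBlock
         + threadIdx.z * (blockDim.x * blockDim.y)
         + threadIdx.y * blockDim.x
         + threadIdx.x;
}
```

For a 2x2x1 grid of 2x2x1 blocks there are 16 threads in total, and the last thread of the last block maps to index 15, which is why the kernel's `idx >= size` guard is enough to keep out-of-range threads from touching memory.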

So... just wrap your kernel wrapper function in extern "C".
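
Applied to the question's files, one common way to do this is to share a single header between nvcc (which compiles as C++) and cgo (which compiles its preamble as C), guarding the declaration with `__cplusplus`. The following is a minimal sketch rather than the asker's final code. It merges the header and the wrapper definition into one translation unit, and the loop inside `kernel_kValid` is a made-up CPU stand-in for the real `kValid<<<blocks, threads>>>(kInfo, values)` launch, an assumption made so the sketch compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>

// --- would live in cGo.h, included by both cGo.cu and the cgo preamble ---
typedef unsigned long int ktype;
typedef unsigned char glob;

#ifdef __cplusplus
extern "C" {  // seen only by C++ compilers (nvcc included): no name mangling
#endif

void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values);

#ifdef __cplusplus
}
#endif

// --- would live in cGo.cu ---
// ASSUMPTION: CPU stand-in for the kernel launch. The real body would be
//     kValid<<<blocks, threads>>>(kInfo, values);
// Here each simulated thread writes 1 if its kInfo entry is nonzero, else 0.
extern "C" void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    assert(kInfo != nullptr && values != nullptr);
    const std::size_t total =
        static_cast<std::size_t>(blocks) * static_cast<std::size_t>(threads);
    for (std::size_t i = 0; i < total; ++i) {
        values[i] = (kInfo[i] != 0) ? 1 : 0;
    }
}
```

With the header guarded this way, the cgo preamble can keep its `#include "cGo.h"` line unchanged: a C compiler never sees the `extern "C"` lines, while the C++ side sees a declaration and a definition that agree on C linkage.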

A similar question on this topic (go: undefined reference to a CUDA kernel wrapper in a shared library) can be found on Stack Overflow: https://stackoverflow.com/questions/49042518/
