gpt4 book ai didi

cuda - NVVM_ERROR_INVALID_OPTION 将 CUDA 内核与 Numbapro api 一起使用时

转载 作者:太空宇宙 更新时间:2023-11-04 01:04:01 25 4
gpt4 key购买 nike

我想使用 Numbapro API 在 python 中执行 CUDA 内核。我有这段代码:

import math
import numpy
from numbapro import jit, cuda, int32, float32
from matplotlib import pyplot

@cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
def calculate_velocity_field(X, Y, u_source, v_source, x_source, y_source, strength_source, N):
start = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
end = N
stride = cuda.gridDim.x * cuda.blockDim.x
for i in range(start, end, stride):
u_source[i] = strength_source/(2*math.pi) * (X[i]-x_source)/((X[i]-x_source)**2 + (Y[i]-y_source)**2)
v_source[i] = strength_source/(2*math.pi) * (Y[i]-x_source)/((X[i]-x_source)**2 + (Y[i]-y_source)**2)


def main():
N = 200 # number of points in each direction
x_start, x_end = -4.0, 4.0 # boundaries in the x-direction
y_start, y_end = -2.0, 2.0 # boundaries in the y-direction
x = numpy.linspace(x_start, x_end, N) # creates a 1D-array with the x-coordinates
y = numpy.linspace(y_start, y_end, N) # creates a 1D-array with the y-coordinates

X, Y = numpy.meshgrid(x, y) # generates a mesh grid

strength_source = 5.0 # source strength
x_source, y_source = -1.0, 0.0 # location of the source

start = timer()

#calculate grid dimensions
blockSize = 1024
gridSize = int(math.ceil(float(N)/blockSize))

#transfer memory to device
X_d = cuda.to_device(X)
Y_d = cuda.to_device(Y)
u_source_d = cuda.device_array_like(X)
v_source_d = cuda.device_array_like(Y)

#launch kernel
calculate_velocity_field[gridSize,blockSize](X_d,Y_d,u_source_d,v_source_d,x_source,y_source,strength_source,N)

#transfer memory to host
u_source = numpy.empty_like(X)
v_source = numpy.empty_like(Y)
u_source_d.to_host(u_source)
v_source_d.to_host(v_source)

elapsed_time = timer() - start
print("Exec time with GPU %f s" % elapsed_time)

if __name__ == "__main__":
main()

给我这个错误:

NvvmError                                 Traceback (most recent call last)
<ipython-input-17-85e4a6e56a14> in <module>()
----> 1 @cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
2 def calculate_velocity_field(X, Y, u_source, v_source, x_source, y_source, strength_source, N):
3 start = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
4 end = N
5 stride = cuda.gridDim.x * cuda.blockDim.x

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/decorators.py in kernel_jit(func)
89 # Force compilation for the current context
90 if bind:
---> 91 kernel.bind()
92
93 return kernel

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in bind(self)
319 Force binding to current CUDA context
320 """
--> 321 self._func.get()
322
323 @property

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
254 cufunc = self.cache.get(device.id)
255 if cufunc is None:
--> 256 ptx = self.ptx.get()
257
258 # Link

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
226 arch = nvvm.get_arch_option(*cc)
227 ptx = nvvm.llvm_to_ptx(self.llvmir, opt=3, arch=arch,
--> 228 **self._extra_options)
229 self.cache[cc] = ptx
230 if config.DUMP_ASSEMBLY:

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in llvm_to_ptx(llvmir, **opts)
420 cu.add_module(llvmir.encode('utf8'))
421 cu.add_module(libdevice.get())
--> 422 ptx = cu.compile(**opts)
423 return ptx
424

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in compile(self, **options)
211 for x in opts])
212 err = self.driver.nvvmCompileProgram(self._handle, len(opts), c_opts)
--> 213 self._try_error(err, 'Failed to compile\n')
214
215 # get result

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in _try_error(self, err, msg)
229
230 def _try_error(self, err, msg):
--> 231 self.driver.check_error(err, "%s\n%s" % (msg, self.get_log()))
232
233 def get_log(self):

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in check_error(self, error, msg, exit)
118 sys.exit(1)
119 else:
--> 120 raise exc
121
122

NvvmError: Failed to compile

libnvvm : error: -arch=compute_52 is an unsupported option
NVVM_ERROR_INVALID_OPTION

我尝试了另一个 numbapro 示例,但出现了同样的错误。我不知道这是 Numbapro 不支持 5.2 计算能力的错误还是 Nvidia NVVM 的问题...建议?

理论上it should be supported ,但我不知道发生了什么。

我正在使用带有 CUDA 7.0 和驱动程序版本 346.29 的 Linux

最佳答案

终于找到了解决办法here

  • 解决方案 1:

conda update cudatoolkit

Fetching package metadata: ....
# All requested packages already installed.
# packages in environment at ~/.anaconda3:
#
cudatoolkit 6.0 p0

看来我更新 CUDA 工具包并没有更新到 CUDA 7.0。可以做第二个解决方案:

  • 解决方案 2

conda install -c numba cudatoolkit

Fetching package metadata: ...... 
Solving package specifications: .
Package plan for installation in environment ~/.anaconda3:

The following packages will be downloaded:
package | build
---------------------------|-----------------
cudatoolkit-7.0 | 1 190.8 MB

The following packages will be UPDATED:

cudatoolkit: 6.0-p0 --> 7.0-1

Proceed ([y]/n)? y

之前:

In [4]: check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
located at ~/.anaconda3/lib/libcublas.so.6.0.37
trying to open library... ok
Finding cusparse
located at ~/.anaconda3/lib/libcusparse.so.6.0.37
trying to open library... ok
Finding cufft
located at ~/.anaconda3/lib/libcufft.so.6.0.37
trying to open library... ok
Finding curand
located at ~/.anaconda3/lib/libcurand.so.6.0.37
trying to open library... ok
Finding nvvm
located at ~/.anaconda3/lib/libnvvm.so.2.0.0
trying to open library... ok
finding libdevice for compute_20... ok
finding libdevice for compute_30... ok
finding libdevice for compute_35... ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0 b'GeForce GTX 970' [SUPPORTED]
compute capability: 5.2
pci device id: 0
pci bus id: 7
Summary:
1/1 devices are supported
PASSED
Out[4]: True

之后:

In [6]:  check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
located at ~/.anaconda3/lib/libcublas.so.7.0.28
trying to open library... ok
Finding cusparse
located at ~/.anaconda3/lib/libcusparse.so.7.0.28
trying to open library... ok
Finding cufft
located at ~/.anaconda3/lib/libcufft.so.7.0.35
trying to open library... ok
Finding curand
located at ~/.anaconda3/lib/libcurand.so.7.0.28
trying to open library... ok
Finding nvvm
located at ~/.anaconda3/lib/libnvvm.so.3.0.0
trying to open library... ok
finding libdevice for compute_20... ok
finding libdevice for compute_30... ok
finding libdevice for compute_35... ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0 b'GeForce GTX 970' [SUPPORTED]
compute capability: 5.2
pci device id: 0
pci bus id: 7
Summary:
1/1 devices are supported
PASSED
Out[6]: True

关于cuda - NVVM_ERROR_INVALID_OPTION 将 CUDA 内核与 Numbapro api 一起使用时,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31464063/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com