python - CUDA code error in numbapro

Reposted. Author: 行者123. Updated: 2023-12-01 05:38:54

import numpy
import numpy as np
from numbapro import cuda


@cuda.autojit
def foo(aryA, aryB, out):
    d_ary1 = cuda.to_device(aryA)
    d_ary2 = cuda.to_device(aryB)
    #dd = numpy.empty(10, dtype=np.int32)
    d_ary1.copy_to_host(out)


griddim = 1, 2
blockdim = 3, 4
aryA = numpy.arange(10, dtype=np.int32)
aryB = numpy.arange(10, dtype=np.int32)
out = numpy.empty(10, dtype=np.int32)

foo[griddim, blockdim](aryA, aryB,out)

Exception: Caused by input line 11: can only get attribute from globals, complex numbers or arrays

I'm new to numbapro and need a hint!

Best answer

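@cuda.autojit marks foo() and compiles it as a CUDA kernel. Memory transfer operations such as cuda.to_device() and copy_to_host() run on the host, so they should be placed outside the kernel. It should look like the following code: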

import numpy
from numbapro import cuda

@cuda.autojit
def foo(aryA, aryB, out):
    # each thread handles one element, selected by its global index
    i = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    if i < out.shape[0]:  # surplus threads past the end do nothing
        out[i] = aryA[i] + aryB[i]

# one block of 32 threads along x is enough to cover the 10 elements
griddim = 1, 1
blockdim = 32, 1
aryA = numpy.arange(10, dtype=numpy.int32)
aryB = numpy.arange(10, dtype=numpy.int32)
out = numpy.empty(10, dtype=numpy.int32)

# transfer memory host to device
d_ary1 = cuda.to_device(aryA)
d_ary2 = cuda.to_device(aryB)
d_out = cuda.device_array_like(aryA)  # like numpy.empty_like() but for GPU
# launch kernel on the device arrays
foo[griddim, blockdim](d_ary1, d_ary2, d_out)

# transfer memory device to host
d_out.copy_to_host(out)

print(out)
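
To see how the index expression `threadIdx.x + blockIdx.x * blockDim.x` maps threads to array elements, the following CPU-only sketch emulates a 1D kernel launch in plain Python. It is an illustration, not NumbaPro code: `blocks_needed`, `emulate_launch`, and `add_kernel` are hypothetical helpers introduced here, and the "threads" run one after another instead of in parallel.

```python
def blocks_needed(n, threads_per_block):
    """Smallest 1D grid size whose threads cover n elements (ceiling division)."""
    return (n + threads_per_block - 1) // threads_per_block


def emulate_launch(kernel, griddim, blockdim, *args):
    """Hypothetical helper: call `kernel` once per (block, thread) pair, serially."""
    for block_idx in range(griddim):
        for thread_idx in range(blockdim):
            kernel(thread_idx, block_idx, blockdim, *args)


def add_kernel(thread_idx, block_idx, block_dim, a, b, out):
    # Same index arithmetic as the kernel in the answer.
    i = thread_idx + block_idx * block_dim
    if i < len(out):  # guard: the last block may contain surplus threads
        out[i] = a[i] + b[i]


n = 10
a = list(range(n))
b = list(range(n))

# One block of 3 threads only reaches indices 0..2; the rest stay unset.
partial = [None] * n
emulate_launch(add_kernel, 1, 3, a, b, partial)

# Enough blocks of 4 threads to cover all 10 elements.
threads = 4
full = [None] * n
emulate_launch(add_kernel, blocks_needed(n, threads), threads, a, b, full)

print(partial)
print(full)
```

The grid size is rounded up so that every element gets a thread, and the guard keeps the surplus threads in the last block from writing past the end of the array. A real kernel usually needs the same two pieces.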

I recommend that new NumbaPro users look at the examples at https://github.com/ContinuumIO/numbapro-examples.

Regarding python - CUDA code error in numbapro, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/18151011/
