
python - Can ctypes.data_as always be relied on to keep a reference to a temporary object?

Reposted. Author: 行者123. Updated: 2023-12-01 04:23:24

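When passing arrays from Python to a C++ back-end library, can the following be relied upon? This used to work in Python <= 3.6, but seems to cause sporadic crashes in Python >= 3.7:

(This is a heavily simplified version of the "real" code, in which a user-facing Python interface passes data back and forth to an underlying C++ library.)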

# a 2d array, possibly not order="F"
xmat = np.ones((16, 32), dtype=np.float64)

# get a pointer to a version of xmat that is guaranteed to have order="F"
# if xmat already has order="F": no temporary
# if not, a temporary copy is made, reordered and a ptr to that returned
xptr = np.asfortranarray(xmat).ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# pass xptr to c++ back-end to do things (expects order="F" data)

As far as I (currently!) understand, ctypes.data_as should:

Return the data pointer cast to a particular c-types object...

The returned pointer will keep a reference to the array.



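There is also an example suggesting that when a temporary is created, e.g. (a + b).ctypes.data_as(ctypes.c_void_p), using data_as is the correct approach.

With Python >= 3.7 it seems that data_as does not keep a reference to the temporary, and in the code above xptr ends up pointing to freed memory...

Am I doing something wrong? Is this a bug in Python >= 3.7? Is there a better way to do this?

A complete example is given here (with some extra boilerplate that encodes the array as a struct for the back-end library):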
import numpy as np
import ctypes as ct

lib_REALS_t = ct.c_double
lib_INDEX_t = ct.c_int32
lib_REALS_p = ct.POINTER(lib_REALS_t)

class lib_REALS_array_t(ct.Structure):
    _fields_ = [("size", lib_INDEX_t),
                ("data", lib_REALS_p)]

class lib_t(ct.Structure):
    _fields_ = [
        ("value", lib_REALS_array_t)]

def bug():

    libt = lib_t()

    # a 2d array, user-specified, possibly not order="F"
    xmat = np.ones((16, 32), dtype=np.float64, order="C")

    # get a pointer to a version of xmat that is guaranteed to have order="F"
    # if xmat already has order="F": no temporary
    # if not, a temporary copy is made, reordered and a ptr to that returned
    libt.value.size = xmat.size
    libt.value.data = np.asfortranarray(xmat).ctypes.data_as(ct.POINTER(lib_REALS_t))

    # pass xptr to c++ back-end to do things (expects order="F" data)

    # just "simulate" this by trying to access data using the pointer
    print(libt.value.data[1])

    return


if (__name__ == "__main__"): bug()

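For me, Python <= 3.6 prints 1.0 (as expected), while Python >= 3.7 prints 6.92213454250094e-310 (i.e. the temporary must have been freed, so the pointer refers to uninitialized memory).

Best answer

Listing [Python 3.Docs]: ctypes - A foreign function library for Python.

After investigating and looking through the code, I reached a conclusion (I intuitively knew from the beginning what was going on).

It seems that [SciPy.Docs]: numpy.ndarray.ctypes: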

_ctypes.data_as(self, obj)

...

The returned pointer will keep a reference to the array.



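is misleading. Keeping a reference here means that it holds the array's (internal) buffer address (in the sense that it does not copy the memory contents), and not a Python reference (Py_XINCREF).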

[Github]: numpy/numpy - numpy/numpy/core/_internal.py :

def data_as(self, obj):
    # Comments
    return self._ctypes.cast(self._data, obj)


This is a call to ctypes.cast, which only saves the source array's buffer address.
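As a rough illustration of an address being carried around without the Python object that owns it, the sketch below uses only ctypes (no NumPy). Holder, field_keeps_only_address and the tag attribute are hypothetical names made up for this example. It stores a pointer in a Structure field and reads it back: the address survives, while a Python attribute set on the original pointer object does not, because reading the field builds a fresh pointer object.

```python
import ctypes as ct

DblPtr = ct.POINTER(ct.c_double)


class Holder(ct.Structure):  # hypothetical struct, for illustration only
    _fields_ = [("data", DblPtr)]


def field_keeps_only_address():
    # Buffer owned by a local variable, so it stays valid inside this function.
    buf = (ct.c_double * 3)(1.0, 2.0, 3.0)
    ptr = ct.cast(buf, DblPtr)
    ptr.tag = "owner info"  # arbitrary Python attribute on the pointer object

    holder = Holder()
    holder.data = ptr
    read_back = holder.data  # a new pointer object built from the stored address

    same_address = ct.addressof(read_back.contents) == ct.addressof(buf)
    attribute_survived = hasattr(read_back, "tag")
    return same_address, attribute_survived, read_back[1]
```

Under these assumptions, anything attached to the pointer object on the Python side is not what the struct field carries, so the owner of the buffer has to be kept alive separately.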

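What happens is that np.asfortranarray(xmat) creates a temporary array (on the fly), and then ctypes.data_as returns its buffer address. After that line, the temporary goes out of scope (and so does its buffer), but its address is still being referenced, which yields Undefined Behavior (UB).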
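The lifetime of such a temporary can be observed without dereferencing any dangling pointer. The following is a minimal sketch, assuming CPython's reference counting; temporary_lifetime is a hypothetical helper name. It checks that np.asfortranarray makes a separate copy for a C-ordered 2D input, that the copy is released once the last reference to it is dropped, and that an input that is already Fortran-ordered is reused instead of copied.

```python
import gc
import weakref

import numpy as np


def temporary_lifetime():
    c_arr = np.ones((4, 8), dtype=np.float64, order="C")

    # C-ordered 2D input: asfortranarray has to build a new, reordered array.
    f_arr = np.asfortranarray(c_arr)
    copied = not np.shares_memory(c_arr, f_arr)

    # Track the copy with a weak reference, then drop the only strong one.
    alive = weakref.ref(f_arr)
    del f_arr
    gc.collect()
    released = alive() is None

    # Input that is already F-ordered: the existing buffer is reused.
    already_f = np.ones((4, 8), dtype=np.float64, order="F")
    reused = np.shares_memory(already_f, np.asfortranarray(already_f))

    return copied, released, reused
```

Any raw address taken from the copy before it is released would refer to memory that NumPy is free to hand out again.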

v1.15.0 ([SciPy.Docs]: numpy.ndarray.ctypes (emphasis is mine)) mentioned:

Be careful using the ctypes attribute - especially on temporary arrays or arrays constructed on the fly. For example, calling (a+b).ctypes.data_as(ctypes.c_void_p) returns a pointer to memory that is invalid because the array created as (a+b) is deallocated before the next Python statement. You can avoid this problem using either c=a+b or ct=(a+b).ctypes. In the latter case, ct will hold a reference to the array until ct is deleted or re-assigned.



but they took it out afterwards (even though the code was not modified with respect to this behavior).

To get past the error, "save" the temporary array, or keep a (Python) reference to it. The same problem was encountered in [SO]: Access violation when trying to read out object created in Python passed to std::vector on C++ side and then returned to Python (@CristiFati's answer).
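One way to keep that reference around is to bundle the array and its pointer in a single Python object, so that neither can outlive the other by accident. This is only a sketch: FortranBuffer is a hypothetical helper, not part of NumPy or ctypes, and it assumes the back-end expects float64 data in Fortran order.

```python
import ctypes as ct

import numpy as np

DblPtr = ct.POINTER(ct.c_double)


class FortranBuffer:
    """Hypothetical helper: owns an F-ordered array and exposes its pointer."""

    def __init__(self, array):
        # Keep the (possibly copied) F-ordered array as an attribute, so it
        # lives at least as long as this object does.
        self.array = np.asfortranarray(array, dtype=np.float64)
        self.size = self.array.size
        self.ptr = self.array.ctypes.data_as(DblPtr)


def second_element_in_memory(array):
    buf = FortranBuffer(array)  # buf stays referenced while ptr is in use
    return buf.ptr[1]
```

The caller has to keep the FortranBuffer instance alive for as long as the C++ side may touch the pointer, for example by storing it next to the struct that holds the raw address.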

I changed your code a bit (including those horrible names :) ).

code00.py:

#!/usr/bin/env python3

import sys
import ctypes as ct
import numpy as np
from collections import defaultdict


DblPtr = ct.POINTER(ct.c_double)

class Struct0(ct.Structure):
    _fields_ = [
        ("size", ct.c_uint32),
        ("data", DblPtr),
    ]


class Wrapper(ct.Structure):
    _fields_ = [
        ("value", Struct0),
    ]


def test_np(np_array, save_intermediary_array):
    wrapper = Wrapper()
    wrapper.value.size = np_array.size

    if save_intermediary_array:
        fortran_array = np.asfortranarray(np_array)
        wrapper.value.data = fortran_array.ctypes.data_as(DblPtr)
    else:
        wrapper.value.data = np.asfortranarray(np_array).ctypes.data_as(DblPtr)
    #print(wrapper.value.data[0])
    return wrapper.value.data[1]


def main(*argv):
    dim1, dim0 = 16, 32
    mat = np.ones((dim1, dim0), dtype=np.float64, order="C")
    print("NumPy CTypes data: {0:}\n{1:}".format(mat.ctypes, mat.ctypes._ctypes))

    dd = defaultdict(int)
    flag = 0  # Change to 1 to avoid problem
    print("Saving intermediary array: {0:d}".format(flag))
    for i in range(100):
        dd[test_np(mat, flag)] += 1
    print("\nResult: {0:}".format(dd))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(
        " ".join(item.strip() for item in sys.version.split("\n")),
        64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    print("NumPy version: {0:}".format(np.version.version))
    main(*sys.argv[1:])
    print("\nDone.")

Output:

e:\Work\Dev\StackOverflow\q059959608>sopr.bat
*** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy version: 1.18.0
NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001C9744B0348>
<module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
Saving intermediary array: 0

Result: defaultdict(<class 'int'>, {9.707134377684e-312: 100})

Done.

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy version: 1.18.0
NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001842ECA4FC8>
<module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
Saving intermediary array: 0

Result: defaultdict(<class 'int'>, {1.0: 100})

Done.

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy version: 1.18.0
NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001AD586E91C8>
<module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
Saving intermediary array: 0

Result: defaultdict(<class 'int'>, {9.110668798574e-312: 100})

Done.

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy version: 1.18.0
NumPy CTypes data: <numpy.core._internal._ctypes object at 0x0000012F903A9188>
<module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
Saving intermediary array: 0

Result: defaultdict(<class 'int'>, {6.44158096444e-312: 100})

Done.


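Notes:
  • As seen, the results are quite random, which is usually an indicator of UB
  • Interestingly, within the same run it is always the same value (the defaultdict has only one item)
  • Changing the flag to 1 (or anything that evaluates to True) makes the problem go away

Regarding "python - Can ctypes.data_as always be relied on to keep a reference to a temporary object?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59959608/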
