
python - Persisting a large scipy.sparse.csr_matrix

Reposted. Author: 行者123. Updated: 2023-12-01 09:24:50

I have a very large sparse scipy matrix. Trying to save it with save_npz fails with the following error:

>>> sp.save_npz('/projects/BIGmatrix.npz',W)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 716, in _savez
pickle_kwargs=pickle_kwargs)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/format.py", line 597, in write_array
array.tofile(fp)
OSError: 6257005295 requested and 3283815408 written

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/scipy/sparse/_matrix_io.py", line 78, in save_npz
np.savez_compressed(file, **arrays_dict)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 659, in savez_compressed
_savez(file, args, kwds, True)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 721, in _savez
raise IOError("Failed to write to %s: %s" % (tmpfile, exc))
OSError: Failed to write to /projects/BIGmatrix.npzg6ub_z3y-numpy.npy: 6257005295 requested and 3283815408 written

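So I would like to try persisting it to postgres via psycopg2 instead, but I have not found a way to iterate over all the non-zero values so that I can store them as rows in a table.

What is the best way to handle this task?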

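Best answer

Save the attributes held in the matrix object's __dict__ (the data, indices and indptr arrays, plus the shape) and recreate the csr_matrix from them when loading: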

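from scipy import sparse
import numpy as np

# Build a dense array with up to 100 random non-zero entries
a = np.zeros((1000, 2000))
a[np.random.randint(0, 1000, 100), np.random.randint(0, 2000, 100)] = np.random.randn(100)

b = sparse.csr_matrix(a)

# Save the arrays that define the CSR matrix (np.savez writes an uncompressed .npz)
np.savez("tmp", data=b.data, indices=b.indices, indptr=b.indptr, shape=np.array(b.shape))

# Reload the arrays and rebuild the matrix from them
f = np.load("tmp.npz")
b2 = sparse.csr_matrix((f["data"], f["indices"], f["indptr"]), shape=tuple(f["shape"]))
(b != b2).sum()  # 0 when the two matrices are identical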
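
The question also asks how to iterate over all non-zero values so they can be stored as table rows. One possible approach, shown here only as a minimal sketch, is to convert the matrix to COO format, which exposes parallel row, col and data arrays, and to walk those arrays in chunks. The helper name iter_nonzero_rows and the chunk_size parameter are assumptions made for illustration and are not part of scipy. Note that tocoo() allocates additional index arrays, so for a very large matrix this needs extra memory.

```python
import numpy as np
from scipy import sparse


def iter_nonzero_rows(matrix, chunk_size=100000):
    """Yield lists of (row, col, value) tuples for the stored entries.

    Hypothetical helper for illustration. Stored entries can include
    explicit zeros; call matrix.eliminate_zeros() first to drop them.
    """
    coo = matrix.tocoo()  # COO keeps parallel row / col / data arrays
    for start in range(0, coo.nnz, chunk_size):
        stop = start + chunk_size
        # Convert numpy scalars to plain Python types for a DB driver
        yield [
            (int(r), int(c), float(v))
            for r, c, v in zip(
                coo.row[start:stop], coo.col[start:stop], coo.data[start:stop]
            )
        ]


# Small demonstration matrix with three non-zero entries
demo = sparse.csr_matrix(
    np.array([[0.0, 1.5, 0.0], [0.0, 0.0, 0.0], [2.0, 0.0, -3.0]])
)
chunks = list(iter_nonzero_rows(demo, chunk_size=2))
```

Each yielded chunk is a plain list of tuples, so it could be handed to something like psycopg2's cursor.executemany together with an INSERT statement for a table of (row, column, value) triples; the table layout is left to the reader and is not taken from the original question.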

Regarding python - Persisting a large scipy.sparse.csr_matrix, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50517794/
