
python - Fast sparse matrix multiplication without allocating a dense array


I have an m x m sparse matrix, similarities, and a vector with m elements, combined_scales. I want to multiply the i-th column of similarities by combined_scales[i]. Here is my first attempt:

for i in range(m):
    scale = combined_scales[i]
    similarities[:, i] *= scale

This is semantically correct but performs poorly, so I tried changing it to:

# sparse.diags creates a diagonal matrix.
# docs: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.diags.html
similarities *= sparse.diags(combined_scales)

But I immediately got a MemoryError when running this line. Oddly, scipy seems to be trying to allocate a dense numpy array here:

Traceback (most recent call last):
File "main.py", line 108, in <module>
loop.run_until_complete(main())
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 466, in run_until_complete
return future.result()
File "main.py", line 100, in main
magic.fit(df)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 127, in fit
self._scale_similarities(X, net_similarities)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 148, in _scale_similarities
similarities *= sparse.diags(combined_scales)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\base.py", line 440, in __mul__
return self._mul_sparse_matrix(other)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\compressed.py", line 503, in _mul_sparse_matrix
data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))
MemoryError

How do I prevent it from allocating a dense array here? Thanks.

Best Answer

From sparse.compressed:

class _cs_matrix:    # common for csr and csc
    def _mul_sparse_matrix(self, other):
        M, K1 = self.shape
        K2, N = other.shape

        major_axis = self._swap((M, N))[0]
        other = self.__class__(other)  # convert to this format

        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=M*N)
        indptr = np.empty(major_axis + 1, dtype=idx_dtype)

        fn = getattr(_sparsetools, self.format + '_matmat_pass1')
        fn(M, N,
           np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           indptr)

        nnz = indptr[-1]
        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=nnz)
        indptr = np.asarray(indptr, dtype=idx_dtype)
        indices = np.empty(nnz, dtype=idx_dtype)
        data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

        fn = getattr(_sparsetools, self.format + '_matmat_pass2')
        fn(M, N, np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           self.data,
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           other.data,
           indptr, indices, data)

        return self.__class__((data, indices, indptr), shape=(M, N))

similarities is a sparse csr matrix. other is the diags matrix, which has also been converted to csr:

other = self.__class__(other) 

csr_matmat_pass1 (compiled code) runs with the indices from self and other, and returns nnz, the number of nonzero terms in the output.

It then allocates the indptr, indices, and data arrays that will hold the results of csr_matmat_pass2. These are used to create the returned matrix:

self.__class__((data,indices,indptr),shape=(M,N))

The error occurs while creating the data array:

data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

There are simply too many nonzero values in the result for your memory.

What are m and similarities.nnz?

Is there enough memory to do similarities.copy()?

When you use similarities *= ..., it first has to do similarities * other. Only then does the result replace self. It does not attempt an in-place multiplication.
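To get a sense of the numbers involved, a quick check along these lines may help (a minimal sketch, assuming similarities is a scipy.sparse CSR matrix holding float64 values):

m = similarities.shape[0]
nnz = similarities.nnz
# Rough size of a data array with nnz float64 values, at 8 bytes each;
# the indices and indptr arrays add more memory on top of this.
print(m, nnz, nnz * 8 / 1e6, 'MB')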

Iterating over columns in place

There are many questions about iterating faster over rows (or columns), seeking to do things like sorting or finding the largest row values. Working directly with the csr attributes can speed that up considerably. I think the idea applies here.

Example:

In [275]: A = sparse.random(10,10,.2,'csc').astype(int)
In [276]: A.data[:] = np.arange(1,21)
In [277]: A.A
Out[277]:
array([[ 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[ 0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 10, 0, 0, 16, 18],
[ 0, 0, 0, 0, 0, 11, 14, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 8, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 9, 12, 0, 0, 17, 0],
[ 2, 0, 0, 0, 0, 13, 0, 0, 0, 0],
[ 0, 0, 5, 7, 0, 0, 0, 15, 0, 19],
[ 0, 0, 6, 0, 0, 0, 0, 0, 0, 20]])
In [280]: B = sparse.diags(np.arange(1,11),dtype=int)
In [281]: B
Out[281]:
<10x10 sparse matrix of type '<class 'numpy.int64'>'
with 10 stored elements (1 diagonals) in DIAgonal format>
In [282]: (A*B).A
Out[282]:
array([[ 0, 0, 12, 0, 0, 0, 0, 0, 0, 0],
[ 0, 6, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 60, 0, 0, 144, 180],
[ 0, 0, 0, 0, 0, 66, 98, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 45, 72, 0, 0, 153, 0],
[ 2, 0, 0, 0, 0, 78, 0, 0, 0, 0],
[ 0, 0, 15, 28, 0, 0, 0, 120, 0, 190],
[ 0, 0, 18, 0, 0, 0, 0, 0, 0, 200]], dtype=int64)

Iterating over the columns in place:

In [283]: A1=A.copy()
In [284]: for i,j,v in zip(A1.indptr[:-1],A1.indptr[1:],np.arange(1,11)):
...:     A1.data[i:j] *= v
...:
In [285]: A1.A
Out[285]:
array([[ 0, 0, 12, 0, 0, 0, 0, 0, 0, 0],
[ 0, 6, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 60, 0, 0, 144, 180],
[ 0, 0, 0, 0, 0, 66, 98, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 45, 72, 0, 0, 153, 0],
[ 2, 0, 0, 0, 0, 78, 0, 0, 0, 0],
[ 0, 0, 15, 28, 0, 0, 0, 120, 0, 190],
[ 0, 0, 18, 0, 0, 0, 0, 0, 0, 200]])

Timing comparison:

In [287]: %%timeit A1=A.copy()
...: A1 *= B
...:
375 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [288]: %%timeit A1 = A.copy()
...: for i,j,v in zip(A1.indptr[:-1],A1.indptr[1:],np.arange(1,11)):
...:     A1.data[i:j] *= v
...:
79.9 µs ± 1.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
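Applied back to the original problem, the same per-column scaling can be done without any matrix product by operating on the data array of a CSC matrix directly (a sketch, assuming similarities can be converted to CSC and combined_scales is a 1-D numpy array of length m):

import numpy as np
from scipy import sparse

# Convert once so that each column's stored values occupy a contiguous
# slice of the data array, delimited by indptr.
similarities = sparse.csc_matrix(similarities)

# Number of stored values in each column.
counts = np.diff(similarities.indptr)

# Expand the per-column scale factors to one factor per stored value and
# multiply the data array in place; no new matrix is allocated.
similarities.data *= np.repeat(combined_scales, counts)

This touches only the existing data array, so memory use stays at roughly the level of the original matrix plus the small counts and repeated-scales arrays.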

A similar question about python - fast sparse matrix multiplication without allocating a dense array can be found on Stack Overflow: https://stackoverflow.com/questions/49868167/
