
python - Creating a sparse diagonal matrix from a row of a sparse matrix

Reposted. Author: 太空狗. Updated: 2023-10-30 01:23:54

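I'm working with fairly large matrices in Python/SciPy. I need to extract rows from a large matrix (loaded as a coo_matrix) and use each one as the diagonal elements of a new matrix. Currently I do it the following way: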

import profile

import numpy as np
from scipy import sparse

def computation(A):
    for i in range(A.shape[0]):
        # densify row i, then place it on the main diagonal
        diag_elems = np.array(A[i,:].todense())
        ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1], format = "csc")
        #...

#create some random matrix
A = (sparse.rand(1000,100000,0.02,format="csc")*5).astype(np.ubyte)
#get timings
profile.run('computation(A)')

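What I see in the profile output is that most of the time is spent in the get_csr_submatrix function while extracting diag_elems. That makes me think that I'm either using an inefficient sparse representation of the initial data or extracting rows from the sparse matrix the wrong way. Can you suggest a better way to extract a row from a sparse matrix and represent it in diagonal form?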

Edit

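The following variant removes the row-extraction bottleneck (note that simply changing 'csc' to 'csr' is not sufficient; A[i,:] must also be replaced with A.getrow(i)). However, the main question is how to skip the materialization (.todense()) and create the diagonal matrix from the sparse representation of a row.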

import profile

import numpy as np
from scipy import sparse

def computation(A):
    for i in range(A.shape[0]):
        # getrow(i) on a CSR matrix avoids the slow generic slicing path
        diag_elems = np.array(A.getrow(i).todense())
        ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1], format = "csc")
        #...

#create some random matrix
A = (sparse.rand(1000,100000,0.02,format="csr")*5).astype(np.ubyte)
#get timings
profile.run('computation(A)')

If I create the diagonal matrix directly from the 1-row CSR matrix, as follows:

diag_elems = A.getrow(i)
ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1])

then I can neither specify the format="csc" argument nor convert ith_diag to CSC format:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/profile.py", line 70, in run
    prof = prof.run(statement)
  File "/usr/local/lib/python2.6/profile.py", line 456, in run
    return self.runctx(cmd, dict, dict)
  File "/usr/local/lib/python2.6/profile.py", line 462, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "<stdin>", line 4, in computation
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/construct.py", line 56, in spdiags
    return dia_matrix((data, diags), shape=(m,n)).asformat(format)
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/base.py", line 211, in asformat
    return getattr(self,'to' + format)()
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/dia.py", line 173, in tocsc
    return self.tocoo().tocsc()
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/coo.py", line 263, in tocsc
    data = np.empty(self.nnz, dtype=upcast(self.dtype))
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/sputils.py", line 47, in upcast
    raise TypeError,'no supported conversion for types: %s' % args
TypeError: no supported conversion for types: object

Best answer

Here's what I came up with:

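def computation(A):
    # A must be in CSR format, so that indptr indexes rows
    for i in range(A.shape[0]):
        # slice of data/indices that holds the stored entries of row i
        idx_begin = A.indptr[i]
        idx_end = A.indptr[i+1]
        row_nnz = idx_end - idx_begin
        diag_elems = A.data[idx_begin:idx_end]
        diag_indices = A.indices[idx_begin:idx_end]
        # put each value at (col, col), i.e. on the main diagonal
        ith_diag = sparse.csc_matrix((diag_elems, (diag_indices, diag_indices)),shape=(A.shape[1], A.shape[1]))
        # drop explicitly stored zeros
        ith_diag.eliminate_zeros()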

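The Python profiler reported 1.464 seconds, versus 5.574 seconds before. This version takes advantage of the underlying dense arrays (indptr, indices, data) that define a sparse matrix. Here's my crash course: A.indptr[i]:A.indptr[i+1] defines which elements of the dense arrays correspond to the non-zero values in row i. A.data is a dense 1-D array of the non-zero values of A, and A.indices holds the columns where those values go.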
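To make the crash course concrete, here is a minimal sketch on a tiny matrix. The matrix values and the helper name row_slices are made up for illustration and are not part of the original answer.

```python
import numpy as np
from scipy import sparse

# Small dense matrix, chosen only for illustration.
dense = np.array([[0, 3, 0, 0],
                  [1, 0, 0, 2],
                  [0, 0, 0, 0]])
A = sparse.csr_matrix(dense)

print(A.indptr)   # [0 1 3 3]  row i occupies positions indptr[i]:indptr[i+1]
print(A.indices)  # [1 0 3]    column index of each stored value
print(A.data)     # [3 1 2]    stored values, row by row

def row_slices(A, i):
    """Hypothetical helper: (columns, values) of the stored entries in row i of a CSR matrix."""
    start, stop = A.indptr[i], A.indptr[i + 1]
    return A.indices[start:stop], A.data[start:stop]

cols, vals = row_slices(A, 1)
print(cols, vals)  # [0 3] [1 2]
```

Row 2 is empty, so indptr[2] equals indptr[3] and its slices have length zero.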

I'd do some more testing to make sure this does the same thing as before. I only checked a few cases.
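One way such a check might look, as a sketch rather than the author's actual test: build each diagonal matrix both ways on a small random CSR matrix and compare the dense results. The function names, the matrix size, and the random seed are assumptions chosen for illustration.

```python
import numpy as np
from scipy import sparse

def diag_from_row_dense(A, i):
    """Reference version: densify row i, then place it on the main diagonal."""
    n = A.shape[1]
    row = A.getrow(i).toarray().ravel()
    return sparse.spdiags(row, 0, n, n, format="csc")

def diag_from_row_sparse(A, i):
    """Same matrix built from the CSR arrays, without densifying the row."""
    n = A.shape[1]
    start, stop = A.indptr[i], A.indptr[i + 1]
    cols = A.indices[start:stop]
    vals = A.data[start:stop]
    D = sparse.csc_matrix((vals, (cols, cols)), shape=(n, n))
    D.eliminate_zeros()  # the ubyte cast below can leave stored zeros
    return D

# Small random CSR matrix (size and seed are arbitrary), cast like in the question.
A = (sparse.random(20, 50, density=0.2, format="csr", random_state=0) * 5).astype(np.ubyte)

all_equal = all(
    np.array_equal(diag_from_row_dense(A, i).toarray(),
                   diag_from_row_sparse(A, i).toarray())
    for i in range(A.shape[0])
)
print(all_equal)
```

Comparing dense arrays is only practical for small test matrices; at the sizes in the question, a sparse comparison would be needed instead.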

Regarding python - creating a sparse diagonal matrix from a row of a sparse matrix, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8339299/
