
python - Deleting a row from a scipy matrix

Reposted. Author: 行者123. Updated: 2023-12-01 02:30:28

I have a scipy sparse matrix data and an integer n that corresponds to a row of data I want to delete. To delete this row, I tried the following:

data = sparse.csr_matrix(np.delete(np.array(data),n, axis=0))

However, this produced the following error:

Traceback (most recent call last):
  File "...", line 260, in <module>
    X_labeled = sparse.csr_matrix(np.delete(np.array(X_labeled),n, axis=0))
  File "/anaconda3/lib/python3.6/site-packages/scipy/sparse/compressed.py", line 79, in __init__
    self._set_self(self.__class__(coo_matrix(arg1, dtype=dtype)))
  File "/anaconda3/lib/python3.6/site-packages/scipy/sparse/coo.py", line 177, in __init__
    self.row, self.col = M.nonzero()
SystemError: <built-in method nonzero of numpy.ndarray object at 0x113c883f0> returned a result with an error set

When I run:

data = np.delete(data.toarray(),n, axis=0)

I get this error:

Traceback (most recent call last):
  File "...", line 261, in <module>
    X_labeled = np.delete(X_labeled.toarray(),n, axis=0)
  File "/anaconda3/lib/python3.6/site-packages/numpy/lib/function_base.py", line 4839, in delete
    "size %i" % (obj, axis, N))
IndexError: index 86 is out of bounds for axis 0 with size 4

When I run this:

print(type(data))
print(data.shape)
print(data.toarray().shape)

I get:

<class 'scipy.sparse.csr.csr_matrix'>
(4, 2740)
(4, 2740)

Best answer

The correct way to convert a sparse matrix to a dense array is toarray, not np.array(...):

In [408]: M = sparse.csr_matrix(np.eye(3))
In [409]: M
Out[409]:
<3x3 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in Compressed Sparse Row format>
In [410]: np.array(M)
Out[410]:
array(<3x3 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in Compressed Sparse Row format>, dtype=object)

This is a single-element, object-dtype array that simply wraps the sparse matrix, unchanged. It is not a dense numeric array.

In [411]: M.toarray()
Out[411]:
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

delete works on this proper array:

In [414]: data = sparse.csr_matrix(np.delete(M.toarray(),1, axis=0))
In [415]: data
Out[415]:
<2x3 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in Compressed Sparse Row format>
In [416]: data.A
Out[416]:
array([[ 1.,  0.,  0.],
       [ 0.,  0.,  1.]])

Indexing does the same thing, without leaving the sparse format:

In [417]: M[[0,2],:]
Out[417]:
<2x3 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in Compressed Sparse Row format>
In [418]: _.A
Out[418]:
array([[ 1.,  0.,  0.],
       [ 0.,  0.,  1.]])
In [420]: M[np.array([True,False,True]),:].A
Out[420]:
array([[ 1.,  0.,  0.],
       [ 0.,  0.,  1.]])
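
In [417] the rows to keep are written out by hand. For an arbitrary n, one possible sketch is to delete n from the vector of row indices rather than from the matrix itself. The names keep and reduced are illustrative only and do not appear in the original answer:

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.eye(3))
n = 1  # row to drop (illustrative value)

# Delete n from the small vector of row indices, not from the matrix,
# so the matrix never has to be converted to a dense array.
keep = np.delete(np.arange(M.shape[0]), n)

# Fancy-index the sparse matrix with the remaining row numbers.
reduced = M[keep, :]
print(reduced.toarray())
```

Passing a list of indices as the second argument to np.delete should extend this to several rows at once.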

My guess is that the indexing route is faster, but we would have to time both on realistically sized arrays to be sure.
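
A rough way to test that guess is sketched here with timeit. The matrix size, density, row number, and the helper names via_dense and via_mask are all assumptions made for illustration. The timings depend on the machine, the SciPy version, and the sparsity pattern, so treat the output as indicative only:

```python
import timeit

import numpy as np
from scipy import sparse

# Hypothetical test matrix; size and density are arbitrary choices.
M = sparse.random(1000, 1000, density=0.01, format="csr")
n = 500  # row to drop (arbitrary)

def via_dense(mat, n):
    # Round trip through a dense array, as in In [414].
    return sparse.csr_matrix(np.delete(mat.toarray(), n, axis=0))

def via_mask(mat, n):
    # Boolean row mask applied directly to the sparse matrix.
    mask = np.ones(mat.shape[0], dtype=bool)
    mask[n] = False
    return mat[mask, :]

# Sanity check: both routes should produce the same matrix.
same = (via_dense(M, n) != via_mask(M, n)).nnz == 0

t_dense = timeit.timeit(lambda: via_dense(M, n), number=5)
t_mask = timeit.timeit(lambda: via_mask(M, n), number=5)
print(f"same result: {same}")
print(f"dense round trip: {t_dense:.4f}s, mask indexing: {t_mask:.4f}s")
```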

Internally, delete is fairly complex, but for some inputs it does something similar: it builds a boolean array with False at the rows to be deleted.


Building the boolean mask:

In [421]: mask=np.ones((3,),bool)
In [422]: mask[1]=False
In [423]: M[mask,:].A
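
The mask approach can be wrapped in a small helper. This is a minimal sketch, and delete_row is a hypothetical name, not a SciPy function. It also checks n against the number of rows first: the second traceback in the question is what NumPy raises when n is 86 but the matrix has only 4 rows, so no conversion method would have helped there.

```python
import numpy as np
from scipy import sparse

def delete_row(mat, n):
    """Return a copy of sparse matrix `mat` without row `n` (hypothetical helper)."""
    n_rows = mat.shape[0]
    if not 0 <= n < n_rows:
        raise IndexError(f"row {n} is out of bounds for a matrix with {n_rows} rows")
    mask = np.ones(n_rows, dtype=bool)  # True means "keep this row"
    mask[n] = False
    return mat.tocsr()[mask, :]

M = sparse.csr_matrix(np.eye(3))
print(delete_row(M, 1).toarray())  # rows 0 and 2 remain

# Same shape as in the question: 4 rows, so row 86 does not exist.
data = sparse.csr_matrix((4, 2740))
try:
    delete_row(data, 86)
except IndexError as exc:
    print(exc)
```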

This answer to "python - Deleting a row from a scipy matrix" comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/46880944/
