
python - Fast sparse matrix multiplication without allocating a dense array


I have an m x m sparse matrix, similarities, and a vector with m elements, combined_scales. I want to multiply the i-th column of similarities by combined_scales[i]. Here is my first attempt:

for i in range(m):
    scale = combined_scales[i]
    similarities[:, i] *= scale

This is semantically correct but performs poorly, so I tried changing it to:

# sparse.diags creates a diagonal matrix.
# docs: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.diags.html
similarities *= sparse.diags(combined_scales)

But I immediately got a MemoryError when running this line. Oddly, scipy seems to be trying to allocate a dense numpy array here:

Traceback (most recent call last):
File "main.py", line 108, in <module>
loop.run_until_complete(main())
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 466, in run_until_complete
return future.result()
File "main.py", line 100, in main
magic.fit(df)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 127, in fit
self._scale_similarities(X, net_similarities)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 148, in _scale_similarities
similarities *= sparse.diags(combined_scales)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\base.py", line 440, in __mul__
return self._mul_sparse_matrix(other)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\compressed.py", line 503, in _mul_sparse_matrix
data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))
MemoryError

How do I prevent it from allocating a dense array here? Thanks.

Best Answer

From sparse.compressed:

class _cs_matrix:    # common for csr and csc
    def _mul_sparse_matrix(self, other):
        M, K1 = self.shape
        K2, N = other.shape

        major_axis = self._swap((M, N))[0]
        other = self.__class__(other)  # convert to this format

        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=M*N)
        indptr = np.empty(major_axis + 1, dtype=idx_dtype)

        fn = getattr(_sparsetools, self.format + '_matmat_pass1')
        fn(M, N,
           np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           indptr)

        nnz = indptr[-1]
        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=nnz)
        indptr = np.asarray(indptr, dtype=idx_dtype)
        indices = np.empty(nnz, dtype=idx_dtype)
        data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

        fn = getattr(_sparsetools, self.format + '_matmat_pass2')
        fn(M, N, np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           self.data,
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           other.data,
           indptr, indices, data)

        return self.__class__((data, indices, indptr), shape=(M, N))

similarities is a sparse csr matrix. other is the diags matrix, which has also been converted to csr:

other = self.__class__(other) 

csr_matmat_pass1 (compiled code) runs with the indices from self and other, and returns nnz, the number of nonzero terms in the output.

It then allocates the indptr, indices, and data arrays that will hold the results of csr_matmat_pass2. These are used to create the returned matrix:

self.__class__((data,indices,indptr),shape=(M,N))

The error occurs while creating the data array:

data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

There are simply too many nonzero values in the result for your memory.

What are m and similarities.nnz?

Is there enough memory to do similarities.copy()?

When you use similarities *= ..., it first has to do similarities * other. Only then does the result replace self. It does not attempt an in-place multiplication.
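To get a sense of the numbers involved, a quick check along these lines may help (a minimal sketch, assuming similarities is a scipy.sparse CSR matrix holding float64 values):

m = similarities.shape[0]
nnz = similarities.nnz
# Rough size of a data array with nnz float64 values, at 8 bytes each;
# the indices and indptr arrays add more memory on top of this.
print(m, nnz, nnz * 8 / 1e6, 'MB')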

Iterating over columns in place

There are many questions about iterating faster over rows (or columns), seeking to do things like sorting or finding the largest row values. Working directly with the csr attributes can speed that up considerably. I think the idea applies here.

Example:

In [275]: A = sparse.random(10,10,.2,'csc').astype(int)
In [276]: A.data[:] = np.arange(1,21)
In [277]: A.A
Out[277]:
array([[ 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[ 0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 10, 0, 0, 16, 18],
[ 0, 0, 0, 0, 0, 11, 14, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 8, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 9, 12, 0, 0, 17, 0],
[ 2, 0, 0, 0, 0, 13, 0, 0, 0, 0],
[ 0, 0, 5, 7, 0, 0, 0, 15, 0, 19],
[ 0, 0, 6, 0, 0, 0, 0, 0, 0, 20]])
In [280]: B = sparse.diags(np.arange(1,11),dtype=int)
In [281]: B
Out[281]:
<10x10 sparse matrix of type '<class 'numpy.int64'>'
with 10 stored elements (1 diagonals) in DIAgonal format>
In [282]: (A*B).A
Out[282]:
array([[ 0, 0, 12, 0, 0, 0, 0, 0, 0, 0],
[ 0, 6, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 60, 0, 0, 144, 180],
[ 0, 0, 0, 0, 0, 66, 98, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 45, 72, 0, 0, 153, 0],
[ 2, 0, 0, 0, 0, 78, 0, 0, 0, 0],
[ 0, 0, 15, 28, 0, 0, 0, 120, 0, 190],
[ 0, 0, 18, 0, 0, 0, 0, 0, 0, 200]], dtype=int64)

Iterating over the columns in place:

In [283]: A1=A.copy()
In [284]: for i,j,v in zip(A1.indptr[:-1],A1.indptr[1:],np.arange(1,11)):
...:     A1.data[i:j] *= v
...:
In [285]: A1.A
Out[285]:
array([[ 0, 0, 12, 0, 0, 0, 0, 0, 0, 0],
[ 0, 6, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 60, 0, 0, 144, 180],
[ 0, 0, 0, 0, 0, 66, 98, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 45, 72, 0, 0, 153, 0],
[ 2, 0, 0, 0, 0, 78, 0, 0, 0, 0],
[ 0, 0, 15, 28, 0, 0, 0, 120, 0, 190],
[ 0, 0, 18, 0, 0, 0, 0, 0, 0, 200]])

Timing comparison:

In [287]: %%timeit A1=A.copy()
...: A1 *= B
...:
375 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [288]: %%timeit A1 = A.copy()
...: for i,j,v in zip(A1.indptr[:-1],A1.indptr[1:],np.arange(1,11)):
...:     A1.data[i:j] *= v
...:
79.9 µs ± 1.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
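Applied back to the original problem, the same per-column scaling can be done without any matrix product by operating on the data array of a CSC matrix directly (a sketch, assuming similarities can be converted to CSC and combined_scales is a 1-D numpy array of length m):

import numpy as np
from scipy import sparse

# Convert once so that each column's stored values occupy a contiguous
# slice of the data array, delimited by indptr.
similarities = sparse.csc_matrix(similarities)

# Number of stored values in each column.
counts = np.diff(similarities.indptr)

# Expand the per-column scale factors to one factor per stored value and
# multiply the data array in place; no new matrix is allocated.
similarities.data *= np.repeat(combined_scales, counts)

This touches only the existing data array, so memory use stays at roughly the level of the original matrix plus the small counts and repeated-scales arrays.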

A similar question about python - fast sparse matrix multiplication without allocating a dense array can be found on Stack Overflow: https://stackoverflow.com/questions/49868167/
