algorithm - Looping over (or vectorizing) variable-length matrices in Theano

Reposted by: 塔克拉玛干 | Updated: 2023-11-03 02:45:30

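I have a list L of matrices, where each item M is an x*n matrix (x varies from matrix to matrix, n is constant).

I want to compute the sum of M'*M over all items in L (M' is the transpose of M), as in the following Python code: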

for M in L:
  res += np.dot(M.T, M)

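I actually want to implement this in Theano, which does not support multi-dimensional arrays with variable lengths. I do not want to pad all the matrices to the same size, because that would waste too much space (some of the matrices can be very large).

Is there a better way to do this?

Edit:

L is known before Theano compilation.

Edit:

I received two excellent answers from @DanielRenshaw and @Divakar, and it was emotionally hard to choose one to accept.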

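Best answer

Since the number of matrices is known before Theano compilation has to happen, you can simply use a regular Python list of Theano matrices.

Here is a complete example showing the difference between the numpy and Theano versions.

This code has been updated to include a comparison with @Divakar's vectorized approach, which performs better. There are two vectorized variants for Theano: one where Theano performs the concatenation, and one where numpy performs the concatenation and the result is then passed to Theano.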
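Before the full listing, it may help to see why concatenation gives the same answer as the loop. Stacking the matrices vertically into one matrix C makes C'*C expand, block by block, into the sum of the individual M'*M terms. A minimal NumPy-only sketch, assuming small made-up shapes (the sizes and the seed are illustrative, not from the original answer):

```python
import numpy as np

# Illustrative sizes only: three matrices with different row counts, same n.
rng = np.random.RandomState(0)
n = 4
L = [rng.standard_normal((x, n)) for x in (2, 5, 3)]

# Loop version: accumulate M'M for each matrix.
loop_res = np.zeros((n, n))
for M in L:
    loop_res += np.dot(M.T, M)

# Concatenated version: stack along the row axis, then take one product.
C = np.concatenate(L, axis=0)  # shape (2 + 5 + 3, n)
concat_res = np.dot(C.T, C)

# The two results should agree up to floating-point rounding.
print(np.allclose(loop_res, concat_res))
```

The trade-off is that the concatenated copy needs memory for all rows at once, whereas the loop only ever holds one n*n accumulator in addition to the inputs.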

import timeit
import numpy as np
import theano
import theano.tensor as tt


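def compile_theano_version1(number_of_matrices, n, dtype):
    # One symbolic matrix per input; row counts may differ at call time.
    assert number_of_matrices > 0
    assert n > 0
    L = [tt.matrix() for _ in range(number_of_matrices)]
    # Accumulator with the same (n, n) shape as each M'M term.
    res = tt.zeros((n, n), dtype=dtype)
    for M in L:
        res += tt.dot(M.T, M)
    return theano.function(L, res)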


def compile_theano_version2(number_of_matrices):
    # Theano concatenates the inputs along the row axis, then does one product.
    assert number_of_matrices > 0
    L = [tt.matrix() for _ in range(number_of_matrices)]
    concatenated_L = tt.concatenate(L, axis=0)
    res = tt.dot(concatenated_L.T, concatenated_L)
    return theano.function(L, res)


def compile_theano_version3():
    concatenated_L = tt.matrix()
    res = tt.dot(concatenated_L.T, concatenated_L)
    return theano.function([concatenated_L], res)


def numpy_version1(*L):
    assert len(L) > 0
    n = L[0].shape[1]
    res = np.zeros((n, n), dtype=L[0].dtype)
    for M in L:
        res += np.dot(M.T, M)
    return res


def numpy_version2(*L):
    concatenated_L = np.concatenate(L, axis=0)
    return np.dot(concatenated_L.T, concatenated_L)


def main():
    iteration_count = 100
    number_of_matrices = 20
    n = 300
    min_x = 400
    dtype = 'float64'
    theano_version1 = compile_theano_version1(number_of_matrices, n, dtype)
    theano_version2 = compile_theano_version2(number_of_matrices)
    theano_version3 = compile_theano_version3()
    L = [np.random.standard_normal(size=(x, n)).astype(dtype)
         for x in range(min_x, number_of_matrices + min_x)]

    # Each timing sums the results of iteration_count calls, so the
    # accumulated outputs can be compared by the assertions below.
    start = timeit.default_timer()
    numpy_res1 = sum(numpy_version1(*L)
                     for _ in range(iteration_count))
    print('numpy_version1 %s' % (timeit.default_timer() - start))

    start = timeit.default_timer()
    numpy_res2 = sum(numpy_version2(*L)
                     for _ in range(iteration_count))
    print('numpy_version2 %s' % (timeit.default_timer() - start))

    start = timeit.default_timer()
    theano_res1 = sum(theano_version1(*L)
                      for _ in range(iteration_count))
    print('theano_version1 %s' % (timeit.default_timer() - start))

    start = timeit.default_timer()
    theano_res2 = sum(theano_version2(*L)
                      for _ in range(iteration_count))
    print('theano_version2 %s' % (timeit.default_timer() - start))

    start = timeit.default_timer()
    theano_res3 = sum(theano_version3(np.concatenate(L, axis=0))
                      for _ in range(iteration_count))
    print('theano_version3 %s' % (timeit.default_timer() - start))

    assert np.allclose(numpy_res1, numpy_res2)
    assert np.allclose(numpy_res2, theano_res1)
    assert np.allclose(theano_res1, theano_res2)
    assert np.allclose(theano_res2, theano_res3)


main()

When run, this prints something like:

numpy_version1 1.47830819649
numpy_version2 1.77405482179
theano_version1 1.3603150303
theano_version2 1.81665318145
theano_version3 1.86912039489

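The assertions pass, which shows that the Theano and numpy versions all compute the same result to a high degree of precision. Clearly this precision will decrease if float32 is used instead of float64.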
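To get a feel for the float32 remark, the following NumPy-only sketch computes the same sum in both precisions and reports the largest absolute difference. The helper name `sum_of_gram`, the shapes, and the seed are assumptions made for illustration. The exact figure depends on the data and on the BLAS library in use, so no output is shown.

```python
import numpy as np

def sum_of_gram(matrices):
    """Return the sum of M'M over a list of (x, n) matrices, in their dtype."""
    n = matrices[0].shape[1]
    res = np.zeros((n, n), dtype=matrices[0].dtype)
    for M in matrices:
        res += np.dot(M.T, M)
    return res

rng = np.random.RandomState(0)
n = 8
mats64 = [rng.standard_normal((x, n)) for x in range(50, 55)]  # float64 data
mats32 = [M.astype('float32') for M in mats64]                 # same data, rounded

res64 = sum_of_gram(mats64)
res32 = sum_of_gram(mats32)

# Largest absolute deviation of the float32 result from the float64 one.
max_abs_diff = np.max(np.abs(res64 - res32))
print('max abs difference: %s' % max_abs_diff)
```

If float32 is required, loosening the tolerances passed to `np.allclose` (its `rtol` and `atol` arguments) is the usual way to keep such comparisons meaningful.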

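The timing results show that the vectorized approach may not be preferable, depending on the matrix sizes. In the example above the matrices are large and the non-concatenating approach is faster, but if the n and min_x parameters in the main function are changed to be much smaller, the concatenating approach is faster. Other results may hold when running on a GPU (Theano versions only).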
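One way to explore that size dependence without Theano is to time the two numpy variants for a small and a large configuration. The sketch below only approximates the experiment described above: the helper name `time_both`, the two configurations, and the repeat count are assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

def loop_version(L):
    # Accumulate M'M one matrix at a time.
    n = L[0].shape[1]
    res = np.zeros((n, n), dtype=L[0].dtype)
    for M in L:
        res += np.dot(M.T, M)
    return res

def concat_version(L):
    # Stack all rows into one matrix, then take a single product.
    C = np.concatenate(L, axis=0)
    return np.dot(C.T, C)

def time_both(n, min_x, number_of_matrices=20, repeats=20):
    """Hypothetical helper: time both variants for one (n, min_x) setting."""
    L = [np.random.standard_normal(size=(x, n))
         for x in range(min_x, min_x + number_of_matrices)]
    # Sanity check: both variants must agree before timing them.
    assert np.allclose(loop_version(L), concat_version(L))
    t_loop = timeit.timeit(lambda: loop_version(L), number=repeats)
    t_concat = timeit.timeit(lambda: concat_version(L), number=repeats)
    return t_loop, t_concat

# A small and a larger configuration; both are illustrative choices.
for n, min_x in [(10, 5), (100, 200)]:
    t_loop, t_concat = time_both(n, min_x)
    print('n=%d min_x=%d loop=%.4fs concat=%.4fs' % (n, min_x, t_loop, t_concat))
```

Running it on the target hardware with realistic sizes is the only reliable way to pick between the two approaches.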

For "algorithm - Looping over (or vectorizing) variable-length matrices in Theano", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34511680/
