python-3.x - What is the difference between torch.mm, torch.matmul and torch.mul?


Reposted by: 行者123. Last updated: 2023-12-05 03:17:46

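After reading the PyTorch documentation, I still need help understanding the difference between torch.mm, torch.matmul and torch.mul. Since I do not fully understand them, I cannot explain the difference concisely.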

B = torch.tensor([[ 1.1207],
        [-0.3137],
        [ 0.0700],
        [ 0.8378]])

C = torch.tensor([[ 0.5146,  0.1216, -0.5244,  2.2382]])

print(torch.mul(B,C))

print(torch.matmul(B,C))

print(torch.mm(B,C))

All three produce the following output (i.e. they all seem to perform matrix multiplication):

tensor([[ 0.5767,  0.1363, -0.5877,  2.5084],
        [-0.1614, -0.0381,  0.1645, -0.7021],
        [ 0.0360,  0.0085, -0.0367,  0.1567],
        [ 0.4311,  0.1019, -0.4393,  1.8752]])

A = torch.tensor([[1.8351,2.1536], [-0.8320,-1.4578]])
B = torch.tensor([[2.9355, 0.3450], [0.5708, 1.9957]])
print(torch.mul(A,B))
print(torch.matmul(A,B))
print(torch.mm(A,B))

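This produces different outputs. torch.mul no longer performs matrix multiplication (it multiplies element-wise instead), while the other two still perform matrix multiplication.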

tensor([[ 5.3869,  0.7430],
        [-0.4749, -2.9093]])
tensor([[ 6.6162,  4.9310],
        [-3.2744, -3.1964]])
tensor([[ 6.6162,  4.9310],
        [-3.2744, -3.1964]])

Input:

tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)

tensor1 = 
tensor([[[-0.2267,  0.6311, -0.5689,  1.2712],
         [-0.0241, -0.5362,  0.5481, -0.4534],
         [-0.9773, -0.6842,  0.6927,  0.3363]],

        [[-2.6759,  0.7817,  2.6821,  0.7037],
         [ 0.1804,  0.3938, -1.2235,  0.8729],
         [-1.9873, -0.5030,  0.0945,  0.2688]],

        [[ 0.4244,  1.7350,  0.0558, -0.1861],
         [-0.9063, -0.4737, -0.4284, -0.3883],
         [ 0.4827, -0.2628,  1.0084,  0.2769]],

        [[ 0.2939,  0.4604,  0.8014, -1.8760],
         [ 1.8807,  0.1623,  0.2344, -0.6221],
         [ 1.3964,  3.1637,  0.7889,  0.1195]],

        [[-0.7202,  1.4250,  2.4302,  1.4811],
         [-0.2301,  0.6280,  0.5379,  0.5178],
         [-2.1073, -1.4399, -0.9451,  0.8534]],

        [[ 2.8178, -0.4451, -0.7871, -0.5198],
         [ 0.2825,  1.0692,  0.1559,  1.2945],
         [-0.5828, -1.6287, -2.0661, -0.4107]],

        [[ 0.5077, -0.6349, -0.0160, -0.4477],
         [-0.8070,  0.3746,  1.1852,  0.0351],
         [-0.6454,  1.5877,  0.8561,  1.1021]],

        [[ 0.1191,  1.0116,  0.5807,  1.2105],
         [-0.5403,  1.2404,  1.1532,  0.6537],
         [ 1.4757, -1.3648, -1.7158, -1.0289]],

        [[-0.1326,  0.3715,  0.2429, -0.0794],
         [ 0.3224, -0.3064,  0.1963,  0.7276],
         [ 0.9098,  1.5984, -1.4953,  0.0420]],

        [[ 0.1511,  0.9691, -0.5204,  0.3858],
         [ 0.4566,  1.5482, -0.3401,  0.5960],
         [-0.9998,  0.7198,  0.9286,  0.4498]]])

tensor2 =
tensor([-1.6350,  1.0335, -0.9023,  0.0696])

print(torch.mul(tensor1,tensor2))
print(torch.matmul(tensor1,tensor2))
print(torch.mm(tensor1,tensor2))

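The outputs are completely different. I think torch.mul broadcasts and multiplies every group of 4 elements of the tensor by the vector tensor2, i.e. [-0.2267, 0.6311, -0.5689, 1.2712] x tensor2 element-wise, [-0.0241, -0.5362, 0.5481, -0.4534] x tensor2 element-wise, and so on. I do not understand what torch.matmul is doing. I think it has something to do with the fifth bullet point of the documentation ("If both arguments ..."), but I cannot make sense of it. https://pytorch.org/docs/stable/generated/torch.matmul.html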

I think torch.mm fails to produce an output because it cannot broadcast (please correct me if I am wrong).

tensor([[[ 3.7071e-01,  6.5221e-01,  5.1335e-01,  8.8437e-02],
         [ 3.9400e-02, -5.5417e-01, -4.9460e-01, -3.1539e-02],
         [ 1.5979e+00, -7.0715e-01, -6.2499e-01,  2.3398e-02]],

        [[ 4.3752e+00,  8.0790e-01, -2.4201e+00,  4.8957e-02],
         [-2.9503e-01,  4.0699e-01,  1.1040e+00,  6.0723e-02],
         [ 3.2494e+00, -5.1981e-01, -8.5253e-02,  1.8701e-02]],

        [[-6.9397e-01,  1.7931e+00, -5.0379e-02, -1.2945e-02],
         [ 1.4818e+00, -4.8954e-01,  3.8657e-01, -2.7010e-02],
         [-7.8920e-01, -2.7163e-01, -9.0992e-01,  1.9265e-02]],

        [[-4.8055e-01,  4.7582e-01, -7.2309e-01, -1.3051e-01],
         [-3.0750e+00,  1.6770e-01, -2.1146e-01, -4.3281e-02],
         [-2.2832e+00,  3.2697e+00, -7.1183e-01,  8.3139e-03]],

        [[ 1.1775e+00,  1.4727e+00, -2.1928e+00,  1.0304e-01],
         [ 3.7617e-01,  6.4900e-01, -4.8534e-01,  3.6025e-02],
         [ 3.4455e+00, -1.4882e+00,  8.5277e-01,  5.9369e-02]],

        [[-4.6072e+00, -4.6005e-01,  7.1024e-01, -3.6160e-02],
         [-4.6191e-01,  1.1051e+00, -1.4067e-01,  9.0053e-02],
         [ 9.5283e-01, -1.6833e+00,  1.8643e+00, -2.8571e-02]],

        [[-8.3005e-01, -6.5622e-01,  1.4461e-02, -3.1148e-02],
         [ 1.3195e+00,  3.8716e-01, -1.0694e+00,  2.4421e-03],
         [ 1.0553e+00,  1.6409e+00, -7.7250e-01,  7.6669e-02]],

        [[-1.9477e-01,  1.0455e+00, -5.2398e-01,  8.4209e-02],
         [ 8.8343e-01,  1.2820e+00, -1.0405e+00,  4.5478e-02],
         [-2.4128e+00, -1.4106e+00,  1.5482e+00, -7.1578e-02]],

        [[ 2.1675e-01,  3.8391e-01, -2.1914e-01, -5.5219e-03],
         [-5.2707e-01, -3.1668e-01, -1.7711e-01,  5.0619e-02],
         [-1.4876e+00,  1.6520e+00,  1.3493e+00,  2.9198e-03]],

        [[-2.4706e-01,  1.0015e+00,  4.6955e-01,  2.6842e-02],
         [-7.4663e-01,  1.6001e+00,  3.0685e-01,  4.1462e-02],
         [ 1.6347e+00,  7.4395e-01, -8.3792e-01,  3.1291e-02]]])
tensor([[ 1.6247, -1.0409,  0.2891],
        [ 2.8120,  1.2767,  2.6630],
        [ 1.0358,  1.3518, -1.9515],
        [-0.8583, -3.1620,  0.2830],
        [ 0.5605,  0.5759,  2.8694],
        [-4.3932,  0.5925,  1.1053],
        [-1.5030,  0.6397,  2.0004],
        [ 0.4109,  1.1704, -2.3467],
        [ 0.3760, -0.9702,  1.5165],
        [ 1.2509,  1.2018,  1.5720]])

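Best answer

In short:

torch.mm - performs a matrix multiplication without broadcasting - (2D tensor) by (2D tensor)
torch.mul - performs an element-wise multiplication with broadcasting - (Tensor) by (Tensor or Number)
torch.matmul - matrix product with broadcasting - (Tensor) by (Tensor), with different behavior depending on the tensor shapes (dot product, matrix product, batched matrix products).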

Some details:

torch.mm - performs a matrix multiplication without broadcasting

It expects two 2D tensors, so n×m * m×p = n×p

From the documentation https://pytorch.org/docs/stable/generated/torch.mm.html :

This function does not broadcast. For broadcasting matrix products, see torch.matmul().
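As a minimal sketch of that restriction (the tensors a, b and batch below are made up for illustration and are not from the question), torch.mm accepts an n×m by m×p pair but rejects a 3D input, while torch.matmul takes the same 3D input and broadcasts the matrix across the batch:

```python
import torch

a = torch.ones(2, 3)   # n x m
b = torch.ones(3, 5)   # m x p
out = torch.mm(a, b)   # n x p
print(out.shape)       # torch.Size([2, 5])

# Illustrative 3D input: torch.mm only accepts 2D tensors, so this raises.
batch = torch.ones(10, 2, 3)
try:
    torch.mm(batch, b)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("torch.mm raised:", err)

# torch.matmul accepts the same inputs and applies b to every matrix in the batch.
batched_out = torch.matmul(batch, b)
print(batched_out.shape)  # torch.Size([10, 2, 5])
```

This is also why torch.mm(tensor1, tensor2) in the question produces no output: tensor1 is 3D and tensor2 is 1D, and neither is a 2D matrix.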

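torch.mul - performs an element-wise multiplication with broadcasting - (Tensor) by (Tensor or Number)

Docs: https://pytorch.org/docs/stable/generated/torch.mul.html

torch.mul does not perform a matrix multiplication. It broadcasts the two tensors and multiplies them element-wise. So when you use it with tensors of shape 4x1 and 1x4, as in your first example, both are broadcast to 4x4 before the element-wise product, which is why the result happens to coincide with the matrix product there. It works like this smaller 3x1 by 1x3 case: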

import torch

a = torch.tensor([[1.0], [2.0], [3.0]])   # shape 3x1 (column)
b = torch.tensor([[1.0, 10.0, 100.0]])    # shape 1x3 (row)
a, b = torch.broadcast_tensors(a, b)
print(a)
print(b)
print(a * b)

tensor([[1., 1., 1.],
        [2., 2., 2.],
        [3., 3., 3.]])
tensor([[  1.,  10., 100.],
        [  1.,  10., 100.],
        [  1.,  10., 100.]])
tensor([[  1.,  10., 100.],
        [  2.,  20., 200.],
        [  3.,  30., 300.]])
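To check that this coincidence is specific to a column times a row, here is a small sketch. The tensors col, row, m1 and m2 are illustrative values chosen for easy arithmetic, not the ones from the question:

```python
import torch

col = torch.tensor([[1.0], [2.0], [3.0], [4.0]])    # shape 4x1
row = torch.tensor([[1.0, 10.0, 100.0, 1000.0]])    # shape 1x4

# Broadcast element-wise product and matrix (outer) product agree for 4x1 by 1x4.
same_for_outer = torch.allclose(torch.mul(col, row), torch.mm(col, row))
print(same_for_outer)  # True

# For two square matrices the two operations give different results.
m1 = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
m2 = torch.tensor([[5.0, 6.0], [7.0, 8.0]])
elementwise = torch.mul(m1, m2)   # [[ 5., 12.], [21., 32.]]
matrix = torch.mm(m1, m2)         # [[19., 22.], [43., 50.]]
print(elementwise)
print(matrix)
```

With a 4x1 and a 1x4 input, every entry of the matrix product is a sum over a single term, so it equals the broadcast element-wise product. With 2x2 inputs each entry of the matrix product sums two terms, and the results diverge.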

torch.matmul

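It is best to check the official documentation, https://pytorch.org/docs/stable/generated/torch.matmul.html, because torch.matmul behaves differently depending on the shapes of the input tensors. It can perform a dot product, a matrix-matrix product, or batched matrix products with broadcasting.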
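A quick way to see these modes is to look at the output shape for each combination of input dimensions. The sketch below uses arbitrary random tensors (v, w, m, n and batch are illustrative names) and only inspects shapes:

```python
import torch

v = torch.randn(4)
w = torch.randn(4)
m = torch.randn(3, 4)
n = torch.randn(4, 5)
batch = torch.randn(10, 3, 4)

shapes = {
    "1D @ 1D": tuple(torch.matmul(v, w).shape),      # dot product -> ()
    "2D @ 2D": tuple(torch.matmul(m, n).shape),      # matrix product -> (3, 5)
    "1D @ 2D": tuple(torch.matmul(v, n).shape),      # vector-matrix -> (5,)
    "2D @ 1D": tuple(torch.matmul(m, v).shape),      # matrix-vector -> (3,)
    "3D @ 1D": tuple(torch.matmul(batch, v).shape),  # batched matrix-vector -> (10, 3)
    "3D @ 2D": tuple(torch.matmul(batch, n).shape),  # batched matrix-matrix -> (10, 3, 5)
}
for case, shape in shapes.items():
    print(case, shape)
```

The "3D @ 1D" row corresponds to the shapes in the question.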

As for your question about the product of:

tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)

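It is a batched version of a matrix-vector product: each of the 10 matrices of shape 3x4 is multiplied by the vector of length 4, which gives a result of shape 10x3. Please check this simple example to understand it: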

import torch

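# shape 3x1x3: a batch of three 1x3 matrices
a = torch.tensor([[[1.0, 2.0, 3.0]], [[3.0, 4.0, 5.0]], [[6.0, 7.0, 8.0]]])
# shape 3: a single vector shared by the whole batch
b = torch.tensor([1.0, 10.0, 100.0])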
r1 = torch.matmul(a, b)

r2 = torch.stack((
    torch.matmul(a[0], b),
    torch.matmul(a[1], b),
    torch.matmul(a[2], b),
))
assert torch.allclose(r1, r2)

So it can be seen as multiple operations stacked together along the batch dimension.
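For the shapes in the question, the two printed outputs are also directly related: summing the torch.mul result over its last dimension gives the torch.matmul result. The following sketch uses freshly generated random tensors with the same shapes, not the exact values printed in the question:

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)

product = torch.mul(tensor1, tensor2)     # broadcast element-wise, shape (10, 3, 4)
batched = torch.matmul(tensor1, tensor2)  # batched matrix-vector, shape (10, 3)

# Each matmul entry is the dot product of one length-4 row with tensor2,
# i.e. the sum of the corresponding row of the element-wise product.
matches = torch.allclose(batched, product.sum(dim=-1), atol=1e-6)
print(product.shape, batched.shape, matches)
```

You can verify this on the question's own numbers: the first row of the torch.mul output sums to 0.37071 + 0.65221 + 0.51335 + 0.08844 ≈ 1.6247, which is the first entry of the torch.matmul output.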

It may also be useful to read about broadcasting:

https://pytorch.org/docs/stable/notes/broadcasting.html#broadcasting-semantics

Source: this question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/73924697/
