lstm - Pytorch LSTM : Target Dimension in Calculating Cross Entropy Loss

Reposted by: 行者123. Last updated: 2023-12-01 01:43:48

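I have been trying to use an LSTM in Pytorch (an LSTM followed by a linear layer in a custom model), but I get the following error when calculating the loss:
Assertion `cur_target >= 0 && cur_target < n_classes' failed.
I defined the loss function with: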

criterion = nn.CrossEntropyLoss()

and then called it with:

loss += criterion(output, target)

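The target I pass has dimensions [sequence_length, number_of_classes], and the output has dimensions [sequence_length, 1, number_of_classes].

The examples I was following seemed to do the same thing, but the Pytorch docs on cross entropy loss say something different.

The docs say the target should have dimension (N), where each value satisfies 0 ≤ targets[i] ≤ C−1 and C is the number of classes. I changed the target to that form, but now I get the following error (the sequence length is 75 and there are 55 classes):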

Expected target size (75, 55), got torch.Size([75])

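I have looked for solutions to both errors but still cannot get it to work. I am confused about the correct dimensions of the target and about what the first error actually means (different searches gave very different explanations, and none of the fixes worked).

Thanks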

Best answer

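You can use squeeze() on your output tensor. Called without arguments, it returns a tensor with all dimensions of size 1 removed.

This short snippet uses the shapes you mentioned in your question: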

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# creates random tensor of your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates tensor with random targets (class indices in [0, number_of_classes))
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
# this call fails because output still has shape [75, 1, 55]
loss = criterion(output, target)
print(loss)

This produces the error you describe:

ValueError: Expected target size (75, 55), got torch.Size([75])

So calling squeeze() on your output tensor solves the problem by giving it the correct shape.

Example with the corrected shape:

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# creates random tensor of your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates tensor with random targets (class indices in [0, number_of_classes))
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()

# apply squeeze() on output tensor to change shape from [75, 1, 55] to [75, 55]
loss = criterion(output.squeeze(), target)
print(loss)

Output:

tensor(4.0442)

Calling squeeze() changes your tensor's shape from [75, 1, 55] to [75, 55], so the output and target shapes match.

You can also use other methods to reshape your tensor; what matters is that it has the shape [sequence_length, number_of_classes] rather than [sequence_length, 1, number_of_classes].
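As a minimal sketch of those alternatives (the variable names and the fixed seed are illustrative and not part of the original answer), squeeze(1), view and reshape all produce the same [sequence_length, number_of_classes] tensor. Naming the dimension with squeeze(1) is a little safer than a bare squeeze(), which would also drop the sequence dimension if the sequence length happened to be 1:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # illustrative seed, only to make the run repeatable

sequence_length = 75
number_of_classes = 55
output = torch.rand(sequence_length, 1, number_of_classes)
target = torch.randint(number_of_classes, (sequence_length,))

criterion = nn.CrossEntropyLoss()

# Three equivalent ways to get shape [sequence_length, number_of_classes]
loss_squeeze = criterion(output.squeeze(1), target)
loss_view = criterion(output.view(-1, number_of_classes), target)
loss_reshape = criterion(output.reshape(sequence_length, number_of_classes), target)
print(loss_squeeze, loss_view, loss_reshape)

# Pitfall of a bare squeeze(): with a sequence of length 1 it removes
# the sequence dimension as well as the singleton middle dimension.
single_step = torch.rand(1, 1, number_of_classes)
print(single_step.squeeze().shape)   # torch.Size([55])
print(single_step.squeeze(1).shape)  # torch.Size([1, 55])
```

All three losses are the same value, since the underlying data is identical and only the shape differs.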

Your target should be a LongTensor, i.e. a tensor of type torch.long, containing the class indices. Its shape here is [sequence_length].
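The question does not say how the original [sequence_length, number_of_classes] target was built. Assuming it was a one-hot encoding, one plausible reading of the first error is this: CrossEntropyLoss treats dimension 1 of its input as the class dimension, so an output of shape [75, 1, 55] looks like a problem with a single class, and any target value of 1 falls outside the allowed range 0..C-1. Under that assumption, a sketch of converting a one-hot target to class indices with argmax could look like this (one_hot_target and class_indices are hypothetical names):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # illustrative seed, only to make the run repeatable

sequence_length = 75
number_of_classes = 55

# Hypothetical one-hot target of shape [sequence_length, number_of_classes]
class_indices = torch.randint(number_of_classes, (sequence_length,))
one_hot_target = F.one_hot(class_indices, num_classes=number_of_classes)

# argmax over the class dimension recovers one class index per time step
target = one_hot_target.argmax(dim=1)
print(target.shape, target.dtype)  # torch.Size([75]) torch.int64

output = torch.rand(sequence_length, 1, number_of_classes)
loss = nn.CrossEntropyLoss()(output.squeeze(1), target)
print(loss)
```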

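Edit:
Shapes from the example above when passed to the cross-entropy function:

Output: torch.Size([75, 55])
Target: torch.Size([75])

Here is a more general example of what the output and target should look like for CE. In this case we assume 5 different target classes, and there are three examples for sequences of length 1, 2 and 3: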

import torch
import torch.nn as nn

# init CE Loss function
criterion = nn.CrossEntropyLoss()

# sequence of length 1
output = torch.rand(1, 5)
# in this case the 1st class is our target, index of the 1st class is 0
target = torch.LongTensor([0])
loss = criterion(output, target)
print('Sequence of length 1:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 2
output = torch.rand(2, 5)
# targets are here the 1st class for the first element and the 2nd class for the second element
target = torch.LongTensor([0, 1])
loss = criterion(output, target)
print('\nSequence of length 2:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 3
output = torch.rand(3, 5)
# targets here are the 1st class, the 2nd class and the 2nd class again for the last element of the sequence
target = torch.LongTensor([0, 1, 1])
loss = criterion(output, target)
print('\nSequence of length 3:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

Output:

Sequence of length 1:
Output: tensor([[ 0.1956,  0.0395,  0.6564,  0.4000,  0.2875]]) shape: torch.Size([1, 5])
Target: tensor([ 0]) shape: torch.Size([1])
Loss: tensor(1.7516)

Sequence of length 2:
Output: tensor([[ 0.9905,  0.2267,  0.7583,  0.4865,  0.3220],
        [ 0.8073,  0.1803,  0.5290,  0.3179,  0.2746]]) shape: torch.Size([2, 5])
Target: tensor([ 0,  1]) shape: torch.Size([2])
Loss: tensor(1.5469)

Sequence of length 3:
Output: tensor([[ 0.8497,  0.2728,  0.3329,  0.2278,  0.1459],
        [ 0.4899,  0.2487,  0.4730,  0.9970,  0.1350],
        [ 0.0869,  0.9306,  0.1526,  0.2206,  0.6328]]) shape: torch.Size([3, 5])
Target: tensor([ 0,  1,  1]) shape: torch.Size([3])
Loss: tensor(1.3918)
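To see where these loss values come from, the numbers can be checked by hand. For each step, cross entropy is the negative log of the softmax probability assigned to the target class, and with the default settings CrossEntropyLoss averages over the steps. The sketch below uses only the standard library and the rounded values printed above; the helper names are made up for illustration:

```python
import math

def cross_entropy(logits, target_index):
    """Return -log(softmax(logits)[target_index]) for one time step."""
    log_sum_exp = math.log(sum(math.exp(x) for x in logits))
    return log_sum_exp - logits[target_index]

def mean_cross_entropy(rows, targets):
    """Average the per-step losses, mirroring the default 'mean' reduction."""
    losses = [cross_entropy(row, t) for row, t in zip(rows, targets)]
    return sum(losses) / len(losses)

# Values copied from the "Sequence of length 1" output above
loss_len1 = mean_cross_entropy(
    [[0.1956, 0.0395, 0.6564, 0.4000, 0.2875]], [0])

# Values copied from the "Sequence of length 2" output above
loss_len2 = mean_cross_entropy(
    [[0.9905, 0.2267, 0.7583, 0.4865, 0.3220],
     [0.8073, 0.1803, 0.5290, 0.3179, 0.2746]], [0, 1])

print(loss_len1)  # close to the 1.7516 reported above
print(loss_len2)  # close to the 1.5469 reported above
```

Small differences in the last digits are expected, because the printed tensor values are rounded to four decimals.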

I hope this helps!

Regarding lstm - Pytorch LSTM : Target Dimension in Calculating Cross Entropy Loss, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53455780/
