python-3.x - Equivalent of `return_sequences = False` in PyTorch LSTM

Reposted by: 行者123 Updated: 2023-12-04 00:56:29

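In tensorflow/keras, we can simply set `return_sequences = False` on the last LSTM layer before the classification/fully connected/activation (softmax/sigmoid) layer to get rid of the temporal dimension.

In PyTorch, I have not found anything similar. For a classification task, I do not need a sequence-to-sequence model but a many-to-one architecture.

Here is my simple bidirectional LSTM model.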

import torch
from torch import nn

class BiLSTMClassifier(nn.Module):
    def __init__(self):
        super(BiLSTMClassifier, self).__init__()
        self.embedding = torch.nn.Embedding(num_embeddings = 65000, embedding_dim = 64)
        self.bilstm = torch.nn.LSTM(input_size = 64, hidden_size = 8, num_layers = 2,
                                    batch_first = True, dropout = 0.2, bidirectional = True)
        # as we have 5 classes
        self.linear = nn.Linear(8*2*512, 5) # last dimension
    def forward(self, x):
        x = self.embedding(x)
        print(x.shape)
        x, _ = self.bilstm(x)
        print(x.shape)
        x = self.linear(x.reshape(x.shape[0], -1))
        print(x.shape)
        return x

# create our model

bilstmclassifier = BiLSTMClassifier()

If I look at the shapes after each layer,

xx = torch.tensor(X_encoded[0]).reshape(1,512)
print(xx.shape) 
# torch.Size([1, 512])
bilstmclassifier(xx)
#torch.Size([1, 512, 64])
#torch.Size([1, 512, 16])
#torch.Size([1, 5])

What should I do so that the last LSTM returns a tensor of shape (1, 16) instead of (1, 512, 16)?

Best answer

The simplest way is to index the tensor:

x = x[:, -1, :]

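where `x` is the RNN output. Of course, if `batch_first` is `False`, you must use `x[-1, :, :]` (or just `x[-1]`) instead to index into the time axis. It turns out this is the same thing that Tensorflow/Keras does. The relevant code can be found in `K.rnn`: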

last_output = tuple(o[-1] for o in outputs)

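Note that the code at this point uses the `time_major` data format, so the index is on the first axis. Also, `outputs` is a tuple because it can hold multiple layers, state/cell pairs, and so on, but it is usually the output sequence for all time steps.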
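To make the layout point concrete, here is a minimal sketch showing that the last time step is the same data whether the tensor is stored batch-first or time-major. The tensor sizes and variable names are assumptions chosen only for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical RNN output in batch-first layout: (batch, time, features).
batch_first_out = torch.randn(3, 7, 16)

# The same data viewed in time-major layout: (time, batch, features).
time_major_out = batch_first_out.transpose(0, 1)

# Select the last time step under each layout.
last_batch_first = batch_first_out[:, -1, :]   # index the second axis
last_time_major = time_major_out[-1]           # index the first axis

print(last_batch_first.shape)                             # torch.Size([3, 16])
print(torch.equal(last_batch_first, last_time_major))     # True
```

Only the position of the time axis changes between the two layouts, which is why the Keras code can index with `o[-1]`.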

`last_output` is then used in the `RNN` class as follows:

if self.return_sequences:
    output = K.maybe_convert_to_ragged(is_ragged_input, outputs, row_lengths)
else:
    output = last_output

So overall, we can see that `return_sequences=False` simply uses `outputs[-1]`.
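Applied to the model from the question, a minimal sketch might look like the following. The class name `BiLSTMLastStep` and the `use_final_states` flag are hypothetical and exist only for illustration, and random token ids stand in for `X_encoded`. One caveat the answer does not cover: with `bidirectional = True`, the slice `x[:, -1, :]` holds the backward direction's output after it has seen only the final token, so the sketch also shows the alternative of concatenating the top layer's final hidden states `h_n[-2]` (forward) and `h_n[-1]` (backward). In both cases the linear layer takes `8*2` input features instead of `8*2*512`.

```python
import torch
from torch import nn


class BiLSTMLastStep(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, use_final_states=False):
        super().__init__()
        self.use_final_states = use_final_states  # hypothetical flag
        self.embedding = nn.Embedding(num_embeddings=65000, embedding_dim=64)
        self.bilstm = nn.LSTM(input_size=64, hidden_size=8, num_layers=2,
                              batch_first=True, dropout=0.2, bidirectional=True)
        # 8 hidden units * 2 directions; the time dimension is no longer flattened in
        self.linear = nn.Linear(8 * 2, 5)

    def forward(self, x):
        x = self.embedding(x)             # (batch, time, 64)
        out, (h_n, _) = self.bilstm(x)    # out: (batch, time, 16)
        if self.use_final_states:
            # final forward and backward hidden states of the top layer
            last = torch.cat([h_n[-2], h_n[-1]], dim=1)   # (batch, 16)
        else:
            # plain last-time-step slice, as in the answer
            last = out[:, -1, :]                          # (batch, 16)
        return self.linear(last)                          # (batch, 5)


# Random token ids stand in for the question's X_encoded data.
tokens = torch.randint(0, 65000, (1, 512))
model = BiLSTMLastStep().eval()
with torch.no_grad():
    logits = model(tokens)
print(logits.shape)  # torch.Size([1, 5])
```

If the batch contains padded sequences of different lengths, the last index may point at padding, so this sketch assumes every sequence fills all 512 positions.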

Regarding python-3.x - Equivalent of `return_sequences = False` in PyTorch LSTM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62204109/
