
machine-learning - What is the difference between input_shape and batch_input_shape in an LSTM?


Are these just two ways of setting the same thing, or do they actually have different meanings? Does it have anything to do with the network configuration?

In a simple example, I cannot observe any difference between the following:

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1, batch_input_shape=(None,5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))

model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))

However, when I fix the batch size to 12 with batch_input_shape=(12,5,1) and then use batch_size=10 when fitting the model, I get an error:

ValueError: Cannot feed value of shape (10, 5, 1) for Tensor 'lstm_96_input:0', which has shape '(12, 5, 1)'

That obviously makes sense. However, I see no point in restricting the batch size at the model level.

Am I missing something?

Best Answer

Is it just a different way of setting the same thing or do they actually have different meanings? Does it have anything to do with network configuration?

Yes, they are effectively equivalent, and your experiment confirms it; see also this discussion.
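
One quick way to confirm this (a minimal sketch; the single-unit LSTM is arbitrary) is to inspect model.input_shape: both constructions report the same shape, with None in the batch dimension.

from keras.models import Sequential
from keras.layers import LSTM

# Built with input_shape: the batch dimension is left unspecified.
m1 = Sequential()
m1.add(LSTM(1, input_shape=(5, 1)))

# Built with batch_input_shape and an explicit None batch dimension.
m2 = Sequential()
m2.add(LSTM(1, batch_input_shape=(None, 5, 1)))

print(m1.input_shape)  # (None, 5, 1)
print(m2.input_shape)  # (None, 5, 1)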

However I can see no point in restricting the batch size on model level.

Restricting the batch size is sometimes necessary. The example that comes to mind is a stateful LSTM, in which the last cell state of one batch is remembered and used to initialize the next batch. Fixing the batch size at the model level ensures that a client cannot feed batches of differing sizes into the network. Example code (placeholder dimensions added so the snippet runs):

from keras.models import Sequential
from keras.layers import LSTM

# Placeholder dimensions, purely for illustration.
batch_size, timesteps, data_dim = 12, 5, 8

# Expected input batch shape: (batch_size, timesteps, data_dim).
# Note that we have to provide the full batch_input_shape since the network is stateful:
# the sample of index i in batch k is the follow-up for sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
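
When fitting such a stateful model, the batch size passed to fit has to equal the one declared in batch_input_shape, and the cell states are usually reset by hand between passes over the data. Below is a minimal sketch continuing the snippet above, with made-up random data purely for illustration:

import numpy as np

model.compile(loss='mse', optimizer='adam')

# Made-up training data whose shapes match the model above
# (the last LSTM returns sequences of width 32).
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, timesteps, 32))

for epoch in range(5):
    # batch_size here must match the value baked into batch_input_shape,
    # otherwise Keras raises the shape error from the question.
    model.fit(x_train, y_train, batch_size=batch_size, epochs=1, shuffle=False)
    # Clear the carried-over cell states before the next pass.
    model.reset_states()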

Regarding machine-learning - What is the difference between input_shape and batch_input_shape in an LSTM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49374508/
