
python - Tensorflow: How to pass the output of the previous time step as input to the next time step

Reposted. Author: 太空狗. Updated: 2023-10-29 18:09:20

This is a duplicate of the question "How can I feed last output y(t-1) as input for generating y(t) in tensorflow RNN?"

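I want to pass the output of an RNN at time step T as the input at time step T+1, i.e. input_RNN(T+1) = output_RNN(T). According to the documentation, the tf.nn.rnn and tf.nn.dynamic_rnn functions explicitly take the complete input for all time steps.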
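The recurrence being asked for can be sketched without TensorFlow at all. The snippet below is only an illustration: `toy_cell`, its fixed weights, and `rollout` are hypothetical names invented here, and the "cell" is a scalar tanh unit rather than a real LSTM or GRU.

```python
import math


def toy_cell(x, h, w_x=0.5, w_h=0.8):
    """Hypothetical scalar RNN cell: returns (output, new_state)."""
    h_new = math.tanh(w_x * x + w_h * h)
    return h_new, h_new


def rollout(first_input, initial_state, num_steps):
    """Run the cell for num_steps, feeding each output back in as the next input."""
    x, h = first_input, initial_state
    outputs = []
    for _ in range(num_steps):
        y, h = toy_cell(x, h)
        outputs.append(y)
        x = y  # input_RNN(T+1) = output_RNN(T)
    return outputs, h


outputs, final_state = rollout(first_input=1.0, initial_state=0.0, num_steps=4)
```

Only the first input comes from the data; every later input is the previous output, which is exactly what tf.nn.dynamic_rnn does not allow, because it consumes a precomputed input tensor.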

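I looked at the seq2seq example at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py. It uses a loop and calls the cell(input, state) function. The cell can be an LSTM, a GRU, or any other RNN cell. I checked the documentation for the data types and shapes of the arguments to cell(), but I only found the constructor of the form cell(num_neurons). I would like to know the correct way to feed the output back in as input. I don't want to use other libraries or wrappers built on top of TensorFlow, such as Keras. Any suggestions?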
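As a rough guide to the call convention: a TensorFlow 1.x RNN cell is called as cell(inputs, state), with inputs of shape [batch_size, input_dim], and returns a pair (output, new_state). The pure-Python stand-in below mimics that contract with nested lists. `ToyRNNCell` and its constant weights are assumptions made for illustration, not part of TensorFlow.

```python
import math


class ToyRNNCell:
    """Hypothetical stand-in for the (inputs, state) -> (output, new_state) contract."""

    def __init__(self, num_units, input_dim):
        self.num_units = num_units
        # Constant illustrative weights; a real cell would learn these.
        self.w_x = [[0.1] * num_units for _ in range(input_dim)]
        self.w_h = [[0.1] * num_units for _ in range(num_units)]

    def __call__(self, inputs, state):
        # inputs: [batch_size][input_dim], state: [batch_size][num_units]
        new_state = []
        for x_row, h_row in zip(inputs, state):
            row = []
            for j in range(self.num_units):
                pre = sum(x * self.w_x[i][j] for i, x in enumerate(x_row))
                pre += sum(h * self.w_h[k][j] for k, h in enumerate(h_row))
                row.append(math.tanh(pre))
            new_state.append(row)
        # For a basic RNN cell the output is the new state.
        return new_state, new_state


# Feeding the output back in requires num_units == input_dim.
cell = ToyRNNCell(num_units=3, input_dim=3)
batch_inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # batch_size = 2
state = [[0.0] * 3 for _ in batch_inputs]
output, state = cell(batch_inputs, state)  # first step uses real input
output, state = cell(output, state)        # second step reuses the output
```

If the cell's output size differs from its input size, a projection (for example a dense layer) would be needed between the two calls.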

Best answer

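One way to do this is to write your own RNN cell, together with your own multi-RNN cell. That way you can store the output of the last RNN cell internally and access it in the next time step. See this blogpost for more information. You can also add, for example, an encoder or decoder directly in the cell, so that you can process the data before feeding it to the cell or after retrieving it from the cell.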
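A minimal sketch of the custom-cell idea, in plain Python rather than TensorFlow: the wrapper carries the previous output inside its state and feeds it to the wrapped cell, ignoring whatever input the surrounding RNN loop supplies. `FeedbackCell`, `base_cell`, and `initial_state` are hypothetical names, and the linked blogpost may structure this differently.

```python
import math


def base_cell(x, h):
    """Hypothetical scalar RNN cell: returns (output, new_state)."""
    h_new = math.tanh(0.5 * x + 0.8 * h)
    return h_new, h_new


class FeedbackCell:
    """Wraps a cell so that its state also remembers the previous output."""

    def __init__(self, inner_cell):
        self.inner_cell = inner_cell

    def initial_state(self, inner_state, first_input):
        # The "previous output" slot is seeded with the first real input.
        return (inner_state, first_input)

    def __call__(self, inputs, state):
        inner_state, prev_output = state
        # `inputs` is deliberately ignored: the stored output is used instead.
        output, new_inner = self.inner_cell(prev_output, inner_state)
        return output, (new_inner, output)


cell = FeedbackCell(base_cell)
state = cell.initial_state(inner_state=0.0, first_input=1.0)
outputs = []
for external_input in [9.0, 9.0, 9.0]:  # values never reach the inner cell
    y, state = cell(external_input, state)
    outputs.append(y)
```

Because the feedback lives in the state, such a wrapper can in principle be driven by an ordinary RNN loop that still passes a full input sequence.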

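Another possibility is to use the function tf.nn.raw_rnn, which lets you control what happens before and after the call to the RNN cell. The following code snippet shows how to use this function; credit goes to this article.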

from tensorflow.python.ops.rnn import _transpose_batch_time
import tensorflow as tf


def sampling_rnn(self, cell, initial_state, input_, seq_lengths):

    # raw_rnn expects time major inputs as TensorArrays
    max_time = ...  # this is the max time step per batch
    inputs_ta = tf.TensorArray(dtype=tf.float32, size=max_time, clear_after_read=False)
    inputs_ta = inputs_ta.unstack(_transpose_batch_time(input_))  # input_ is the input placeholder
    input_dim = input_.get_shape()[-1].value  # the dimensionality of the input to each time step
    output_dim = ...  # the dimensionality of the model's output at each time step

    def loop_fn(time, cell_output, cell_state, loop_state):
        """
        Loop function that allows to control input to the rnn cell and manipulate cell outputs.
        :param time: current time step
        :param cell_output: output from previous time step or None if time == 0
        :param cell_state: cell state from previous time step
        :param loop_state: custom loop state to share information between different iterations of this loop fn
        :return: tuple consisting of
            elements_finished: tensor of size [batch_size] which is True for sequences that have reached their end,
                needed because of variable sequence size
            next_input: input to next time step
            next_cell_state: cell state forwarded to next time step
            emit_output: The first return argument of raw_rnn. This is not necessarily the output of the RNN cell,
                but could e.g. be the output of a dense layer attached to the rnn layer.
            next_loop_state: loop state forwarded to the next time step
        """
        if cell_output is None:
            # time == 0, used for initialization before first call to cell
            next_cell_state = initial_state
            # the emit_output in this case tells TF how future emits look
            emit_output = tf.zeros([output_dim])
        else:
            # t > 0, called right after call to cell, i.e. cell_output is the output from time t-1.
            # here you can do whatever you want with cell_output before assigning it to emit_output.
            # In this case, we don't do anything
            next_cell_state = cell_state
            emit_output = cell_output

        # check which elements are finished
        elements_finished = (time >= seq_lengths)
        finished = tf.reduce_all(elements_finished)

        # assemble cell input for upcoming time step
        current_output = emit_output if cell_output is not None else None
        input_original = inputs_ta.read(time)  # tensor of shape (None, input_dim)

        if current_output is None:
            # this is the initial step, i.e. there is no output from a previous time step, what we feed here
            # can highly depend on the data. In this case we just assign the actual input in the first time step.
            next_in = input_original
        else:
            # time > 0, so just use previous output as next input
            # here you could do fancier things, whatever you want to do before passing the data into the rnn cell
            # if here you were to pass input_original then you would get the normal behaviour of dynamic_rnn
            next_in = current_output

        next_input = tf.cond(finished,
                             lambda: tf.zeros([self.batch_size, input_dim], dtype=tf.float32),  # copy through zeros
                             lambda: next_in)  # if not finished, feed the previous output as next input

        # set shape manually, otherwise it is not defined for the last dimensions
        next_input.set_shape([None, input_dim])

        # loop state not used in this example
        next_loop_state = None
        return (elements_finished, next_input, next_cell_state, emit_output, next_loop_state)

    outputs_ta, last_state, _ = tf.nn.raw_rnn(cell, loop_fn)
    outputs = _transpose_batch_time(outputs_ta.stack())
    final_state = last_state

    return outputs, final_state
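To make the calling order of loop_fn easier to follow, here is a pure-Python imitation of the loop that tf.nn.raw_rnn runs, loosely based on the pseudocode in its documentation. It handles a single sequence, with no batching, no TensorArrays, and no masking of finished entries. `toy_raw_rnn`, `toy_cell`, and `make_loop_fn` are hypothetical names, so treat it as a sketch of the control flow only.

```python
import math


def toy_cell(x, h):
    """Hypothetical scalar RNN cell: returns (output, new_state)."""
    h_new = math.tanh(0.5 * x + 0.8 * h)
    return h_new, h_new


def toy_raw_rnn(cell, loop_fn):
    """Single-sequence imitation of the raw_rnn control flow."""
    time = 0
    # Initial call: cell_output is None, loop_fn supplies first input and state.
    finished, next_input, state, _, loop_state = loop_fn(time, None, None, None)
    emits = []
    while not finished:
        output, cell_state = cell(next_input, state)
        # loop_fn sees the fresh cell output and decides the next input.
        finished, next_input, state, emit, loop_state = loop_fn(
            time + 1, output, cell_state, loop_state)
        emits.append(emit)
        time += 1
    return emits, state, loop_state


def make_loop_fn(first_input, initial_state, seq_length):
    def loop_fn(time, cell_output, cell_state, loop_state):
        finished = time >= seq_length
        if cell_output is None:
            # time == 0: nothing to emit yet, start from the real input.
            return finished, first_input, initial_state, None, None
        # time > 0: emit the cell output and feed it back as the next input.
        return finished, cell_output, cell_state, cell_output, None
    return loop_fn


emits, final_state, _ = toy_raw_rnn(toy_cell, make_loop_fn(1.0, 0.0, seq_length=3))
```

The point to notice is that loop_fn is called once before the first cell call and once after every cell call, which is why the same function both initializes the loop and turns each output into the next input.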

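As a side note: it is not clear whether relying on the model's own outputs during training is a good idea. Especially at the beginning, the model's outputs can be quite bad, so training may never converge or may fail to learn anything meaningful.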
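One possible middle ground, which is not part of the original answer, is to mix the two input sources: with some probability feed the real input (input_original in the snippet above), otherwise feed the previous output. This is in the spirit of teacher forcing and scheduled sampling. The helper below is a hypothetical plain-Python sketch of that choice, not a TensorFlow implementation.

```python
import random


def choose_next_input(ground_truth, model_output, teacher_prob, rng):
    """Return the ground truth with probability teacher_prob, else the model output."""
    return ground_truth if rng.random() < teacher_prob else model_output


rng = random.Random(0)  # seeded only to make the sketch reproducible

# teacher_prob = 1.0 behaves like dynamic_rnn (always the real input);
# teacher_prob = 0.0 behaves like the pure feedback loop.
always_truth = choose_next_input(1.0, -1.0, teacher_prob=1.0, rng=rng)
always_model = choose_next_input(1.0, -1.0, teacher_prob=0.0, rng=rng)
```

A schedule that starts teacher_prob near 1.0 and lowers it as training progresses is one option to consider, though whether it helps would depend on the task.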

Regarding "python - Tensorflow: How to pass the output of the previous time step as input to the next time step", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39681026/
