
python - Feeding an initial state into LSTMCell


I am working from the code here: https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/rnn_train.py, and I am trying to convert the cells from GRUCell to LSTMCell. Here is an excerpt of the code.

# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin') # [ BATCHSIZE, INTERNALSIZE * NLAYERS]

# using a NLAYERS=3 layers of GRU cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo

# How to properly apply dropout in RNNs: see README.md
cells = [rnn.GRUCell(INTERNALSIZE) for _ in range(NLAYERS)]

# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=False)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep) # dropout for the softmax layer

Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=Hin)
# Yr: [ BATCHSIZE, SEQLEN, INTERNALSIZE ]
# H: [ BATCHSIZE, INTERNALSIZE*NLAYERS ] # this is the last state in the sequence

H = tf.identity(H, name='H') # just to give it a name

I understand that an LSTMCell has two states, the cell state C and the output state H. What I want to do is feed a tuple of both states to initial_state. How can I do this correctly? I have tried various approaches but always run into TensorFlow errors.

Edit: here is one of the attempts:

# inputs
X = tf.placeholder(tf.uint8, [None, None], name='X') # [ BATCHSIZE, SEQLEN ]
Xo = tf.one_hot(X, ALPHASIZE, 1.0, 0.0) # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# expected outputs = same sequence shifted by 1 since we are trying to predict the next character
Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_') # [ BATCHSIZE, SEQLEN ]
Yo_ = tf.one_hot(Y_, ALPHASIZE, 1.0, 0.0) # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin') # [ BATCHSIZE, INTERNALSIZE * NLAYERS]
Cin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Cin')
initial_state = tf.nn.rnn_cell.LSTMStateTuple(Cin, Hin)
# using a NLAYERS=3 layers of GRU cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo

# How to properly apply dropout in RNNs: see README.md
cells = [rnn.LSTMCell(INTERNALSIZE) for _ in range(NLAYERS)]

# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=True)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep) # dropout for the softmax layer

Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=initial_state)

It says "TypeError: 'Tensor' object is not iterable."

Thanks.

Best Answer

The error occurs because, when building the graph, you must provide a separate state tuple (as placeholders) for each layer, and then at training time you must feed the actual state values for each of those layers.

The error is essentially saying: I need to iterate over a list of (c, m) tuples, because you have several cells and I need to initialize all of their states, but all I am seeing is a single tensor, which I cannot iterate over.

This snippet shows how to set up the placeholders when building the graph:

import tensorflow as tf
from tensorflow.contrib import rnn

state_size = 10
num_layers = 3

X = tf.placeholder(tf.float32, [None, 100, 10])

# the second dimension is size 2 and represents
# c, m (the cell state and the hidden state)
# set the batch_size to None
state_placeholder = tf.placeholder(tf.float32,
                                   [num_layers, 2, None, state_size])

# l is a list of num_layers tensors, one per layer,
# each of shape [2, batch_size, state_size]
l = tf.unstack(state_placeholder, axis=0)

# then we create a tuple with one LSTMStateTuple (c, m) per layer
rnn_tuple_state = tuple(
    rnn.LSTMStateTuple(l[idx][0], l[idx][1])
    for idx in range(num_layers)
)

# I had to set reuse=True here : tf.__version__ 1.7.0
cells = [rnn.LSTMCell(10, reuse=True)] * num_layers
mc = rnn.MultiRNNCell(cells, state_is_tuple=True)

outputs, state = tf.nn.dynamic_rnn(cell=mc,
                                   inputs=X,
                                   initial_state=rnn_tuple_state,
                                   dtype=tf.float32)
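
As an aside, the `[cell] * num_layers` pattern above makes all three layers share one LSTMCell object, which is why the reuse=True note is there. If each layer is meant to have its own weights, as in the question's code, the cell construction could instead be written as follows (a sketch, not part of the original answer, reusing state_size and num_layers from the snippet above):

# Hypothetical variant: one independent LSTMCell per layer, so no variables
# are shared between layers and reuse does not need to be set
cells = [rnn.LSTMCell(state_size) for _ in range(num_layers)]
mc = rnn.MultiRNNCell(cells, state_is_tuple=True)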

Here is the relevant part of the docs:

initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size].

So we end up creating a tuple of placeholders, one per cell (layer), each with the required shape (batch_size, state_size), where batch_size = None. I elaborated on this in another answer.
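
To illustrate the "feed the states at training time" part, here is a hypothetical training-loop sketch, not from the original answer: it feeds zeros as the very first state and then passes the state returned by dynamic_rnn back in for the next batch. `sess` and the `batches` iterable are assumed to come from the surrounding training code, and the batch size is fixed at 32 for the example.

import numpy as np

batch_size = 32
# zero initial state in the [num_layers, 2, batch_size, state_size] layout
# expected by state_placeholder
current_state = np.zeros((num_layers, 2, batch_size, state_size),
                         dtype=np.float32)

for x_batch in batches:  # each x_batch has shape [batch_size, 100, 10]
    out_val, layer_states = sess.run(
        [outputs, state],
        feed_dict={X: x_batch, state_placeholder: current_state})
    # `state` comes back as one LSTMStateTuple(c, m) per layer; stacking it
    # with np.asarray restores the [num_layers, 2, batch_size, state_size]
    # layout so it can be fed straight back into the placeholder
    current_state = np.asarray(layer_states)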

Regarding "python - Feeding an initial state into LSTMCell", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49603600/
