I have written my own code with reference to this wonderful tutorial. From my understanding of the class AttentionModel, the _build_decoder_cell function creates a separate decoder cell and attention wrapper for inference mode, assuming this (which I think is incorrect, and I can't find a way around it). When I combine attention with beam search, I am not able to get results:
with tf.name_scope("Decoder"):
    mem_units = 2 * dim
    dec_cell = tf.contrib.rnn.BasicLSTMCell(2 * dim)
    beam_cel = tf.contrib.rnn.BasicLSTMCell(2 * dim)
    beam_width = 3
    out_layer = Dense(output_vocab_size)

    with tf.name_scope("Training"):
        attn_mech = tf.contrib.seq2seq.BahdanauAttention(num_units=mem_units, memory=enc_rnn_out, normalize=True)
        attn_cell = tf.contrib.seq2seq.AttentionWrapper(cell=dec_cell, attention_mechanism=attn_mech)

        batch_size = tf.shape(enc_rnn_out)[0]
        initial_state = attn_cell.zero_state(batch_size=batch_size, dtype=tf.float32)
        initial_state = initial_state.clone(cell_state=enc_rnn_state)

        helper = tf.contrib.seq2seq.TrainingHelper(inputs=emb_x_y, sequence_length=seq_len)
        decoder = tf.contrib.seq2seq.BasicDecoder(cell=attn_cell, helper=helper, initial_state=initial_state, output_layer=out_layer)
        outputs, final_state, final_sequence_lengths = tf.contrib.seq2seq.dynamic_decode(decoder=decoder, impute_finished=True)

        training_logits = tf.identity(outputs.rnn_output)
        training_pred = tf.identity(outputs.sample_id)

    with tf.name_scope("Inference"):
        # Tile the encoder outputs, lengths and state for beam search.
        enc_rnn_out_beam = tf.contrib.seq2seq.tile_batch(enc_rnn_out, beam_width)
        seq_len_beam = tf.contrib.seq2seq.tile_batch(seq_len, beam_width)
        enc_rnn_state_beam = tf.contrib.seq2seq.tile_batch(enc_rnn_state, beam_width)

        batch_size_beam = tf.shape(enc_rnn_out_beam)[0]  # batch size is now beam_width times larger
        # start_tokens must have the original batch size, so divide by beam_width
        start_tokens = tf.tile(tf.constant([27], dtype=tf.int32), [batch_size_beam // beam_width])
        end_token = 0

        attn_mech_beam = tf.contrib.seq2seq.BahdanauAttention(num_units=mem_units, memory=enc_rnn_out_beam, normalize=True)
        cell_beam = tf.contrib.seq2seq.AttentionWrapper(cell=beam_cel, attention_mechanism=attn_mech_beam, attention_layer_size=mem_units)
        initial_state_beam = cell_beam.zero_state(batch_size=batch_size_beam, dtype=tf.float32).clone(cell_state=enc_rnn_state_beam)

        my_decoder = tf.contrib.seq2seq.BeamSearchDecoder(cell=cell_beam,
                                                          embedding=emb_out,
                                                          start_tokens=start_tokens,
                                                          end_token=end_token,
                                                          initial_state=initial_state_beam,
                                                          beam_width=beam_width,
                                                          output_layer=out_layer)
        beam_output, t1, t2 = tf.contrib.seq2seq.dynamic_decode(my_decoder,
                                                                maximum_iterations=maxlen)

        beam_logits = tf.no_op()
        beam_sample_id = beam_output.predicted_ids
Best Answer
I'm not sure what you mean by "I am not able to get results", but I'm assuming your model is not making use of what it learned during training.
If that is the case, then the first thing to understand is that this is all about variable sharing: the training and inference decoders must build their weights under the same variable scope. So get rid of the separate scopes for training and inference and use something like the following instead.
Remove this:

with tf.name_scope("Training"):

and use instead:

with tf.variable_scope("myScope"):

Likewise, remove this:

with tf.name_scope("Inference"):

and use instead:

with tf.variable_scope("myScope", reuse=True):

Also wrap the encoder tensors in the same scope (tile_batch with a multiplier of 1 leaves them unchanged):

with tf.variable_scope("myScope"):
    enc_rnn_out = tf.contrib.seq2seq.tile_batch(enc_rnn_out, 1)
    seq_len = tf.contrib.seq2seq.tile_batch(seq_len, 1)
    enc_rnn_state = tf.contrib.seq2seq.tile_batch(enc_rnn_state, 1)
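Putting this together, here is a minimal sketch of how the two decoders can share one scope. It assumes the tensors from the question (enc_rnn_out, enc_rnn_state, seq_len, emb_x_y, emb_out, dim, maxlen, output_vocab_size) already exist and that Dense is tensorflow.python.layers.core.Dense as in the original code; the decoder_cell helper is introduced here only for illustration.

mem_units = 2 * dim
beam_width = 3
dec_cell = tf.contrib.rnn.BasicLSTMCell(2 * dim)   # one cell object, used by both branches
out_layer = Dense(output_vocab_size)               # one projection layer, used by both branches

def decoder_cell(memory, memory_state):
    # Same construction for training and inference; only the memory differs.
    attn_mech = tf.contrib.seq2seq.BahdanauAttention(
        num_units=mem_units, memory=memory, normalize=True)
    cell = tf.contrib.seq2seq.AttentionWrapper(
        cell=dec_cell, attention_mechanism=attn_mech, attention_layer_size=mem_units)
    init = cell.zero_state(tf.shape(memory)[0], tf.float32).clone(cell_state=memory_state)
    return cell, init

with tf.variable_scope("myScope"):                 # training branch: variables are created here
    train_cell, train_init = decoder_cell(enc_rnn_out, enc_rnn_state)
    helper = tf.contrib.seq2seq.TrainingHelper(inputs=emb_x_y, sequence_length=seq_len)
    train_decoder = tf.contrib.seq2seq.BasicDecoder(
        cell=train_cell, helper=helper, initial_state=train_init, output_layer=out_layer)
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(train_decoder, impute_finished=True)
    training_logits = outputs.rnn_output
    training_pred = outputs.sample_id

with tf.variable_scope("myScope", reuse=True):     # inference branch: the same variables are reused
    enc_rnn_out_beam = tf.contrib.seq2seq.tile_batch(enc_rnn_out, beam_width)
    enc_rnn_state_beam = tf.contrib.seq2seq.tile_batch(enc_rnn_state, beam_width)
    beam_cell, beam_init = decoder_cell(enc_rnn_out_beam, enc_rnn_state_beam)
    start_tokens = tf.fill([tf.shape(enc_rnn_out)[0]], 27)   # one start token per original batch entry
    beam_decoder = tf.contrib.seq2seq.BeamSearchDecoder(
        cell=beam_cell, embedding=emb_out, start_tokens=start_tokens, end_token=0,
        initial_state=beam_init, beam_width=beam_width, output_layer=out_layer)
    beam_output, _, _ = tf.contrib.seq2seq.dynamic_decode(
        beam_decoder, maximum_iterations=maxlen)
    beam_sample_id = beam_output.predicted_ids     # shape [batch, time, beam_width]

The important point is that both branches construct the attention mechanism and wrapper with identical arguments, so the variables created in the training branch are exactly the ones looked up in the reuse=True branch.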
Regarding "tensorflow - Implementing attention with beam search in tensorflow", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46021216/