python - Why is embedding_lookup used only as an encoder and not as a decoder in ptb_word_lm.py


While going through TensorFlow's official example code ptb_word_lm.py, I have a question about the embedding_lookup node.

I found that it is only used on the input side; the output does not use it, so it seems the loss evaluation cannot benefit from the embedding. What is the benefit of using embedding_lookup here, then? If I want the word embedding to be trained by the optimizer, don't I have to connect it to the loss function explicitly?

The source code is as follows:

self._input = input_

batch_size = input_.batch_size
num_steps = input_.num_steps
size = config.hidden_size
vocab_size = config.vocab_size

def lstm_cell():
  # With the latest TensorFlow source code (as of Mar 27, 2017),
  # the BasicLSTMCell will need a reuse parameter which is unfortunately not
  # defined in TensorFlow 1.0. To maintain backwards compatibility, we add
  # an argument check here:
  if 'reuse' in inspect.getargspec(
      tf.contrib.rnn.BasicLSTMCell.__init__).args:
    return tf.contrib.rnn.BasicLSTMCell(
        size, forget_bias=0.0, state_is_tuple=True,
        reuse=tf.get_variable_scope().reuse)
  else:
    return tf.contrib.rnn.BasicLSTMCell(
        size, forget_bias=0.0, state_is_tuple=True)
attn_cell = lstm_cell
if is_training and config.keep_prob < 1:
  def attn_cell():
    return tf.contrib.rnn.DropoutWrapper(
        lstm_cell(), output_keep_prob=config.keep_prob)
cell = tf.contrib.rnn.MultiRNNCell(
    [attn_cell() for _ in range(config.num_layers)], state_is_tuple=True)

self._initial_state = cell.zero_state(batch_size, data_type())

with tf.device("/cpu:0"):
  embedding = tf.get_variable(
      "embedding", [vocab_size, size], dtype=data_type())
  inputs = tf.nn.embedding_lookup(embedding, input_.input_data)  # only use embeddings here

if is_training and config.keep_prob < 1:
  inputs = tf.nn.dropout(inputs, config.keep_prob)

outputs = []
state = self._initial_state
with tf.variable_scope("RNN"):
  for time_step in range(num_steps):
    if time_step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, time_step, :], state)
    outputs.append(cell_output)

output = tf.reshape(tf.stack(axis=1, values=outputs), [-1, size])
softmax_w = tf.get_variable(
    "softmax_w", [size, vocab_size], dtype=data_type())
softmax_b = tf.get_variable("softmax_b", [vocab_size], dtype=data_type())
logits = tf.matmul(output, softmax_w) + softmax_b
loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
    [logits],
    [tf.reshape(input_.targets, [-1])],
    [tf.ones([batch_size * num_steps], dtype=data_type())])
self._cost = cost = tf.reduce_sum(loss) / batch_size
self._final_state = state

if not is_training:
  return

self._lr = tf.Variable(0.0, trainable=False)
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),
                                  config.max_grad_norm)
optimizer = tf.train.GradientDescentOptimizer(self._lr)
self._train_op = optimizer.apply_gradients(
    zip(grads, tvars),
    global_step=tf.contrib.framework.get_or_create_global_step())

self._new_lr = tf.placeholder(
    tf.float32, shape=[], name="new_learning_rate")
self._lr_update = tf.assign(self._lr, self._new_lr)

Accepted answer

The output does in fact use the embedding lookup. TensorFlow programs are typically split into a construction phase, which assembles the graph, and an execution phase, which uses a session to execute the ops in the graph.
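
This two-phase split is easy to see in a minimal sketch (assuming TensorFlow 1.x, the API used in the question's code):

import tensorflow as tf

# Construction phase: these calls only add nodes to the default graph;
# nothing is computed yet.
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b

# Execution phase: running `c` in a session forces evaluation of every
# node it depends on, here `a` and `b`.
with tf.Session() as sess:
    print(sess.run(c))  # 6.0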

In your case, to compute the loss you have to evaluate the following nodes in the graph, in this order:

loss -> logits -> output -> outputs -> cell_output -> cell -> inputs -> embedding_lookup

Another way to look at it is as nested function calls:

loss(logits(output(outputs(cell_output(cell(inputs(embedding_lookup(embedding))))))))

I omitted the extra arguments of each function (op) to make this clearer.
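
You can check concretely that the loss depends on the embedding by asking TensorFlow for the gradient of a loss with respect to the embedding matrix. The following is a minimal sketch, not the PTB model itself: the vocabulary size, lookup ids, and toy loss are invented for illustration, but tf.nn.embedding_lookup, tf.gradients, and GradientDescentOptimizer are the same TF 1.x ops used in ptb_word_lm.py:

import tensorflow as tf

embedding = tf.get_variable("embedding", [10, 4])  # toy vocab_size=10, size=4
ids = tf.constant([1, 2, 3])
inputs = tf.nn.embedding_lookup(embedding, ids)    # selects rows 1, 2, 3
loss = tf.reduce_sum(tf.square(inputs))            # any loss reachable from `inputs`

# Not None: `loss` depends on `embedding` through the lookup, so the
# optimizer updates the embedding rows that were actually used.
grad = tf.gradients(loss, [embedding])[0]
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(embedding)
    sess.run(train_op)
    after = sess.run(embedding)  # rows 1-3 now differ from `before`

The same thing happens automatically in the PTB code: embedding is created with tf.get_variable, so it appears in tf.trainable_variables(), and tf.gradients(cost, tvars) includes its gradient without any explicit wiring to the loss function.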

Regarding python - why embedding_lookup is used only as an encoder and not as a decoder in ptb_word_lm.py, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48038889/
