
tensorflow - Why must the dimension of the input to LSTMCell match the number of units?


While reading TensorFlow's rnn_cell.py, which implements LSTMCell, I saw the following:

  def __call__(self, inputs, state, scope=None):
    """Run one step of LSTM.

    Args:
      inputs: input Tensor, 2D, batch x num_units.
      state: if `state_is_tuple` is False, this must be a state Tensor,
        `2-D, batch x state_size`.  If `state_is_tuple` is True, this must be a
        tuple of state Tensors, both `2-D`, with column sizes `c_state` and
        `m_state`.
      scope: VariableScope for the created subgraph; defaults to "LSTMCell".

    Returns:
      A tuple containing:
      - A `2-D, [batch x output_dim]`, Tensor representing the output of the
        LSTM after reading `inputs` when previous state was `state`.
        Here output_dim is:
          num_proj if num_proj was set,
          num_units otherwise.
      - Tensor(s) representing the new state of LSTM after reading `inputs` when
        the previous state was `state`.  Same type and shape(s) as `state`.

    Raises:
      ValueError: If input size cannot be inferred from inputs via
        static shape inference.
    """
    num_proj = self._num_units if self._num_proj is None else self._num_proj

    if self._state_is_tuple:
      (c_prev, m_prev) = state
    else:

I would like to know why the dimension of the input must match the number of LSTM units (num_units). I would have expected them to be completely unrelated, but somehow they are not.

Does anyone know why?

Best Answer

It does not need to match the number of units of the cell (i.e., the hidden dimension).

First:

num_proj: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.

That is, num_proj is the dimensionality of the cell's output, and it does not have to match num_units (the hidden dimension). Typically, when decoding, we want the output to have the same dimensionality as the vocabulary, not the hidden dimension (num_units).

      if self._num_proj is not None:
        with vs.variable_scope("projection") as proj_scope:
          if self._num_proj_shards is not None:
            proj_scope.set_partitioner(
                partitioned_variables.fixed_size_partitioner(
                    self._num_proj_shards))
          m = _linear(m, self._num_proj, bias=False)

As you can see above, it simply projects/transforms the output (m) to num_proj dimensions via _linear. If num_proj is None, the output keeps the hidden dimension by default.
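For concreteness, here is a minimal sketch (assuming the TensorFlow 1.x tf.nn.rnn_cell / tf.nn.dynamic_rnn API; the names batch, time_steps and input_dim and all the sizes are made up for illustration) showing that the input width, num_units and num_proj can all be chosen independently:

import tensorflow as tf  # TF 1.x graph mode assumed

batch, time_steps, input_dim = 2, 5, 37   # input width chosen freely
num_units, num_proj = 64, 10              # hidden size and projected output size

inputs = tf.placeholder(tf.float32, [batch, time_steps, input_dim])
cell = tf.nn.rnn_cell.LSTMCell(num_units=num_units, num_proj=num_proj)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
# outputs has shape [batch, time_steps, num_proj]; with num_proj=None it
# would be [batch, time_steps, num_units]. Neither has to equal input_dim.

The _linear helper used for the projection above is defined as follows: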

def _linear(args, output_size, bias, bias_start=0.0, scope=None):
  """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

  Args:
    args: a 2D Tensor or a list of 2D, batch x n, Tensors.
    output_size: int, second dimension of W[i].
    bias: boolean, whether to add a bias term or not.
    bias_start: starting value to initialize the bias; 0 by default.
    scope: VariableScope for the created subgraph; defaults to "Linear".

  Returns:
    A 2D Tensor with shape [batch x output_size] equal to
    sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

  Raises:
    ValueError: if some of the arguments has unspecified or wrong shape.
  """
  if args is None or (isinstance(args, (list, tuple)) and not args):
    raise ValueError("`args` must be specified")
  if not isinstance(args, (list, tuple)):
    args = [args]

  # Calculate the total size of arguments on dimension 1.
  total_arg_size = 0
  shapes = [a.get_shape().as_list() for a in args]
  for shape in shapes:
    if len(shape) != 2:
      raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
    if not shape[1]:
      raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
    else:
      total_arg_size += shape[1]

  # Now the computation.
  with tf.variable_scope(scope or "Linear"):
    matrix = tf.get_variable("Matrix", [total_arg_size, output_size])
    if len(args) == 1:
      res = tf.matmul(args[0], matrix)
    else:
      res = tf.matmul(tf.concat(axis=1, values=args), matrix)
    if not bias:
      return res
    bias_term = tf.get_variable(
        "Bias", [output_size], initializer=tf.constant_initializer(bias_start))
    return res + bias_term
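Note how _linear infers the input width from the tensor's static shape; only output_size fixes the width of the newly created weight matrix. A small usage sketch (again assuming TF 1.x graph mode; the scope name "demo_linear" and the sizes are hypothetical):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [4, 13])  # batch=4, input width=13
y = _linear(x, output_size=7, bias=True, scope="demo_linear")

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  out = sess.run(y, {x: np.random.randn(4, 13)})
  print(out.shape)  # (4, 7): set by output_size, not by any num_units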

Regarding "tensorflow - Why must the dimension of the input to LSTMCell match the number of units", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42691762/
