machine-learning - Is my code using batch normalization layers in TensorFlow correct?


I have two inputs, qi_pos and qi_neg, with the same shape. They should be processed by two MLP layers, finally producing two results as scores. Here is my code:

self.mlp1_pos     = nn_layers.full_connect_(qi_pos,        256, activation='relu', use_bn=None, keep_prob=self.keep_prob, name='deep_mlp_1')
self.mlp2_pos     = nn_layers.full_connect_(self.mlp1_pos, 128, activation='relu', use_bn=True, keep_prob=self.keep_prob, name='deep_mlp_2')
self.pos_pair_sim = nn_layers.full_connect_(self.mlp2_pos, 1,   activation=None,   use_bn=True, keep_prob=self.keep_prob, name='deep_mlp_3')
tf.get_variable_scope().reuse_variables()
self.mlp1_neg     = nn_layers.full_connect_(qi_neg,        256, activation='relu', use_bn=None, keep_prob=self.keep_prob, name='deep_mlp_1')
self.mlp2_neg     = nn_layers.full_connect_(self.mlp1_neg, 128, activation='relu', use_bn=True, keep_prob=self.keep_prob, name='deep_mlp_2')
self.neg_pair_sim = nn_layers.full_connect_(self.mlp2_neg, 1,   activation=None,   use_bn=True, keep_prob=self.keep_prob, name='deep_mlp_3')

I use BN layers to normalize the nodes in the hidden layers:

def full_connect_(inputs, num_units, activation=None, use_bn=None, keep_prob=1.0, name='full_connect_'):
    with tf.variable_scope(name):
        shape = [inputs.get_shape()[-1], num_units]
        weight = weight_variable(shape)
        bias = bias_variable(shape[-1])
        outputs_ = tf.matmul(inputs, weight) + bias
        # Batch-normalize the pre-activation values
        if use_bn:
            outputs_ = tf.contrib.layers.batch_norm(outputs_, center=True, scale=True, is_training=True,
                                                    decay=0.9, epsilon=1e-5, scope='bn')
        if activation == "relu":
            outputs = tf.nn.relu(outputs_)
        elif activation == "tanh":
            outputs = tf.tanh(outputs_)
        elif activation == "sigmoid":
            outputs = tf.nn.sigmoid(outputs_)
        else:
            outputs = outputs_
        return outputs
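One operational detail of tf.contrib.layers.batch_norm worth double-checking (not part of the original question): by default it registers its moving-average updates in tf.GraphKeys.UPDATE_OPS, so the train op usually needs a control dependency on those updates, and is_training would usually be fed as a placeholder rather than hardcoded to True, so that inference uses the moving averages. A minimal sketch, where optimizer and loss are assumed to exist elsewhere in the model:

# Sketch only: `is_training` as a bool placeholder would be passed to
# batch_norm instead of the hardcoded True, so the same graph can use
# batch statistics for training and moving averages at inference.
is_training = tf.placeholder(tf.bool, name='is_training')

# tf.contrib.layers.batch_norm places its moving-average update ops in
# tf.GraphKeys.UPDATE_OPS by default; run them alongside the train step.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

The predictions are then computed from the two branch scores: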

with tf.name_scope('predictions'):
    self.sim_diff = self.pos_pair_sim - self.neg_pair_sim  # shape = (batch_size, 1)
    self.preds = tf.sigmoid(self.sim_diff)                 # shape = (batch_size, 1)
    self.infers = self.pos_pair_sim

Below is the definition of the loss, which looks fine to me.

with tf.name_scope('predictions'):
    sim_diff = pos_pair_sim - neg_pair_sim
    predictions = tf.sigmoid(sim_diff)
    self.infers = pos_pair_sim
## loss and optim
with tf.name_scope('loss'):
    self.loss = nn_layers.cross_entropy_loss_with_reg(self.labels, self.preds)
    tf.summary.scalar('loss', self.loss)
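cross_entropy_loss_with_reg comes from the asker's own nn_layers module and its body is not shown in the question. Purely as a hypothetical sketch of what a cross-entropy loss with regularization of this shape might look like (the name-based weight filter and reg_lambda are assumptions, not the real helper):

def cross_entropy_loss_with_reg(labels, preds, reg_lambda=1e-4):
    # Hypothetical sketch; the question does not show the real helper.
    eps = 1e-8  # avoid log(0)
    # Binary cross entropy over the sigmoid of the score difference.
    ce = -tf.reduce_mean(labels * tf.log(preds + eps) +
                         (1.0 - labels) * tf.log(1.0 - preds + eps))
    # Assumed form of the "reg" part: L2 penalty over the weight variables.
    l2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                   if 'weight' in v.name])
    return ce + reg_lambda * l2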

I am not sure whether I am using the BN layers in the right way. What I mean is that the BN parameters are derived from the hidden units of two separate parts, which take the qi_pos and qi_neg tensors as their inputs. Anyway, could anybody help check it?
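For reference, since tf.get_variable_scope().reuse_variables() is called between the two branches, both branches should resolve to the same weight and BN variables (beta/gamma and the moving statistics). A quick sanity check that can be run after building the graph:

# Each 'deep_mlp_*' layer (and its nested 'bn' scope) should appear
# exactly once here, confirming the pos and neg branches share both
# their weights and their BN parameters/statistics.
for v in tf.global_variables():
    if 'deep_mlp' in v.name:
        print(v.name, v.shape)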

Best Answer

Your code looks fine to me; there is no problem with applying BN in different branches of the network. But I would like to mention a few notes here:

  • The BN hyperparameters are pretty standard, so I usually would not set decay, epsilon, and renorm_decay manually. That does not mean you cannot change them; it is simply unnecessary in most cases.

  • You apply BN before the activation function; however, there is evidence that it may work better if applied after the activation. See, for example, this discussion. Once again, this does not mean it is a bug, it is just one more architecture to consider (see the sketch after this list).
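As an illustration of the second note, here is a sketch of a variant of the asker's full_connect_ with BN moved after the activation; the signature and the weight_variable/bias_variable helpers are kept as in the question, and whether this ordering actually helps is model-dependent:

def full_connect_bn_after(inputs, num_units, activation=None, use_bn=None,
                          keep_prob=1.0, name='full_connect_bn_after'):
    # Variant sketch: identical to full_connect_ except that BN follows
    # the activation instead of preceding it.
    with tf.variable_scope(name):
        shape = [inputs.get_shape()[-1], num_units]
        weight = weight_variable(shape)
        bias = bias_variable(shape[-1])
        outputs = tf.matmul(inputs, weight) + bias
        if activation == "relu":
            outputs = tf.nn.relu(outputs)
        elif activation == "tanh":
            outputs = tf.tanh(outputs)
        elif activation == "sigmoid":
            outputs = tf.nn.sigmoid(outputs)
        if use_bn:
            outputs = tf.contrib.layers.batch_norm(outputs, center=True, scale=True,
                                                   is_training=True, decay=0.9,
                                                   epsilon=1e-5, scope='bn')
        return outputs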

Regarding machine-learning - Is my code using batch normalization layers in TensorFlow correct?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47171249/
