
tensorflow - Multiple gradient descent steps in a single TensorFlow sess.run()


I want to perform multiple gradient descent steps per sess.run() call. The input to every call is fixed, so I only need to feed it once.

How can I do this? I have an idea, but I'm not sure whether it actually recomputes the gradients at every step (rather than applying the first step's gradient N times). I want to avoid calling tf.gradients() multiple times. Is including grads_and_vars in the dependencies enough?

N = 5
fit_op_i = fit_op_0 = optimizer.apply_gradients(grads_and_vars)
for i in range(N):
    with tf.control_dependencies([fit_op_i]):
        fit_op_i = optimizer.apply_gradients(grads_and_vars)
fit_op_N = fit_op_i

The answer to a related question needs multiple sess.run() calls: Run train op multiple times in tensorflow

Best Answer

To achieve this, we can simply define a sequence of distinct forward-backprop passes with explicit dependencies between the operations, then tf.group them together [1] so that they execute in a single session run.
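This also answers the question above: reusing the original grads_and_vars is not enough, because those tensors hold the first step's gradient values; each step needs its own loss and compute_gradients() call created inside the control dependency. Below is a minimal self-contained sketch of that pattern on a toy quadratic loss (my illustration, not from the post); if the chained ops recompute the gradients as described, it prints roughly 5.12 (= 10 * 0.8^3) rather than the 4.0 you would get from applying the first gradient three times:

import tensorflow as tf

tf.reset_default_graph()
w = tf.Variable(10.0)
optimizer = tf.train.GradientDescentOptimizer(0.1)  # each step: w <- w - 0.1 * 2w = 0.8w

# First step: loss and gradients built normally.
step = optimizer.apply_gradients(optimizer.compute_gradients(tf.square(w)))

# Chain two more steps; each rebuilds the loss and recomputes the gradients
# *inside* the control dependency, so they run after the previous update.
for _ in range(2):
    with tf.control_dependencies([step]):
        loss = tf.square(w)  # fresh forward ops
        step = optimizer.apply_gradients(optimizer.compute_gradients(loss))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(step)      # one run, three chained updates
    print(sess.run(w))  # expect ~5.12 (recomputed), not 4.0 (same gradient thrice)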

My example defines a perceptron layer fitting 50 two-dimensional Gaussian blobs. The code produces the following graph in TensorBoard: (image: TensorBoard graph of the chained loss/optimizer steps)

To test correctness, I trained twice from the same initial values: first with single forward-backprop steps, then with 3 steps grouped into one op:

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(12):
        loss_val = loss_op.eval(feed_dict={x: x_train, y: y_train})
        print(i, '-->', "{0:.3f}".format(loss_val))
        _ = sess.run(train_op, feed_dict={x: x_train, y: y_train})
        # Single-step variant, swapped in for the comparison run:
        # loss_val = loss_op.eval(feed_dict={x: x_train, y: y_train})
        # print(i, '-->', "{0:.3f}".format(loss_val))
        # _ = sess.run(applied_grads, feed_dict={x: x_train, y: y_train})
# 3-steps # 1-step
# 0 --> 0.693 # 0 --> 0.693 ---
# 1 --> 0.665 # 1 --> 0.683
# 2 --> 0.638 # 2 --> 0.674
# 3 --> 0.613 # 3 --> 0.665 ---
# 4 --> 0.589 # 4 --> 0.656
# 5 --> 0.567 # 5 --> 0.647
# 6 --> 0.547 # 6 --> 0.638 ---
# 7 --> 0.527 # 7 --> 0.630
# 8 --> 0.509 # 8 --> 0.622
# 9 --> 0.492 # 9 --> 0.613 ---
# ...

Each grouped run clearly corresponds to 3 single steps: for example, the 3-step loss after iteration 1 (0.665) matches the 1-step loss after iteration 3. Full example:

from sklearn.datasets import make_blobs
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

times_to_apply = 3  # number of steps to perform

with tf.name_scope('x'):
    x = tf.placeholder(tf.float32, shape=(None, 2))
with tf.name_scope('y'):
    y = tf.placeholder(tf.int32, shape=(50,))

logits = tf.layers.dense(inputs=x,
                         units=2,
                         name='NN',
                         kernel_initializer=tf.initializers.ones,
                         bias_initializer=tf.initializers.zeros)

optimizer = tf.train.GradientDescentOptimizer(0.01)

# First forward-backprop pass.
with tf.name_scope('loss-step-1'):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y,
                                                              logits=logits)
    loss_op = tf.reduce_mean(xentropy)

with tf.name_scope('optimizer-step-1'):
    grads_and_vars = optimizer.compute_gradients(loss_op)
    applied_grads = optimizer.apply_gradients(grads_and_vars)

all_grads_and_vars = [grads_and_vars]
all_applied_grads = [applied_grads]
all_loss_ops = [loss_op]

# Each subsequent pass rebuilds the loss and recomputes the gradients; the
# control dependencies force it to run only after the previous update.
for i in range(times_to_apply - 1):
    with tf.control_dependencies([all_applied_grads[-1]]):
        with tf.name_scope('loss-step-' + str(i + 2)):
            xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y,
                                                                      logits=logits)
            all_loss_ops.append(tf.reduce_mean(xentropy))
        with tf.control_dependencies([all_loss_ops[-1]]):
            with tf.name_scope('optimizer-step-' + str(i + 2)):
                all_grads_and_vars.append(optimizer.compute_gradients(all_loss_ops[-1]))
                all_applied_grads.append(optimizer.apply_gradients(all_grads_and_vars[-1]))

train_op = tf.group(all_applied_grads)
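Note that the example imports make_blobs but, as excerpted here, never shows the line that creates the x_train and y_train fed in the training loop above. A plausible reconstruction, assuming only what the post states (50 samples, 2-D, two classes); any other parameters are a guess:

# Hypothetical reconstruction: only n_samples=50, 2 features and 2 classes
# are stated in the post; the remaining make_blobs defaults are assumptions.
x_train, y_train = make_blobs(n_samples=50, n_features=2, centers=2)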

[1] @eqzx is exactly right: grouping the operations is not actually necessary. Executing just the final optimizer step, with its explicitly defined dependencies, achieves the same effect.
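In code terms (using the same names as the full example above), the footnote amounts to replacing the tf.group line with the last step of the chain, whose control dependencies already pull in every earlier update:

# Alternative to tf.group: the final apply op transitively depends on
# all earlier steps, so running it alone performs all of them.
train_op = all_applied_grads[-1]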

Original question on Stack Overflow: Multiple gradient descent steps in a single tensorflow sess.run(): https://stackoverflow.com/questions/45805977/
