
python-3.x - TensorFlow batch normalization after restore


Suppose we create a small network:

import os
import shutil
import numpy as np
import tensorflow as tf

tf.reset_default_graph()
layers = [5, 3, 1]
activations = [tf.tanh, tf.tanh, None]

inp = tf.placeholder(dtype=tf.float32, shape=(None, 2), name='inp')
out = tf.placeholder(dtype=tf.float32, shape=(None, 1), name='out')

isTraining = tf.placeholder(dtype=tf.bool, shape=(), name='isTraining')

N = inp * 1  # I am lazy
for i, (l, a) in enumerate(zip(layers, activations)):
    N = tf.layers.dense(N, l, None)
    # N = tf.layers.batch_normalization(N, training=isTraining)  # comment this line
    if a is not None:
        N = a(N)

err = tf.reduce_mean((N - out)**2)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    opt = tf.train.AdamOptimizer(0.05).minimize(err)

# include the batch-normalization variables in the list of saved variables
tVars = tf.trainable_variables()
graph = tf.get_default_graph()
for v in graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES):
    if all([
            ('batch_normalization' in v.name),
            ('optimizer' not in v.name),
            v not in tVars]):
        tVars.append(v)

init = tf.global_variables_initializer()
saver = tf.train.Saver(var_list=tVars)

This is a simple NN set up for optimization. The only thing I am interested in at the moment is batch normalization (the commented-out line). Now, if we train this network, save it, restore it, and compute the error again, we do fine:
# Generate random data
N = 1000
X = np.random.rand(N, 2)
y = 2*X[:, 0] + 3*X[:, 1] + 3
y = y.reshape(-1, 1)

# Run the session and save it
with tf.Session() as sess:
    sess.run(init)
    print('During Training')
    for i in range(3000):
        _, errVal = sess.run([opt, err], feed_dict={inp: X, out: y, isTraining: True})
        if i % 500 == 0:
            print(errVal)

    shutil.rmtree('models1', ignore_errors=True)
    os.makedirs('models1')
    path = saver.save(sess, 'models1/model.ckpt')

# restore the session
print('During testing')
with tf.Session() as sess:
    saver.restore(sess, path)
    errVal = sess.run(err, feed_dict={inp: X, out: y, isTraining: False})
    print(errVal)

Here is the output:
During Training
24.4422
0.00330666
0.000314223
0.000106421
6.00441e-05
4.95262e-05
During testing
INFO:tensorflow:Restoring parameters from models1/model.ckpt
5.5899e-05

On the other hand, when we uncomment the batch-normalization line and repeat the calculation above:
During Training
31.7372
1.92066e-05
3.87879e-06
2.55274e-06
1.25418e-06
1.43078e-06
During testing
INFO:tensorflow:Restoring parameters from models1/model.ckpt
0.041519

As you can see, the restored value is far from what the model predicted. Am I doing something wrong?

Note: I know that for batch normalization I would need to generate mini-batches. I have skipped all of that to keep the code simple yet complete.

Best Answer

The batch-normalization layer, as defined in TensorFlow, needs access to the placeholder isTraining (https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization). Make sure to include it when you define the layer: tf.layers.batch_normalization(..., training=isTraining, ...).
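Concretely, the recipe looks like this (a minimal sketch with made-up sizes and names; only the training= argument and the UPDATE_OPS dependency matter):

import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder(tf.float32, shape=(None, 2), name='x')
isTraining = tf.placeholder(tf.bool, shape=(), name='isTraining')

# the placeholder switches the layer between batch statistics (training)
# and the learned moving statistics (inference)
h = tf.layers.dense(x, 5)
h = tf.layers.batch_normalization(h, training=isTraining)

loss = tf.reduce_mean(h ** 2)
# the ops that update moving_mean / moving_variance live in UPDATE_OPS;
# the control dependency forces them to run at every training step
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(0.05).minimize(loss)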

The reason is that batch-normalization layers have 2 trainable parameters (beta and gamma) that are trained normally along with the rest of the network, but they also have 2 extra parameters (the batch mean and variance) that need to be told to train. You do that simply by applying the recipe above.
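You can see these four per-layer variables directly (a sketch; the names follow the tf.layers.batch_normalization defaults):

# beta and gamma are trained by the optimizer; moving_mean and
# moving_variance are only updated through the UPDATE_OPS collection
trainable_names = set(v.name for v in tf.trainable_variables())
for v in tf.global_variables():
    if 'batch_normalization' in v.name:
        print(v.name, v.name in trainable_names)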

Right now, your code does not seem to be training the mean and variance. Instead, they stay randomly fixed and the network is optimized with those values. Later, when you save and restore, they get re-initialized with different values, so the network does not perform as it used to.
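One simple way to make sure the moving statistics always survive a save/restore cycle (a sketch; the question's hand-filtered tVars list is aiming at the same thing) is to save every global variable:

# saving all global variables guarantees that moving_mean / moving_variance
# are written to, and read back from, the checkpoint
saver = tf.train.Saver(var_list=tf.global_variables())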

Regarding python-3.x - TensorFlow batch normalization after restore, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51134326/
