python - Tensorflow: How do I copy conv layer weights to another variable for use in reinforcement learning?

Reposted. Author: 行者123. Updated: 2023-11-28 20:57:00

I'm not sure whether this is possible in Tensorflow, and I'm worried I may have to switch to PyTorch.

Basically, I have this layer:

self.policy_conv1 = tf.layers.conv2d(inputs=self.policy_s, filters=16, kernel_size=(8,8),strides=(4,4), padding = 'valid',activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer, bias_initializer = tf.glorot_uniform_initializer)

I'm trying to copy it to another layer every 100 or so training iterations:

self.eval_conv1 = tf.layers.conv2d(inputs=self.s, filters=16, kernel_size=(8,8),strides=(4,4), padding = 'valid', activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer, bias_initializer = tf.glorot_uniform_initializer)

tf.assign doesn't seem to be the right tool, and the following doesn't seem to work:

self.policy_conv1 = tf.stop_gradient(tf.identity(self.eval_conv1))

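Essentially, I want to copy the eval conv layer into the policy conv layer without tying the two together every time the graph runs one variable or the other (which is what happens with the identity snippet above). If someone can point me to the code I need, I would appreciate it.

Best answer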

import numpy as np
import tensorflow as tf

# I'm using placeholders, but it'll work for other inputs as well
ph1 = tf.placeholder(tf.float32, [None, 32, 32, 3])
ph2 = tf.placeholder(tf.float32, [None, 32, 32, 3])

l1 = tf.layers.conv2d(inputs=ph1, filters=16, kernel_size=(8,8),strides=(4,4), padding = 'valid',activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer, bias_initializer = tf.glorot_uniform_initializer, name="layer_1")
l2 = tf.layers.conv2d(inputs=ph2, filters=16, kernel_size=(8,8),strides=(4,4), padding = 'valid',activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer, bias_initializer = tf.glorot_uniform_initializer, name="layer_2")

sess = tf.Session()
sess.run(tf.global_variables_initializer())

w1 = tf.get_default_graph().get_tensor_by_name("layer_1/kernel:0")
w2 = tf.get_default_graph().get_tensor_by_name("layer_2/kernel:0")

w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero

sess.run(tf.assign(w2, w1))
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # 0

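# This builds a new tensor from w1; it does not modify the layer_1 variable.
# w2 keeps the copied values, so the two are no longer equal.
w1 = w1 * 2 + 1
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero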

layer_1/bias:0 should get you the bias term.

Update:

I found a simpler way:
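A minimal sketch of copying both the kernel and the bias by tensor name, in the same way as the kernel copy above. To stay self-contained it creates the variables directly with tf.get_variable under the names layer_1 and layer_2 instead of calling tf.layers.conv2d, and it imports tensorflow.compat.v1 so it can also run on TF 2.x. Both of those choices are assumptions of this sketch, not part of the original answer.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API (assumed: TF 1.14+ or TF 2.x)

tf.disable_v2_behavior()
tf.reset_default_graph()

# Stand-ins for the "kernel" and "bias" variables that a conv layer named
# layer_1 / layer_2 would own (assumption: shapes chosen for illustration).
for layer_name in ("layer_1", "layer_2"):
    with tf.variable_scope(layer_name):
        tf.get_variable("kernel", shape=[8, 8, 3, 16],
                        initializer=tf.glorot_uniform_initializer())
        tf.get_variable("bias", shape=[16],
                        initializer=tf.random_uniform_initializer(-1.0, 1.0))

graph = tf.get_default_graph()
pairs = {}
copy_ops = []
for suffix in ("kernel", "bias"):
    src = graph.get_tensor_by_name("layer_1/%s:0" % suffix)
    dst = graph.get_tensor_by_name("layer_2/%s:0" % suffix)
    pairs[suffix] = (src, dst)
    # Build each assign op once; running it later copies layer_1 into layer_2.
    copy_ops.append(tf.assign(dst, src))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(copy_ops)
    kernels_equal = np.array_equal(*sess.run(list(pairs["kernel"])))
    biases_equal = np.array_equal(*sess.run(list(pairs["bias"])))

print(kernels_equal, biases_equal)
```

Building the assign ops once and reusing them avoids adding new nodes to the graph on every copy, which is what happens if tf.assign is called inside sess.run each time.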

update_weights = [tf.assign(new, old) for (new, old) in
                  zip(tf.trainable_variables('new_scope'), tf.trainable_variables('old_scope'))]

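Doing a sess.run on update_weights should copy the weights from one network to the other (here, from old_scope into new_scope). Just remember to build the two networks under separate name scopes.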
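The scope-based copy can be sketched end to end roughly as follows, including the "every 100 iterations" schedule from the question. The scope names policy_net and eval_net, the helper build_conv_vars, the copy direction (policy into eval), and the assign_add op that stands in for a real training step are all hypothetical. The sketch also sorts the variables by name before pairing them, so the pairing does not rely on both networks having been built in the same order.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API (assumed: TF 1.14+ or TF 2.x)

tf.disable_v2_behavior()
tf.reset_default_graph()

def build_conv_vars(scope):
    # Hypothetical helper: creates a conv kernel and bias under `scope`.
    with tf.variable_scope(scope):
        tf.get_variable("kernel", shape=[8, 8, 3, 16],
                        initializer=tf.glorot_uniform_initializer())
        tf.get_variable("bias", shape=[16],
                        initializer=tf.zeros_initializer())

build_conv_vars("policy_net")
build_conv_vars("eval_net")

def vars_by_name(scope):
    # Sort by name so source and destination variables line up one to one.
    return sorted(tf.trainable_variables(scope=scope), key=lambda v: v.name)

src_vars = vars_by_name("policy_net")
dst_vars = vars_by_name("eval_net")

# Built once, run many times: copies every policy_net variable into eval_net.
sync_op = tf.group(*[tf.assign(dst, src) for dst, src in zip(dst_vars, src_vars)])

# Stand-in for a training step: adds 1 to every policy_net variable.
fake_train_op = tf.group(*[tf.assign_add(v, tf.ones_like(v)) for v in src_vars])

def max_abs_diff(sess):
    src_vals, dst_vals = sess.run([src_vars, dst_vars])
    return max(float(np.max(np.abs(s - d))) for s, d in zip(src_vals, dst_vals))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1, 301):
        sess.run(fake_train_op)
        if step % 100 == 0:
            sess.run(sync_op)
    diff_after_sync = max_abs_diff(sess)   # networks match right after a sync
    sess.run(fake_train_op)
    diff_after_step = max_abs_diff(sess)   # only policy_net moved

print(diff_after_sync, diff_after_step)
```

Because the copy is a one-off assignment rather than a shared tensor, eval_net stays frozen between syncs while policy_net keeps changing, which is the decoupling the question asks for.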

Regarding "python - Tensorflow: How do I copy conv layer weights to another variable for use in reinforcement learning?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53965005/
