
tensorflow - Using TensorBoard with a DQN algorithm


For reinforcement learning, I read that TensorBoard is not ideal because it logs on every episode and/or step. Since reinforcement learning runs for thousands of steps, logging every one does not give a useful overview. I found this modified TensorBoard class here: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/

The class:

import os
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard


class ModifiedTensorBoard(TensorBoard):

    # Overriding init to set initial step and writer (we want one log file for all .fit() calls)
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.step = 1
        self.writer = tf.summary.create_file_writer(self.log_dir)
        self._log_write_dir = os.path.join(self.log_dir, name)

    # Overriding this method to stop creating default log writer
    def set_model(self, model):
        pass

    # Overridden, saves logs with our step number
    # (otherwise every .fit() will start writing from 0th step)
    def on_epoch_end(self, epoch, logs=None):
        self.update_stats(**logs)

    # Overridden
    # We train for one batch only, no need to save anything at epoch end
    def on_batch_end(self, batch, logs=None):
        pass

    # Overridden, so won't close writer
    def on_train_end(self, _):
        pass

    def on_train_batch_end(self, batch, logs=None):
        pass

    # Custom method for saving own metrics
    # Creates writer, writes custom metrics and closes writer
    def update_stats(self, **stats):
        self._write_logs(stats, self.step)

    def _write_logs(self, logs, index):
        with self.writer.as_default():
            for name, value in logs.items():
                tf.summary.scalar(name, value, step=index)
            self.step += 1
            self.writer.flush()
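For reference, here is roughly how the tutorial wires this callback into its training loop. This is only a sketch: MODEL_NAME, EPISODES, AGGREGATE_STATS_EVERY, X, y and the reward variables are placeholders from that tutorial, not my code.

import time

# Sketch only: the names below are placeholders and not defined in this post.
tensorboard = ModifiedTensorBoard(name=MODEL_NAME,
                                  log_dir=f"logs/{MODEL_NAME}-{int(time.time())}")

for episode in range(EPISODES):
    # keep the callback's step counter in sync so all .fit() calls log to one file
    tensorboard.step = episode

    # ... play the episode, fill the replay memory, then train:
    model.fit(X, y, batch_size=64, verbose=0, shuffle=False,
              callbacks=[tensorboard])

    # write custom aggregated metrics every few episodes
    if episode % AGGREGATE_STATS_EVERY == 0:
        tensorboard.update_stats(reward_avg=average_reward,
                                 reward_min=min_reward,
                                 reward_max=max_reward,
                                 epsilon=epsilon)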

I want to get it working with this model:

n_actions = env.action_space.n
input_dim = env.observation_space.n
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(20, input_dim=input_dim, activation='relu'))  # 32
model.add(tf.keras.layers.Dense(10, activation='relu'))  # 10
model.add(tf.keras.layers.Dense(n_actions, activation='linear'))
model.compile(optimizer=tf.keras.optimizers.Adam(), loss='mse')

But I haven't gotten it to work yet. Does anyone who has used TensorBoard before know how to set this up? Any insight is much appreciated.

Best answer

I use TensorBoard all the time while training RL algorithms, without modifying any code as above. Just create your writer:

writer = tf.summary.create_file_writer(logdir=log_folder)

and run your code inside its context:

with writer.as_default():
    ...  # do everything indented inside this block

For example, if you want to save the reward or the weights of the first layer to TensorBoard every 100 steps, just do the following:

if step % 100 == 0:
    tf.summary.scalar(name="reward", data=reward, step=step)
    dqn_variable = model.trainable_variables
    tf.summary.histogram(name="dqn_variables", data=tf.convert_to_tensor(dqn_variable[0]), step=step)
    writer.flush()
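Putting it together, here is a minimal end-to-end sketch of this approach; env, choose_action() and train_step() are placeholders standing in for your own environment and DQN update code, not part of the original answer:

import tensorflow as tf

# Sketch only: env, choose_action() and train_step() are assumed to exist.
writer = tf.summary.create_file_writer(logdir="logs/dqn_run")

with writer.as_default():
    step = 0
    for episode in range(1000):
        state = env.reset()
        done = False
        episode_reward = 0.0
        while not done:
            action = choose_action(model, state)          # e.g. epsilon-greedy
            state, reward, done, info = env.step(action)
            episode_reward += reward
            train_step(model, ...)                        # your DQN update
            step += 1
            if step % 100 == 0:
                # scalars are cheap; histograms are heavier, so log them sparsely
                tf.summary.scalar(name="reward", data=reward, step=step)
                tf.summary.histogram(name="dqn_variables",
                                     data=tf.convert_to_tensor(model.trainable_variables[0]),
                                     step=step)
                writer.flush()
        # one scalar per episode gives the per-episode overview the question asks about
        tf.summary.scalar(name="episode_reward", data=episode_reward, step=episode)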

That should do the trick :)

Regarding "tensorflow - Using TensorBoard with a DQN algorithm", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63408505/
