
python - Computing the gradient norm with respect to the weights using Keras

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:31:56

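I am trying to compute the norm of the gradient with respect to the weights of a neural network using Keras (as a diagnostic tool). Eventually I want to create a callback for this, but on the way there I have been working on a function that computes the gradient and returns actual values as a NumPy array or scalar (rather than just a TensorFlow tensor). The code is as follows: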

import numpy as np
import keras.backend as K
from keras.layers import Dense
from keras.models import Sequential


def get_gradient_norm_func(model):
    grads = K.gradients(model.total_loss, model.trainable_weights)
    summed_squares = [K.sum(K.square(g)) for g in grads]
    norm = K.sqrt(sum(summed_squares))
    func = K.function([model.input], [norm])
    return func


def main():
    x = np.random.random((128,)).reshape((-1, 1))
    y = 2 * x
    model = Sequential(layers=[Dense(2, input_shape=(1,)),
                               Dense(1)])
    model.compile(loss='mse', optimizer='RMSprop')
    get_gradient = get_gradient_norm_func(model)
    history = model.fit(x, y, epochs=1)
    print(get_gradient([x]))

if __name__ == '__main__':
    main()

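The code fails when get_gradient() is called. The traceback is long and contains a lot of information about shapes, but very little about what the correct shapes are. How can I fix this?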

Ideally I would like a backend-agnostic solution, but a TensorFlow-based solution is also an option.

2017-08-15 15:39:14.914388: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions
2017-08-15 15:39:14.914414: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions
[[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-08-15 15:39:14.915026: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions
2017-08-15 15:39:14.915038: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions
[[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-08-15 15:39:14.915310: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions
2017-08-15 15:39:14.915321: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions
[[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Traceback (most recent call last):
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/contextlib.py", line 89, in __exit__
next(self.gen)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions
[[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "gradientlog.py", line 45, in <module>
main()
File "gradientlog.py", line 42, in main
print(get_gradient([x]))
File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 2251, in __call__
**self.session_kwargs)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions
[[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'dense_2_sample_weights', defined at:
File "gradientlog.py", line 45, in <module>
main()
File "gradientlog.py", line 39, in main
model.compile(loss='mse', optimizer='RMSprop')
File "/home/josteb/sandbox/keras/keras/models.py", line 783, in compile
**kwargs)
File "/home/josteb/sandbox/keras/keras/engine/training.py", line 799, in compile
name=name + '_sample_weights'))
File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 435, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
name=name)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Shape [-1] has negative dimensions
[[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Best answer

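Several placeholders in Keras are involved in the gradient computation:

  1. The input x
  2. The target y
  3. Sample weights: even if you do not pass them to model.fit(), Keras still creates a placeholder for sample weights and feeds np.ones((y.shape[0],), dtype=K.floatx()) into the graph during training.
  4. Learning phase: this placeholder is connected to the gradient tensor only if some layer uses it (for example, Dropout).

So, in the example you provided, you need to feed x, y, and sample_weights into the graph in order to compute the gradient. That is the root cause of the error.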
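The list above can be sketched as a small NumPy helper that assembles the feed values in the order the placeholders are listed. The name build_feed and its parameters are hypothetical and not part of Keras, and float32 is assumed here as the value of K.floatx(), which is its usual default.

```python
import numpy as np


def build_feed(x, y, sample_weight=None, learning_phase=None):
    """Assemble feed values: inputs, targets, sample weights, [learning phase].

    Hypothetical helper for illustration; it is not a Keras function.
    """
    x = np.asarray(x, dtype="float32")  # assumes K.floatx() == "float32"
    y = np.asarray(y, dtype="float32")
    if sample_weight is None:
        # Mirror the all-ones default described above: one weight per sample.
        sample_weight = np.ones((y.shape[0],), dtype="float32")
    sample_weight = np.asarray(sample_weight, dtype="float32")
    if sample_weight.shape != (y.shape[0],):
        raise ValueError("sample_weight must have shape (n_samples,)")
    feed = [x, y, sample_weight]
    if learning_phase is not None:
        # Only relevant when a layer such as Dropout uses the learning phase.
        feed.append(int(learning_phase))
    return feed


x = np.random.random((128, 1))
y = 2 * x
feed = build_feed(x, y)
print([np.shape(v) for v in feed])  # [(128, 1), (128, 1), (128,)]
```

Passing a list like this to a K.function built over the matching placeholders gives every placeholder a value, which is what the call get_gradient([x]) in the question was missing.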

Model._make_train_function() contains the following lines, which show how to build the necessary inputs to K.function() in this case:

inputs = self._feed_inputs + self._feed_targets + self._feed_sample_weights
if self.uses_learning_phase and not isinstance(K.learning_phase(), int):
    inputs += [K.learning_phase()]

with K.name_scope('training'):
    ...
    self.train_function = K.function(inputs,
                                     [self.total_loss] + self.metrics_tensors,
                                     updates=updates,
                                     name='train_function',
                                     **self._function_kwargs)

By mimicking this function, you should be able to get the norm value:

def get_gradient_norm_func(model):
    grads = K.gradients(model.total_loss, model.trainable_weights)
    summed_squares = [K.sum(K.square(g)) for g in grads]
    norm = K.sqrt(sum(summed_squares))
    # Feed every placeholder the loss depends on: inputs, targets, sample weights.
    inputs = model.model._feed_inputs + model.model._feed_targets + model.model._feed_sample_weights
    func = K.function(inputs, [norm])
    return func

def main():
    x = np.random.random((128,)).reshape((-1, 1))
    y = 2 * x
    model = Sequential(layers=[Dense(2, input_shape=(1,)),
                               Dense(1)])
    model.compile(loss='mse', optimizer='rmsprop')
    get_gradient = get_gradient_norm_func(model)
    history = model.fit(x, y, epochs=1)
    # Pass x, y, and all-ones sample weights, in the same order as `inputs`.
    print(get_gradient([x, y, np.ones(len(y))]))

Output:

Epoch 1/1
128/128 [==============================] - 0s - loss: 2.0073
[4.4091368]

Note that because you are using Sequential rather than Model, you need model.model._feed_* instead of model._feed_*.
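As a backend-independent sanity check, the same quantity (the square root of the summed squared gradient entries over all weights) can be computed by hand in NumPy for a network of the same shape: Dense(2) followed by Dense(1), both linear, with MSE loss. This is a minimal sketch under stated assumptions and not Keras code: loss_and_grads and global_norm are hypothetical names, and the weights are drawn at random here, so the printed value will not match the 4.4091368 shown above.

```python
import numpy as np


def loss_and_grads(params, x, y):
    """MSE loss and analytic gradients for Dense(2) -> Dense(1), both linear."""
    w1, b1, w2, b2 = params
    h = x @ w1 + b1                  # hidden activations, shape (n, 2)
    p = h @ w2 + b2                  # predictions, shape (n, 1)
    loss = np.mean((p - y) ** 2)
    dp = 2.0 * (p - y) / len(x)      # dL/dp for the mean over n samples
    dh = dp @ w2.T                   # dL/dh, shape (n, 2)
    # Gradients in the same order as params: w1, b1, w2, b2.
    grads = [x.T @ dh, dh.sum(axis=0), h.T @ dp, dp.sum(axis=0)]
    return loss, grads


def global_norm(grads):
    """Square root of the summed squares of every gradient entry."""
    return np.sqrt(sum(np.sum(np.square(g)) for g in grads))


rng = np.random.default_rng(0)
x = rng.random((128, 1))
y = 2 * x
params = [rng.normal(size=(1, 2)), np.zeros(2),
          rng.normal(size=(2, 1)), np.zeros(1)]

loss, grads = loss_and_grads(params, x, y)
print("loss:", loss)
print("gradient norm:", global_norm(grads))
```

The global_norm function mirrors the K.sqrt(sum(summed_squares)) expression in get_gradient_norm_func, so a hand computation like this can be used to check the Keras result if the model's weights are copied into params.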

Regarding "python - Computing the gradient norm with respect to the weights using Keras", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45694344/
