
keras - Why would one add a variable to a layer's _trainable_weights list?


In this notebook, https://nbviewer.jupyter.org/github/krasserm/bayesian-machine-learning/blob/master/bayesian_neural_networks.ipynb, the author defines the function

def mixture_prior_params(sigma_1, sigma_2, pi):
    params = K.variable([sigma_1, sigma_2, pi], name='mixture_prior_params')
    sigma = np.sqrt(pi * sigma_1 ** 2 + (1 - pi) * sigma_2 ** 2)
    return params, sigma

which creates a backend variable and returns a tuple. The function is then called with

prior_params, prior_sigma = mixture_prior_params(sigma_1=1.0, sigma_2=0.1, pi=0.2)
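For these arguments, prior_params is a backend variable holding [1.0, 0.1, 0.2], while prior_sigma is a plain float, sqrt(0.2 * 1.0**2 + 0.8 * 0.1**2) ≈ 0.456. A quick (hypothetical) check, using the Keras backend K that the notebook already imports:

# prior_params is a variable of shape (3,); prior_sigma is just a Python float (~0.456).
print(K.get_value(prior_params), prior_sigma)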

Then, in the build method of the custom layer class DenseVariational, the global variable prior_params is appended to the private list _trainable_weights

def build(self, input_shape):
    self._trainable_weights.append(prior_params)
    ...
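For context, anything appended to a layer's _trainable_weights ends up in that layer's (and the enclosing model's) trainable_weights, so the optimizer updates it just like weights created with add_weight. A minimal sketch of that effect, assuming the prior_params variable from the snippet above; the PriorOnly layer is hypothetical and not part of the notebook:

from keras.layers import Layer

class PriorOnly(Layer):
    # Hypothetical layer that only registers the externally created prior_params.
    def build(self, input_shape):
        # Register an already existing variable as a trainable weight of this layer.
        self._trainable_weights.append(prior_params)
        super().build(input_shape)

    def call(self, x):
        return x  # identity; the layer exists only to own prior_params

layer = PriorOnly()
layer.build(input_shape=(None, 1))
# prior_params is now listed among the layer's trainable weights.
print(layer.trainable_weights)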

Why is this needed or desirable? For example, if I print the trainable parameters of the custom layer, or of a model built from such layers,

# Create the model with DenseVariational layers
model = Model(x_in, x_out)
print("model.trainable_weights =", model.trainable_weights)

I can see that every DenseVariational layer lists mixture_prior_params among its trainable parameters. Why would one declare mixture_prior_params, and more specifically sigma_1, sigma_2, and pi, outside of the layer if they are trainable parameters of that layer?

Best answer

After reading the question Can I share weights between keras layers but have other parameters differ? and its answer (https://stackoverflow.com/a/45258859/3924118), and after printing the values of the model's trainable variables once the model has been trained, this appears to be a way of sharing a variable across different layers: because every layer appends the same variable object to its list of trainable weights, the variable's value is identical across layers after training.

I created a simple example (with TensorFlow 2.0.0 and Keras 2.3.1) that demonstrates this.

import numpy as np
from keras import activations, initializers
from keras import backend as K
from keras import optimizers
from keras.layers import Input
from keras.layers import Layer
from keras.models import Model

shared_variable = K.variable([0.3], name='my_shared_variable')


class MyLayer(Layer):
    def __init__(self, output_dim, activation=None, **kwargs):
        self.output_dim = output_dim
        self.activation = activations.get(activation)
        super().__init__(**kwargs)

    def build(self, input_shape):
        self._trainable_weights.append(shared_variable)
        self.my_weight = self.add_weight(name='my_weight',
                                         shape=(input_shape[1], self.output_dim),
                                         initializer=initializers.normal(),
                                         trainable=True)
        super().build(input_shape)

    def call(self, x):
        return self.activation(K.dot(x, self.my_weight * shared_variable))

    def compute_output_shape(self, input_shape):
        return input_shape[0], self.output_dim


if __name__ == "__main__":
    # Define the architecture of the model.
    x_in = Input(shape=(1,))
    h1 = MyLayer(20, activation='relu')(x_in)
    h2 = MyLayer(20, activation='relu')(h1)
    x_out = MyLayer(1)(h2)

    model = Model(x_in, x_out)
    print("h1.trainable_weights (before training) =", model.layers[1].trainable_weights[0])
    print("h2.trainable_weights (before training) =", model.layers[2].trainable_weights[0])

    # Prepare the model for training.
    model.compile(loss="mse", optimizer=optimizers.Adam(lr=0.03))

    # Generate dataset.
    X = np.linspace(-0.5, 0.5, 100).reshape(-1, 1)
    y = 10 * np.sin(2 * np.pi * X)

    # Train the model.
    model.fit(X, y, batch_size=1, epochs=100, verbose=0)

    print("h1.trainable_weights (after training) =", model.layers[1].trainable_weights[0])
    print("h2.trainable_weights (after training) =", model.layers[2].trainable_weights[0])

The output is

h1.trainable_weights (before training) = <tf.Variable 'my_shared_variable:0' shape=(1,) dtype=float32, numpy=array([0.3], dtype=float32)>
h2.trainable_weights (before training) = <tf.Variable 'my_shared_variable:0' shape=(1,) dtype=float32, numpy=array([0.3], dtype=float32)>
h1.trainable_weights (after training) = <tf.Variable 'my_shared_variable:0' shape=(1,) dtype=float32, numpy=array([0.7049409], dtype=float32)>
h2.trainable_weights (after training) = <tf.Variable 'my_shared_variable:0' shape=(1,) dtype=float32, numpy=array([0.7049409], dtype=float32)>
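The two printed entries are not merely equal in value; both layers hold a reference to the same variable object, which is why the value stays in sync during training. A quick check that could be appended to the script above (a hypothetical addition, not part of the original example):

# Both layers appended the same Python object, so this prints True.
print(model.layers[1].trainable_weights[0] is model.layers[2].trainable_weights[0])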

A similar question about keras - why add a variable to a layer's _trainable_weights list can be found on Stack Overflow: https://stackoverflow.com/questions/58530979/
