python - tf.layers.batch_normalization parameters

Reposted. Author: 行者123. Updated: 2023-11-30 08:39:39

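I'm not sure whether I'm the only one who finds the TensorFlow documentation a bit weak.

I planned to use the tf.nn.batch_normalization function to implement batch normalization, but then realized that tf.layers.batch_normalization seems to be the one to use because it is simpler. The documentation, however, is really poor, if I may say so.

I am trying to understand how to use it correctly, but with the information provided on the web page that is really not easy. I'm hoping that other people have experience with it and can help me (and possibly many others) understand it.

Let me share the interface first: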

tf.layers.batch_normalization(
    inputs,
    axis=-1,
    momentum=0.99,
    epsilon=0.001,
    center=True,
    scale=True,
    beta_initializer=tf.zeros_initializer(),
    gamma_initializer=tf.ones_initializer(),
    moving_mean_initializer=tf.zeros_initializer(),
    moving_variance_initializer=tf.ones_initializer(),
    beta_regularizer=None,
    gamma_regularizer=None,
    beta_constraint=None,
    gamma_constraint=None,
    training=False,
    trainable=True,
    name=None,
    reuse=None,
    renorm=False,
    renorm_clipping=None,
    renorm_momentum=0.99,
    fused=None,
    virtual_batch_size=None,
    adjustment=None
)


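Q1) beta is initialized to zeros and gamma is initialized to ones by default, but no reason is given. When batch normalization is used, I understand that the ordinary bias parameter of the neural network becomes obsolete and that the beta parameter in the batch normalization step does the same job. From that angle, setting beta to zero is understandable. But why is gamma initialized to ones? Is that really the most efficient way?

Q2) I also see a momentum parameter. The documentation just says "Momentum for the moving average." I assume this parameter is used when calculating the "mean" value for a given mini-batch in the corresponding hidden layer. In other words, the mean used in batch normalization is not the mean of the current mini-batch, but primarily the mean of the last 100 mini-batches (since momentum = 0.99). However, it is unclear how this parameter affects execution at test time, or when I am just validating my model on the dev set by calculating cost and accuracy. My assumption is that whenever I deal with the test and dev sets, I set the parameter "training" to False, so that the momentum parameter becomes obsolete for that particular execution and the "mean" and "variance" values calculated during training are used instead of new ones being computed. That is how it should be, unless I'm mistaken, but I don't see anything in the documentation saying so. Can anyone confirm that my understanding is correct? If not, I would appreciate further explanation.

Q3) I'm having difficulty giving meaning to the trainable parameter. I assume it refers to the beta and gamma parameters. Why would they not be trainable?

Q4) The "reuse" parameter. What is it really?

Q5) The adjustment parameter. Another mystery.

Q6) A kind of summary question. Here is my overall assumption, which needs confirmation and feedback. The important parameters here are:
- inputs
- axis
- momentum
- center
- scale
- training
And I assume that as long as training=True when training, we are safe. And as long as training=False when validating on the dev set or test set, or even when using the model in real life, we are safe too.

Any feedback will be appreciated.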

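ADDENDUM:

The confusion continues. Help!

I'm trying to use this function instead of implementing a batch normalizer manually. I have the following forward propagation function that loops through the layers of the NN.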

def forward_propagation_with_relu(X, num_units_in_layers, parameters,
                                  normalize_batch, training, mb_size=7):

    L = len(num_units_in_layers)

    A_temp = tf.transpose(X)

    for i in range(1, L):
        W = parameters.get("W" + str(i))
        b = parameters.get("b" + str(i))
        Z_temp = tf.add(tf.matmul(W, A_temp), b)

        if normalize_batch:
            if (i < (L-1)):
                with tf.variable_scope("batch_norm_scope", reuse=tf.AUTO_REUSE):
                    Z_temp = tf.layers.batch_normalization(Z_temp, axis=-1,
                                                           training=training)

        A_temp = tf.nn.relu(Z_temp)

    return Z_temp  # This is the linear output of last layer


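The tf.layers.batch_normalization(..) function wants static dimensions, but in my case I don't have them.

Since I feed mini-batches rather than the entire train set each time I run the optimizer, one dimension of X (the batch dimension) appears to be unknown.

If I write: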

print(X.shape)


I get:

(?, 5)


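In this case, when I run the whole program, I get the error shown below.

I saw in some other threads that people said they could solve the problem by using the tf.reshape function. I tried it. Forward prop goes fine, but later on it crashes in the Adam optimizer.

Here is what I get when I run the code above (without using tf.reshape):

How do I fix this???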

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-191-990fb7d7f7f6> in <module>()
24 parameters = nn_model(train_input_paths, dev_input_paths, test_input_paths, learning_rate, num_train_epochs,
25 normalize_batch, epoch_period_to_save_cost, minibatch_size, num_units_in_layers,
---> 26 lambd, print_progress)
27
28 print(parameters)

<ipython-input-190-59594e979129> in nn_model(train_input_paths, dev_input_paths, test_input_paths, learning_rate, num_train_epochs, normalize_batch, epoch_period_to_save_cost, minibatch_size, num_units_in_layers, lambd, print_progress)
34 # Forward propagation: Build the forward propagation in the tensorflow graph
35 ZL = forward_propagation_with_relu(X_mini_batch, num_units_in_layers,
---> 36 parameters, normalize_batch, training)
37
38 with tf.name_scope("calc_cost"):

<ipython-input-187-8012e2fb6236> in forward_propagation_with_relu(X, num_units_in_layers, parameters, normalize_batch, training, mb_size)
15 with tf.variable_scope("batch_norm_scope", reuse=tf.AUTO_REUSE):
16 Z_temp = tf.layers.batch_normalization(Z_temp, axis=-1,
---> 17 training=training)
18
19 A_temp = tf.nn.relu(Z_temp)

~/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py in batch_normalization(inputs, axis, momentum, epsilon, center, scale, beta_initializer, gamma_initializer, moving_mean_initializer, moving_variance_initializer, beta_regularizer, gamma_regularizer, beta_constraint, gamma_constraint, training, trainable, name, reuse, renorm, renorm_clipping, renorm_momentum, fused, virtual_batch_size, adjustment)
775 _reuse=reuse,
776 _scope=name)
--> 777 return layer.apply(inputs, training=training)
778
779

~/.local/lib/python3.5/site-packages/tensorflow/python/layers/base.py in apply(self, inputs, *args, **kwargs)
805 Output tensor(s).
806 """
--> 807 return self.__call__(inputs, *args, **kwargs)
808
809 def _add_inbound_node(self,

~/.local/lib/python3.5/site-packages/tensorflow/python/layers/base.py in __call__(self, inputs, *args, **kwargs)
676 self._defer_regularizers = True
677 with ops.init_scope():
--> 678 self.build(input_shapes)
679 # Create any regularizers added by `build`.
680 self._maybe_create_variable_regularizers()

~/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py in build(self, input_shape)
251 if axis_to_dim[x] is None:
252 raise ValueError('Input has undefined `axis` dimension. Input shape: ',
--> 253 input_shape)
254 self.input_spec = base.InputSpec(ndim=ndims, axes=axis_to_dim)
255

ValueError: ('Input has undefined `axis` dimension. Input shape: ', TensorShape([Dimension(6), Dimension(None)]))


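This is so hopeless.

ADDENDUM(2)

I'm adding more information:

The following simply means that there are 5 units in the input layer, 6 units in each hidden layer, and 2 units in the output layer.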

num_units_in_layers = [5,6,6,2] 


Here is the updated version of the forward prop function, with tf.reshape:

def forward_propagation_with_relu(X, num_units_in_layers, parameters,
                                  normalize_batch, training, mb_size=7):

    L = len(num_units_in_layers)
    print("X.shape before reshape: ", X.shape)  # ADDED LINE 1
    X = tf.reshape(X, [mb_size, num_units_in_layers[0]])  # ADDED LINE 2
    print("X.shape after reshape: ", X.shape)  # ADDED LINE 3
    A_temp = tf.transpose(X)

    for i in range(1, L):
        W = parameters.get("W" + str(i))
        b = parameters.get("b" + str(i))
        Z_temp = tf.add(tf.matmul(W, A_temp), b)

        if normalize_batch:
            if (i < (L-1)):
                with tf.variable_scope("batch_norm_scope", reuse=tf.AUTO_REUSE):
                    Z_temp = tf.layers.batch_normalization(Z_temp, axis=-1,
                                                           training=training)

        A_temp = tf.nn.relu(Z_temp)

    return Z_temp  # This is the linear output of last layer


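When I do this, I can run the forward prop function, but it seems to crash later in execution. Here is the error I get. (Note that I print the shape of the input X before and after the reshape in the forward prop function.)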

X.shape before reshape:  (?, 5)
X.shape after reshape: (7, 5)

---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1349 try:
-> 1350 return fn(*args)
1351 except errors.OpError as e:

~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
1328 feed_dict, fetch_list, target_list,
-> 1329 status, run_metadata)
1330

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
515 compat.as_text(c_api.TF_Message(self.status.status)),
--> 516 c_api.TF_GetCode(self.status.status))
517 # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: Incompatible shapes: [7] vs. [2]
[[Node: forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub = Sub[T=DT_FLOAT, _class=["loc:@batch_norm_scope/batch_normalization/moving_mean"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](forward_prop/batch_norm_scope/batch_normalization/cond_2/Switch_1:1, forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub/Switch_1:1)]]

During handling of the above exception, another exception occurred:

InvalidArgumentError Traceback (most recent call last)
<ipython-input-222-990fb7d7f7f6> in <module>()
24 parameters = nn_model(train_input_paths, dev_input_paths, test_input_paths, learning_rate, num_train_epochs,
25 normalize_batch, epoch_period_to_save_cost, minibatch_size, num_units_in_layers,
---> 26 lambd, print_progress)
27
28 print(parameters)

<ipython-input-221-59594e979129> in nn_model(train_input_paths, dev_input_paths, test_input_paths, learning_rate, num_train_epochs, normalize_batch, epoch_period_to_save_cost, minibatch_size, num_units_in_layers, lambd, print_progress)
88 cost_mini_batch,
89 accuracy_mini_batch],
---> 90 feed_dict={training: True})
91 nr_of_minibatches += 1
92 sum_minibatch_costs += minibatch_cost

~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
893 try:
894 result = self._run(None, fetches, feed_dict, options_ptr,
--> 895 run_metadata_ptr)
896 if run_metadata:
897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1126 if final_fetches or final_targets or (handle and feed_dict_tensor):
1127 results = self._do_run(handle, final_targets, final_fetches,
-> 1128 feed_dict_tensor, options, run_metadata)
1129 else:
1130 results = []

~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1342 if handle is None:
1343 return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1344 options, run_metadata)
1345 else:
1346 return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

~/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1361 except KeyError:
1362 pass
-> 1363 raise type(e)(node_def, op, message)
1364
1365 def _extend_graph(self):

InvalidArgumentError: Incompatible shapes: [7] vs. [2]
[[Node: forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub = Sub[T=DT_FLOAT, _class=["loc:@batch_norm_scope/batch_normalization/moving_mean"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](forward_prop/batch_norm_scope/batch_normalization/cond_2/Switch_1:1, forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub/Switch_1:1)]]

Caused by op 'forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub', defined at:
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 478, in start
self.io_loop.start()
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
handler(stream, idents, msg)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "/home/cesncn/anaconda3/envs/tensorflow/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-222-990fb7d7f7f6>", line 26, in <module>
lambd, print_progress)
File "<ipython-input-221-59594e979129>", line 36, in nn_model
parameters, normalize_batch, training)
File "<ipython-input-218-62e4c6126c2c>", line 19, in forward_propagation_with_relu
training=training)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py", line 777, in batch_normalization
return layer.apply(inputs, training=training)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 807, in apply
return self.__call__(inputs, *args, **kwargs)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 697, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py", line 602, in call
lambda: self.moving_mean)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/utils.py", line 211, in smart_cond
return control_flow_ops.cond(pred, true_fn=fn1, false_fn=fn2, name=name)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 316, in new_func
return func(*args, **kwargs)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1985, in cond
orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1839, in BuildCondBranch
original_result = fn()
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py", line 601, in <lambda>
lambda: _do_update(self.moving_mean, new_mean),
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/layers/normalization.py", line 597, in _do_update
var, value, self.momentum, zero_debias=False)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/training/moving_averages.py", line 87, in assign_moving_average
update_delta = (variable - value) * decay
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 778, in _run_op
return getattr(ops.Tensor, operator)(a._AsTensor(), *args)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 934, in binary_op_wrapper
return func(x, y, name=name)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4819, in _sub
"Sub", x=x, y=y, name=name)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3267, in create_op
op_def=op_def)
File "/home/cesncn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [7] vs. [2]
[[Node: forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub = Sub[T=DT_FLOAT, _class=["loc:@batch_norm_scope/batch_normalization/moving_mean"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](forward_prop/batch_norm_scope/batch_normalization/cond_2/Switch_1:1, forward_prop/batch_norm_scope/batch_normalization/cond_2/AssignMovingAvg/sub/Switch_1:1)]]


As for why the shape of X is not static... I don't know.
Here is how I set up the dataset.

with tf.name_scope("next_train_batch"):
    filenames = tf.placeholder(tf.string, shape=[None])
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.flat_map(lambda filename: tf.data.TextLineDataset(filename).skip(1).map(decode_csv))
    dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(minibatch_size)
    iterator = dataset.make_initializable_iterator()
    X_mini_batch, Y_mini_batch = iterator.get_next()


I have 2 csv files that contain the train data.

train_path1 = "train1.csv"
train_path2 = "train2.csv"
train_input_paths = [train_path1, train_path2]


I use an initializable iterator as follows:

sess.run(iterator.initializer,
         feed_dict={filenames: train_input_paths})


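During training, I keep getting mini-batches from the train set. Everything works fine when I disable batch normalization. If I enable batch norm, it requires a static shape for the input X (the mini-batch). I reshaped it, but then it crashes later in execution, as shown above.

ADDENDUM(3)

I guess I figured out where it crashes. It probably crashes when I run the optimizer after calculating the cost.

First, the sequence of commands:
first forward prop, then compute the cost, then run the optimizer. The first 2 seem to work, but the optimizer does not.

Here is how I define the optimizer: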

with tf.name_scope("train"):
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost_mini_batch)


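I have the update_ops there so that the moving averages get updated. If I interpret it correctly, it crashes when it tries to update the moving averages. I might be misinterpreting the error message as well.

ADDENDUM(4)

I tried normalizing along the known dimension and it worked! But that is not the dimension I expected to normalize along, which is now confusing. Let me elaborate:

Number of units in the input layer: 5
Number of units in layer 1 (the first hidden layer): 6
So weight1 is a (6, 5) matrix.
Assume the mini-batch size is 7.
In my case, the shape of A[0] (or X_mini_batch) is (7, 5), where 7 is the number of training samples in the mini-batch and 5 is the number of units in the input layer.

When calculating Z[1]...
Z[1] = weight1 * A[0].transpose
...the shape of Z[1] is (6, 7), a matrix in which each column gives the 6 features of one training sample.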
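
The shape bookkeeping above can be checked with a small framework-free sketch. The helper names (shape, transpose, matmul) and the all-ones dummy values are illustrative assumptions and do not come from the original code; only the dimensions (5 input units, 6 hidden units, mini-batch of 7) are taken from the post.

```python
def shape(matrix):
    """Return (rows, columns) of a nested-list matrix."""
    return (len(matrix), len(matrix[0]))


def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    assert shape(a)[1] == shape(b)[0], "inner dimensions must match"
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, column)) for column in b_columns]
            for row in a]


# Dummy values; only the shapes matter here.
weight1 = [[1.0] * 5 for _ in range(6)]  # (6, 5): 6 hidden units, 5 inputs
a0 = [[1.0] * 5 for _ in range(7)]       # (7, 5): 7 examples, 5 inputs

z1 = matmul(weight1, transpose(a0))
print(shape(z1))  # (6, 7): features in rows, examples in columns
```

With this layout the batch dimension ends up last, so it is the last axis of Z[1] whose size is unknown when the mini-batch size is not static.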

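The question is which dimension of Z[1] we want to normalize along. What makes sense to me is to normalize each feature across all the given train samples. This means I need to normalize each row, because each row holds the values of one feature for the different training examples. And since Z[1] has the shape (6, 7), if I set axis to 0, it should refer to normalization within each row. And 7 is the unknown number in my case, so it doesn't hurt. Based on this logic, it works! But I am totally puzzled about whether axis=0 really refers to each row here. Let me show another example of this axis issue, which has bothered me for a long time.

A code example unrelated to this topic: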

cc = tf.constant([[1., 2., 3.],
                  [4., 5., 6.]])

with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(cc, axis=0)))
    print(sess.run(tf.reduce_mean(cc, axis=1)))


This gives the following output:

[2.5 3.5 4.5]
[2. 5.]


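When I set axis to 0, it gives the mean of each column. If axis = 1, it gives the mean of each row.

(Note that cc.shape gives (2, 3).)

Now the million-dollar question: in a 2-dimensional matrix, is the axis 0 or 1 when I want to address each row?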
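
One way to read the tf.reduce_mean output is that axis names the dimension that gets collapsed: reducing a (2, 3) matrix over axis 0 removes the dimension of size 2 and leaves 3 values (one per column), while reducing over axis 1 leaves 2 values (one per row). The sketch below mimics this in plain Python; the helper name mean_along_axis is an assumption made for illustration and is not a TensorFlow function.

```python
def mean_along_axis(matrix, axis):
    """Mean of a 2-D nested list, collapsing the given axis."""
    if axis == 0:
        # Collapse the row dimension: one mean per column.
        return [sum(column) / len(column) for column in zip(*matrix)]
    if axis == 1:
        # Collapse the column dimension: one mean per row.
        return [sum(row) / len(row) for row in matrix]
    raise ValueError("axis must be 0 or 1 for a 2-D matrix")


cc = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]

means_axis0 = mean_along_axis(cc, axis=0)  # [2.5, 3.5, 4.5]
means_axis1 = mean_along_axis(cc, axis=1)  # [2.0, 5.0]
print(means_axis0)
print(means_axis1)
```

So for a reduction, "one value per row" means reducing over axis 1, because the row index (axis 0) is the one that survives.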

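ADDENDUM(5)
I guess I've got it right now. Let me summarize my understanding of axis here. Hopefully I have it now...

Here is a representation of the Z[1] matrix with shape (6, 7):

t_ex: train example
f: feature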

t_ex1   t_ex2   t_ex3   t_ex4   t_ex5   t_ex6   t_ex7
f1      f1      f1      f1      f1      f1      f1
f2      f2      f2      f2      f2      f2      f2
f3      f3      f3      f3      f3      f3      f3
f4      f4      f4      f4      f4      f4      f4
f5      f5      f5      f5      f5      f5      f5
f6      f6      f6      f6      f6      f6      f6


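In the mini-batch above there are 7 train examples, and each train example has 6 features (since there are 6 units in layer 1). When we say "tf.layers.batch_normalization(.., axis=0)", we mean that the normalization has to be done per row, for each feature, to eliminate the high variance between, say, the f1 values in the first row.

In other words, we do NOT normalize f1, f2, f3, f4, f5, f6 with each other. We normalize the f1:s with each other, the f2:s with each other, and so on.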
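
The per-feature normalization described here can be sketched without TensorFlow. This is only an illustration under stated assumptions: the helper name normalize_rows, the small 2 x 3 example matrix (features in rows, train examples in columns) and the use of the biased variance are choices made for this sketch, not code from the post. It also shows how the axis argument of tf.layers.batch_normalization differs from that of tf.reduce_mean: for batch normalization, axis names the feature axis that is kept, and the statistics are computed over the remaining axes.

```python
import math


def normalize_rows(z, epsilon=0.001):
    """Normalize each row (one feature across all examples) separately."""
    normalized = []
    for row in z:
        mean = sum(row) / len(row)
        variance = sum((value - mean) ** 2 for value in row) / len(row)
        normalized.append(
            [(value - mean) / math.sqrt(variance + epsilon) for value in row])
    return normalized


# 2 features (rows) x 3 train examples (columns); values are made up.
z = [[1.0, 2.0, 3.0],
     [10.0, 20.0, 60.0]]

z_hat = normalize_rows(z)
row_means = [sum(row) / len(row) for row in z_hat]
print(row_means)  # each entry is numerically close to 0
```

Each row gets its own mean and variance, so the two features end up on a comparable scale even though their raw values differ by an order of magnitude.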

Best answer

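Q1) Initializing gamma to 1 and beta to 0 means that the normalized inputs are used directly. Since there is no prior information about what the variance of a layer's output should be, it is fair enough to assume a standard Gaussian.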
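
To make the effect of the default initializers concrete, here is a minimal sketch of the scale-and-shift step, y = gamma * x_hat + beta, for a single feature. The function names and the sample batch are hypothetical; epsilon=0.001 mirrors the default in the signature quoted in the question.

```python
import math


def normalize(values, epsilon=0.001):
    """Zero-mean, roughly unit-variance version of one feature's batch values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(variance + epsilon) for v in values]


def scale_and_shift(x_hat, gamma=1.0, beta=0.0):
    """Learnable affine step applied after normalization."""
    return [gamma * v + beta for v in x_hat]


batch = [2.0, 4.0, 6.0, 8.0]  # made-up activations of one unit
x_hat = normalize(batch)

default_out = scale_and_shift(x_hat)                       # gamma=1, beta=0
learned_out = scale_and_shift(x_hat, gamma=2.0, beta=5.0)  # some learned values

print(default_out == x_hat)  # True: the defaults leave x_hat unchanged
```

With the defaults the layer starts out as a plain normalizer; training can then move gamma and beta away from 1 and 0 if a different scale or offset suits the next layer better.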

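Q2) During the training phase (training=True), each batch is normalized with its own mean and var, on the assumption that the training data are randomly sampled. During testing (training=False), the test data may be sampled arbitrarily, so we cannot use their mean and var. Instead, as you said, we use the moving-average estimates accumulated during training. With momentum=0.99 this is an exponential moving average, so "the last 100 training iterations" is a rough description of its effective window rather than an exact count.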
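
The role of momentum can be sketched in a few lines. The sketch assumes the usual exponential-moving-average update, moving = momentum * moving + (1 - momentum) * batch_value; the function name and the constant batch mean of 10.0 are made up for illustration.

```python
def update_moving_average(moving, batch_value, momentum=0.99):
    """One exponential-moving-average step, as applied after each training batch."""
    return momentum * moving + (1.0 - momentum) * batch_value


# moving_mean starts at zero (moving_mean_initializer=tf.zeros_initializer()).
moving_mean = 0.0
batch_mean = 10.0  # pretend every training batch has this mean

for _ in range(100):
    moving_mean = update_moving_average(moving_mean, batch_mean)

# Closed form for a constant input: batch_mean * (1 - momentum ** steps).
expected = batch_mean * (1.0 - 0.99 ** 100)
print(moving_mean, expected)
```

After 100 updates the estimate has covered only about 63% of the distance from its initial value to the true mean, so with momentum=0.99 the moving statistics need a few hundred training steps before they are reliable at test time.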

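Q3) Yes, trainable refers to beta and gamma. There are cases where you need to set trainable=False, e.g. if a novel method is used to update the parameters, or if the batch_norm layer is pre-trained and needs to be frozen.

Q4) You may have noticed the reuse parameter in other tf.layers functions as well. In general, if you want to call a layer more than once (e.g. for training and validation) and you do not want TensorFlow to think that you are creating a new layer, you set reuse=True. I prefer with tf.variable_scope(..., reuse=tf.AUTO_REUSE): to achieve the same purpose.

Q5) I'm not sure about this one. I guess it is for users who want to design new tricks to adjust the scale and bias.

Q6) Yes, you are right.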

Regarding python - tf.layers.batch_normalization parameters, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49701918/
