keras.applications includes a VGG16 model pre-trained on ImageNet.
from keras.applications import VGG16
model = VGG16(weights='imagenet')
This model has the following structure.
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_1 (InputLayer) (None, 3, 224, 224) 0
____________________________________________________________________________________________________
block1_conv1 (Convolution2D) (None, 64, 224, 224) 1792 input_1[0][0]
____________________________________________________________________________________________________
block1_conv2 (Convolution2D) (None, 64, 224, 224) 36928 block1_conv1[0][0]
____________________________________________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 112, 112) 0 block1_conv2[0][0]
____________________________________________________________________________________________________
block2_conv1 (Convolution2D) (None, 128, 112, 112) 73856 block1_pool[0][0]
____________________________________________________________________________________________________
block2_conv2 (Convolution2D) (None, 128, 112, 112) 147584 block2_conv1[0][0]
____________________________________________________________________________________________________
block2_pool (MaxPooling2D) (None, 128, 56, 56) 0 block2_conv2[0][0]
____________________________________________________________________________________________________
block3_conv1 (Convolution2D) (None, 256, 56, 56) 295168 block2_pool[0][0]
____________________________________________________________________________________________________
block3_conv2 (Convolution2D) (None, 256, 56, 56) 590080 block3_conv1[0][0]
____________________________________________________________________________________________________
block3_conv3 (Convolution2D) (None, 256, 56, 56) 590080 block3_conv2[0][0]
____________________________________________________________________________________________________
block3_pool (MaxPooling2D) (None, 256, 28, 28) 0 block3_conv3[0][0]
____________________________________________________________________________________________________
block4_conv1 (Convolution2D) (None, 512, 28, 28) 1180160 block3_pool[0][0]
____________________________________________________________________________________________________
block4_conv2 (Convolution2D) (None, 512, 28, 28) 2359808 block4_conv1[0][0]
____________________________________________________________________________________________________
block4_conv3 (Convolution2D) (None, 512, 28, 28) 2359808 block4_conv2[0][0]
____________________________________________________________________________________________________
block4_pool (MaxPooling2D) (None, 512, 14, 14) 0 block4_conv3[0][0]
____________________________________________________________________________________________________
block5_conv1 (Convolution2D) (None, 512, 14, 14) 2359808 block4_pool[0][0]
____________________________________________________________________________________________________
block5_conv2 (Convolution2D) (None, 512, 14, 14) 2359808 block5_conv1[0][0]
____________________________________________________________________________________________________
block5_conv3 (Convolution2D) (None, 512, 14, 14) 2359808 block5_conv2[0][0]
____________________________________________________________________________________________________
block5_pool (MaxPooling2D) (None, 512, 7, 7) 0 block5_conv3[0][0]
____________________________________________________________________________________________________
flatten (Flatten) (None, 25088) 0 block5_pool[0][0]
____________________________________________________________________________________________________
fc1 (Dense) (None, 4096) 102764544 flatten[0][0]
____________________________________________________________________________________________________
fc2 (Dense) (None, 4096) 16781312 fc1[0][0]
____________________________________________________________________________________________________
predictions (Dense) (None, 1000) 4097000 fc2[0][0]
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________
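The Param # column above can be reproduced by hand, which helps when checking that later surgery on the dense layers leaves the parameter count alone. The following is a minimal sketch, assuming 3x3 convolution kernels with one bias per output channel (the standard VGG16 configuration); the helper names conv2d_params and dense_params are made up for this illustration and are not Keras functions.

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    """Weights (kernel * kernel * in * out) plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weight matrix (in * out) plus one bias per output unit."""
    return in_units * out_units + out_units


# (in_channels, out_channels) for the 13 conv layers, read off the summary.
conv_channels = [
    (3, 64), (64, 64),                        # block1
    (64, 128), (128, 128),                    # block2
    (128, 256), (256, 256), (256, 256),       # block3
    (256, 512), (512, 512), (512, 512),       # block4
    (512, 512), (512, 512), (512, 512),       # block5
]
conv_total = sum(conv2d_params(i, o) for i, o in conv_channels)

flatten_units = 512 * 7 * 7                   # block5_pool output, flattened
fc1 = dense_params(flatten_units, 4096)
fc2 = dense_params(4096, 4096)
predictions = dense_params(4096, 1000)
total = conv_total + fc1 + fc2 + predictions

print(fc1, fc2, predictions, total)
# -> 102764544 16781312 4097000 138357544
```

The three dense layers account for roughly 124 million of the 138 million parameters. Pooling and flatten layers contribute none.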
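I want to fine-tune this model with dropout layers between the dense layers (fc1, fc2 and predictions), while keeping all of the model's pre-trained weights intact. I know each layer can be accessed individually through model.layers, but I haven't found anything on how to add new layers between existing ones.
What is the best practice for doing this?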
Best answer
I found the answer myself by using the Keras functional API:
from keras.applications import VGG16
from keras.layers import Dropout
from keras.models import Model
model = VGG16(weights='imagenet')
# Store the fully connected layers
fc1 = model.layers[-3]
fc2 = model.layers[-2]
predictions = model.layers[-1]
# Create the dropout layers
dropout1 = Dropout(0.85)
dropout2 = Dropout(0.85)
# Reconnect the layers
x = dropout1(fc1.output)
x = fc2(x)
x = dropout2(x)
predictors = predictions(x)
# Create a new model
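# Positional arguments work across versions: Keras 1 names them
# input/output, Keras 2 and later name them inputs/outputs
model2 = Model(model.input, predictors)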
model2 has the dropout layers I wanted:
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_1 (InputLayer) (None, 3, 224, 224) 0
____________________________________________________________________________________________________
block1_conv1 (Convolution2D) (None, 64, 224, 224) 1792 input_1[0][0]
____________________________________________________________________________________________________
block1_conv2 (Convolution2D) (None, 64, 224, 224) 36928 block1_conv1[0][0]
____________________________________________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 112, 112) 0 block1_conv2[0][0]
____________________________________________________________________________________________________
block2_conv1 (Convolution2D) (None, 128, 112, 112) 73856 block1_pool[0][0]
____________________________________________________________________________________________________
block2_conv2 (Convolution2D) (None, 128, 112, 112) 147584 block2_conv1[0][0]
____________________________________________________________________________________________________
block2_pool (MaxPooling2D) (None, 128, 56, 56) 0 block2_conv2[0][0]
____________________________________________________________________________________________________
block3_conv1 (Convolution2D) (None, 256, 56, 56) 295168 block2_pool[0][0]
____________________________________________________________________________________________________
block3_conv2 (Convolution2D) (None, 256, 56, 56) 590080 block3_conv1[0][0]
____________________________________________________________________________________________________
block3_conv3 (Convolution2D) (None, 256, 56, 56) 590080 block3_conv2[0][0]
____________________________________________________________________________________________________
block3_pool (MaxPooling2D) (None, 256, 28, 28) 0 block3_conv3[0][0]
____________________________________________________________________________________________________
block4_conv1 (Convolution2D) (None, 512, 28, 28) 1180160 block3_pool[0][0]
____________________________________________________________________________________________________
block4_conv2 (Convolution2D) (None, 512, 28, 28) 2359808 block4_conv1[0][0]
____________________________________________________________________________________________________
block4_conv3 (Convolution2D) (None, 512, 28, 28) 2359808 block4_conv2[0][0]
____________________________________________________________________________________________________
block4_pool (MaxPooling2D) (None, 512, 14, 14) 0 block4_conv3[0][0]
____________________________________________________________________________________________________
block5_conv1 (Convolution2D) (None, 512, 14, 14) 2359808 block4_pool[0][0]
____________________________________________________________________________________________________
block5_conv2 (Convolution2D) (None, 512, 14, 14) 2359808 block5_conv1[0][0]
____________________________________________________________________________________________________
block5_conv3 (Convolution2D) (None, 512, 14, 14) 2359808 block5_conv2[0][0]
____________________________________________________________________________________________________
block5_pool (MaxPooling2D) (None, 512, 7, 7) 0 block5_conv3[0][0]
____________________________________________________________________________________________________
flatten (Flatten) (None, 25088) 0 block5_pool[0][0]
____________________________________________________________________________________________________
fc1 (Dense) (None, 4096) 102764544 flatten[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout) (None, 4096) 0 fc1[0][0]
____________________________________________________________________________________________________
fc2 (Dense) (None, 4096) 16781312 dropout_1[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout) (None, 4096) 0 fc2[1][0]
____________________________________________________________________________________________________
predictions (Dense) (None, 1000) 4097000 dropout_2[0][0]
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________
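For anyone who wants to try the pattern without downloading the ImageNet weights, here is a minimal sketch of the same reconnection on a toy model. It assumes a TensorFlow 2.x install with the bundled Keras; the layer sizes, the 0.5 dropout rate and the random input batch are arbitrary stand-ins, and only the layer names mirror VGG16's head.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy stand-in for VGG16's classifier head (sizes are arbitrary).
inputs = keras.Input(shape=(8,))
x = layers.Dense(4, activation="relu", name="fc1")(inputs)
x = layers.Dense(4, activation="relu", name="fc2")(x)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Grab the existing dense layers, exactly as in the answer.
fc1, fc2, predictions = model.layers[-3], model.layers[-2], model.layers[-1]

# Re-wire them with dropout in between; the layer objects are reused,
# so no weights are copied or re-initialised.
x = layers.Dropout(0.5)(fc1.output)
x = fc2(x)
x = layers.Dropout(0.5)(x)
new_outputs = predictions(x)
model2 = keras.Model(inputs=model.input, outputs=new_outputs)

# Both models hold the very same layer object, hence the same weights.
shared_layer = model2.get_layer("fc2") is fc2

# Dropout is inactive in predict(), so both models agree at inference time.
batch = np.random.rand(2, 8).astype("float32")
same_at_inference = np.allclose(
    model.predict(batch, verbose=0), model2.predict(batch, verbose=0)
)

print(shared_layer, same_at_inference)
print(model.count_params() == model2.count_params())
```

Because the dense layers are shared rather than cloned, training model2 also updates the weights seen through model. Dropout layers have no parameters, which is why the totals in the two summaries above are identical.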
This question (python - adding dropout layers between pre-trained dense layers in Keras) is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/42475381/