
python - How are the output sizes of MaxPooling2D, Conv2D, and UpSampling2D layers calculated?

Reposted. Author: 行者123. Updated: 2023-12-03 14:38:56

I'm learning about convolutional autoencoders and I'm using keras to build an image denoiser.
The following code works for building a model:

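from keras.models import Sequential
from keras.layers import Activation, Conv2D, MaxPooling2D, UpSampling2D

denoiser = Sequential()
denoiser.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same'))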
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))

denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(MaxPooling2D(pool_size=(2,2)))

denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))

################## HEY WHAT NO MAXPOOLING?

denoiser.add(Conv2D(8, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))

denoiser.add(Conv2D(16, (3,3), padding='same'))
denoiser.add(Activation('relu'))
denoiser.add(UpSampling2D((2,2)))

denoiser.add(Conv2D(1, (3,3), padding='same'))

denoiser.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
denoiser.summary()

It gives the following summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_155 (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
activation_162 (Activation) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d_99 (MaxPooling (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_156 (Conv2D) (None, 14, 14, 16) 4624
_________________________________________________________________
activation_163 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
max_pooling2d_100 (MaxPoolin (None, 7, 7, 16) 0
_________________________________________________________________
conv2d_157 (Conv2D) (None, 7, 7, 8) 1160
_________________________________________________________________
activation_164 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
conv2d_158 (Conv2D) (None, 7, 7, 8) 584
_________________________________________________________________
activation_165 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
up_sampling2d_25 (UpSampling (None, 14, 14, 8) 0
_________________________________________________________________
conv2d_159 (Conv2D) (None, 14, 14, 16) 1168
_________________________________________________________________
activation_166 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
up_sampling2d_26 (UpSampling (None, 28, 28, 16) 0
_________________________________________________________________
conv2d_160 (Conv2D) (None, 28, 28, 1) 145
=================================================================
Total params: 8,001
Trainable params: 8,001
Non-trainable params: 0
_________________________________________________________________

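I don't understand how MaxPooling2D, Conv2D, and UpSampling2D calculate their output sizes. I have read the keras documentation, but I'm still confused. Many parameters affect the output shape, such as strides or padding for the Conv2D layer, and I don't know exactly how they affect it.

I don't understand why there is no MaxPooling2D layer before the commented line. Editing the code to include a convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer above the comment turns the final output shape into (None, 12, 12, 1).

Editing the code to include a convmodel3.add(MaxPooling2D(pool_size=(2,2))) layer before the comment, followed by a convmodel3.add(UpSampling2D((2,2))), turns the final output into (None, 24, 24, 1). Shouldn't it be (None, 28, 28, 1)?
Code and summary: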
convmodel3 = Sequential()
convmodel3.add(Conv2D(32, (3,3), input_shape=(28,28,1), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))

convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2)))

convmodel3.add(Conv2D(8, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(MaxPooling2D(pool_size=(2,2))) # ADDED MAXPOOL

################## HEY WHAT NO MAXPOOLING?

convmodel3.add(UpSampling2D((2,2))) # ADDED UPSAMPLING
convmodel3.add(Conv2D(16, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))

convmodel3.add(Conv2D(32, (3,3), padding='same'))
convmodel3.add(Activation('relu'))
convmodel3.add(UpSampling2D((2,2)))

convmodel3.add(Conv2D(1, (3,3), padding='same'))

convmodel3.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
convmodel3.summary()

_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_247 (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
activation_238 (Activation) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d_141 (MaxPoolin (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_248 (Conv2D) (None, 14, 14, 16) 4624
_________________________________________________________________
activation_239 (Activation) (None, 14, 14, 16) 0
_________________________________________________________________
max_pooling2d_142 (MaxPoolin (None, 7, 7, 16) 0
_________________________________________________________________
conv2d_249 (Conv2D) (None, 7, 7, 8) 1160
_________________________________________________________________
activation_240 (Activation) (None, 7, 7, 8) 0
_________________________________________________________________
max_pooling2d_143 (MaxPoolin (None, 3, 3, 8) 0
_________________________________________________________________
up_sampling2d_60 (UpSampling (None, 6, 6, 8) 0
_________________________________________________________________
conv2d_250 (Conv2D) (None, 6, 6, 16) 1168
_________________________________________________________________
activation_241 (Activation) (None, 6, 6, 16) 0
_________________________________________________________________
up_sampling2d_61 (UpSampling (None, 12, 12, 16) 0
_________________________________________________________________
conv2d_251 (Conv2D) (None, 12, 12, 32) 4640
_________________________________________________________________
activation_242 (Activation) (None, 12, 12, 32) 0
_________________________________________________________________
up_sampling2d_62 (UpSampling (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_252 (Conv2D) (None, 24, 24, 1) 289
=================================================================
Total params: 12,201
Trainable params: 12,201
Non-trainable params: 0
_________________________________________________________________
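
What is the significance of None in the output shape?

Also, editing the Conv2D layers so that they no longer include padding raises an error: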
ValueError: Negative dimension size caused by subtracting 3 from 2 for 'conv2d_240/convolution' (op: 'Conv2D') with input shapes: [?,2,2,16], [3,3,16,32].
Why?

Best answer

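For a convolutional (here 2D) layer, the important things to consider are the volume of the image (width x height x depth) and the four parameters you give the layer. Those parameters are

  • Number of filters K
  • Filter size (spatial) F
  • Stride with which the filter moves S
  • Zero padding P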

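The formula for the output shape is

  Wnew = floor((W - F + 2*P)/S) + 1
  Hnew = floor((H - F + 2*P)/S) + 1
  Dnew = K

For example, a 3x3 filter with stride 1 and one pixel of zero padding on a 28x28 input gives (28 - 3 + 2*1)/1 + 1 = 28, which is why the padding='same' layers in your summary keep the spatial size unchanged.

This is taken from the thread "what is the effect of tf.nn.conv2d() on an input tensor shape?", where more information about zero padding and so on can be found.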
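As a rough sketch of that formula, the helper below computes the output size along one spatial axis in plain Python. The function names `conv_output_size` and `general_output_size` are made up for illustration and are not part of keras; the sketch assumes the usual behaviour of the two keras padding modes ('same' gives ceil(W / S), 'valid' means P = 0). The last call shows a 3x3 filter applied without padding to a 2-pixel-wide input: the formula yields 0, which appears to be the situation the ValueError in the question complains about.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Output size along one spatial axis for the two keras padding modes."""
    if padding == "same":
        # Enough zero padding is added that the output is ceil(size / stride).
        return math.ceil(size / stride)
    if padding == "valid":
        # No padding (P = 0): floor((W - F) / S) + 1.
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding mode: {padding!r}")

def general_output_size(size, kernel, stride, pad):
    """General formula with an explicit amount of zero padding P per side."""
    return (size - kernel + 2 * pad) // stride + 1

print(conv_output_size(28, 3, padding="same"))   # 28
print(conv_output_size(28, 3, padding="valid"))  # 26
print(general_output_size(28, 3, 1, 1))          # 28
print(conv_output_size(2, 3, padding="valid"))   # 0, no valid output position
```

A result of 0 or less means the filter does not fit into the input even once, so the layer cannot be built.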

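For max pooling and upsampling, the size is affected only by the pool size and the stride (and, for pooling, by the padding mode, which defaults to 'valid'). In your example the pool size is (2,2) and no stride is defined, so the stride defaults to the pool size (see https://keras.io/layers/pooling/). Max pooling takes each 2x2 block of pixels, finds their maximum, and puts it into a single pixel, so 2x2 pixels become 1x1 pixel, which encodes them. Upsampling works the same way in reverse: instead of reducing the pixel values, it repeats each value across a 2x2 block.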
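To make the two operations concrete, here is a minimal pure-Python sketch on a single-channel 4x4 image stored as nested lists. The helpers `max_pool_2x2` and `upsample_2x2` are hypothetical names written for this illustration, not keras functions; they assume a (2,2) window with stride 2 and no padding.

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2; a leftover edge row or column is dropped."""
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            max(grid[2 * r][2 * c], grid[2 * r][2 * c + 1],
                grid[2 * r + 1][2 * c], grid[2 * r + 1][2 * c + 1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def upsample_2x2(grid):
    """Nearest-neighbour upsampling: repeat every value twice along both axes."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

image = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool_2x2(image)
restored = upsample_2x2(pooled)
print(pooled)    # [[4, 2], [2, 8]]
print(restored)  # [[4, 4, 2, 2], [4, 4, 2, 2], [2, 2, 8, 8], [2, 2, 8, 8]]
```

The upsampled image has the original 4x4 size again, but only the maximum of each block survives, which is why the convolution layers after each upsampling step are needed to reconstruct detail.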

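The reason there is no max pooling layer at that point, and the reason the image dimensions get messed up when you add one, is the image size at that stage. Looking at the network, the image is already [7,7,8] there. With a pool size of (2,2) and a stride of 2, 7 is not divisible by 2, so the last row and column are dropped and the resolution falls to [3,3,8]. After the upsampling layers the dimension goes 3 -> 6 -> 12 -> 24, so you have lost 4 pixels in each row and column.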
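The size bookkeeping for the three variants in the question can be sketched as follows. `trace_sizes` is a hypothetical helper for this illustration; it assumes every pooling and upsampling layer uses a factor of 2 and that the padding='same' convolutions in between leave the spatial size unchanged, so they are left out of the layer lists.

```python
def trace_sizes(size, layers):
    """Follow one spatial dimension through a list of 'pool' and 'up' layers."""
    sizes = [size]
    for kind in layers:
        if kind == "pool":
            size = size // 2   # 2x2 pooling, stride 2, 'valid': odd sizes round down
        elif kind == "up":
            size = size * 2    # 2x2 upsampling doubles the size
        else:
            raise ValueError(f"unknown layer kind: {kind!r}")
        sizes.append(size)
    return sizes

# Original denoiser: two pools, two upsamplings.
print(trace_sizes(28, ["pool", "pool", "up", "up"]))
# [28, 14, 7, 14, 28]

# Third pool added, still two upsamplings.
print(trace_sizes(28, ["pool", "pool", "pool", "up", "up"]))
# [28, 14, 7, 3, 6, 12]

# Third pool and third upsampling added.
print(trace_sizes(28, ["pool", "pool", "pool", "up", "up", "up"]))
# [28, 14, 7, 3, 6, 12, 24]
```

Pooling only round-trips cleanly through upsampling while the size stays even, which 28 does for exactly two halvings.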

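The significance of None is that the network works on a batch of images rather than on a single one. The expected dimensions are generally
[Number of images, Height, Width, Depth]

The first element is None because the number of images per batch is not fixed when the model is built, so the model accepts batches of any size.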

Regarding python - How are the output sizes of MaxPooling2D, Conv2D, and UpSampling2D layers calculated?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54423078/
