
machine-learning - Depth estimation with Keras

Reposted · Author: 行者123 · Updated: 2023-11-30 09:10:39

I am trying to design a convolutional network to estimate the depth of images with Keras.

I have RGB input images of shape 3x120x160 and grayscale output depth maps of shape 1x120x160.

I tried a VGG-like architecture in which the number of filters grows with each layer, but I got stuck when it came to designing the final layers. Using a Dense layer is too expensive, and I tried upsampling, which turned out to be inefficient.

I wanted to use DeConvolution2D but could not get it to work. The only architecture I ended up with is this:
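For scale, the Dense option can be priced out: connecting the flattened 512x18x28 feature map (the last conv output in the summary further down) to a 120x160 output would need roughly five billion parameters. A back-of-the-envelope sketch:

```python
# cost of a Dense output layer for this network (numbers from the model summary)
features = 512 * 18 * 28              # flattened last conv feature map
outputs = 120 * 160                   # one depth value per output pixel
params = features * outputs + outputs # weights + biases
print(params)                         # 4954540800 -> ~5e9 parameters
```

That is several hundred times the size of the whole convolutional stack, which is why a fully connected decoder is a non-starter here.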

    from keras.models import Sequential
    from keras.layers import (Convolution2D, Cropping2D, Deconvolution2D,
                              Dropout, MaxPooling2D, ZeroPadding2D)

    # Keras 1 API, Theano-style channels-first ordering: (channels, height, width)
    model = Sequential()
    model.add(Convolution2D(64, 5, 5, activation='relu', input_shape=(3, 120, 160)))
    model.add(Convolution2D(64, 5, 5, activation='relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.5))

    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.5))

    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(Dropout(0.5))

    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(Dropout(0.5))

    model.add(ZeroPadding2D())
    # Deconvolution2D in Keras 1 requires the full output shape up front
    model.add(Deconvolution2D(512, 3, 3, (None, 512, 41, 61), subsample=(2, 2), activation='relu'))
    model.add(Deconvolution2D(512, 3, 3, (None, 512, 123, 183), subsample=(3, 3), activation='relu'))
    model.add(Cropping2D(cropping=((1, 2), (11, 12))))
    model.add(Convolution2D(1, 1, 1, activation='sigmoid', border_mode='same'))

The model summary is as follows:

Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 64, 116, 156)  4864        convolution2d_input_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 64, 112, 152)  102464      convolution2d_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 64, 56, 76)    0           convolution2d_2[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 64, 56, 76)    0           maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 128, 54, 74)   73856       dropout_1[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 128, 52, 72)   147584      convolution2d_3[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 128, 26, 36)   0           convolution2d_4[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 128, 26, 36)   0           maxpooling2d_2[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D)  (None, 256, 24, 34)   295168      dropout_2[0][0]
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D)  (None, 256, 22, 32)   590080      convolution2d_5[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout)              (None, 256, 22, 32)   0           convolution2d_6[0][0]
____________________________________________________________________________________________________
convolution2d_7 (Convolution2D)  (None, 512, 20, 30)   1180160     dropout_3[0][0]
____________________________________________________________________________________________________
convolution2d_8 (Convolution2D)  (None, 512, 18, 28)   2359808     convolution2d_7[0][0]
____________________________________________________________________________________________________
dropout_4 (Dropout)              (None, 512, 18, 28)   0           convolution2d_8[0][0]
____________________________________________________________________________________________________
zeropadding2d_1 (ZeroPadding2D)  (None, 512, 20, 30)   0           dropout_4[0][0]
____________________________________________________________________________________________________
deconvolution2d_1 (Deconvolution2 (None, 512, 41, 61)  2359808     zeropadding2d_1[0][0]
____________________________________________________________________________________________________
deconvolution2d_2 (Deconvolution2 (None, 512, 123, 183) 2359808    deconvolution2d_1[0][0]
____________________________________________________________________________________________________
cropping2d_1 (Cropping2D)        (None, 512, 120, 160) 0           deconvolution2d_2[0][0]
____________________________________________________________________________________________________
convolution2d_9 (Convolution2D)  (None, 1, 120, 160)   513         cropping2d_1[0][0]
====================================================================================================
Total params: 9474113

I could not reduce the Deconvolution2D layers below 512 filters: doing so caused shape-related errors, and it seems I have to give each Deconvolution2D layer as many filters as the preceding layer has. I also had to add the final Convolution2D layer to get the network to run at all.
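The shape errors are easier to reason about with the transposed-convolution size formula. A small sketch (plain Python, no Keras needed) that reproduces the output sizes in the summary above:

```python
def deconv_output_size(input_size, kernel, stride):
    # "valid" transposed convolution: out = (in - 1) * stride + kernel
    return (input_size - 1) * stride + kernel

# deconvolution2d_1: (20, 30) -> (41, 61) with a 3x3 kernel, subsample (2, 2)
print(deconv_output_size(20, 3, 2), deconv_output_size(30, 3, 2))  # 41 61
# deconvolution2d_2: (41, 61) -> (123, 183) with a 3x3 kernel, subsample (3, 3)
print(deconv_output_size(41, 3, 3), deconv_output_size(61, 3, 3))  # 123 183
```

This is also why the `Cropping2D(((1, 2), (11, 12)))` layer is needed: it trims 123x183 back down to the 120x160 target.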

The architecture above does learn, but very slowly and (I think) inefficiently. I am sure I am doing something wrong and the design should not look like this. Can you help me design a better network?

I also tried to build a network like the one described in this repository, but Keras does not seem to work the way that Lasagne example does. I would really appreciate it if someone could show me how to design something like that network in Keras. Its architecture looks like this:

[architecture diagram from the linked repository; image not available]

Thanks

Best Answer

I would suggest a U-Net (see figure 1 of the paper). In the first half of a U-Net, spatial resolution decreases as the number of channels grows (like the VGG you mentioned). In the second half, the opposite happens (the number of channels decreases and the resolution increases). "Skip" connections between corresponding layers allow the network to produce high-resolution output efficiently.
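As a sketch of the idea for the 120x160 depth-map case, here is a minimal two-level U-Net using today's tf.keras functional API (channels-last tensors rather than the channels-first Keras 1 style in the question; the depths and layer counts are illustrative, not the exact architecture from the paper):

```python
from tensorflow.keras import Model, layers


def build_unet(input_shape=(120, 160, 3)):
    inputs = layers.Input(shape=input_shape)

    # encoder: resolution halves, channels grow
    c1 = layers.Conv2D(64, 3, activation='relu', padding='same')(inputs)
    p1 = layers.MaxPooling2D()(c1)                                     # 60 x 80
    c2 = layers.Conv2D(128, 3, activation='relu', padding='same')(p1)
    p2 = layers.MaxPooling2D()(c2)                                     # 30 x 40

    # bottleneck
    b = layers.Conv2D(256, 3, activation='relu', padding='same')(p2)

    # decoder: resolution doubles, channels shrink, with skip connections
    u1 = layers.Conv2DTranspose(128, 2, strides=2, padding='same')(b)  # 60 x 80
    u1 = layers.concatenate([u1, c2])
    c3 = layers.Conv2D(128, 3, activation='relu', padding='same')(u1)
    u2 = layers.Conv2DTranspose(64, 2, strides=2, padding='same')(c3)  # 120 x 160
    u2 = layers.concatenate([u2, c1])
    c4 = layers.Conv2D(64, 3, activation='relu', padding='same')(u2)

    # one depth value per pixel
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(c4)
    return Model(inputs, outputs)


model = build_unet()
```

The `concatenate` calls are the skip connections: they give the decoder direct access to high-resolution encoder features, which is what makes the upsampling path effective without huge filter counts.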

You should be able to find a suitable Keras implementation (perhaps this one).

Regarding machine-learning - depth estimation with Keras, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39685349/
