
tensorflow - Not understanding the data flow in UNet-like architectures, and having problems with the output of Conv2DTranspose layers

Reposted. Author: 行者123. Updated: 2023-12-01 21:47:57

I have a question or two about the input dimensions of a modified U-Net architecture. To save you time and to help you understand/reproduce my results, I'll post the code together with the output dimensions. The modified U-Net architecture is the MultiResUNet architecture from https://github.com/nibtehaz/MultiResUNet/blob/master/MultiResUNet.py, based on the paper https://arxiv.org/abs/1902.04049. Please don't be put off by the length of this code: you can simply copy-paste it, and reproducing my results takes no more than 10 seconds. You also don't need a dataset. Tested with TF v1.9 and Keras v2.20.

import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate, BatchNormalization, Activation, add
from tensorflow.keras.models import Model
from tensorflow.keras.activations import relu

# 2D convolutional layer followed by batch normalization
#
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
#
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(1, 1)})
#     activation {str} -- activation function (default: {'relu'})
#     name {str} -- name of the layer (default: {None})
#
# Returns:
#     [keras layer] -- [output layer]


def conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(1, 1), activation='relu', name=None):

    x = Conv2D(filters, (num_row, num_col), strides=strides, padding=padding, use_bias=False)(x)
    x = BatchNormalization(axis=3, scale=False)(x)
    if activation is None:
        return x
    x = Activation(activation, name=name)(x)

    return x

# our 2D transposed convolution with batch normalization
#
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
#
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(2, 2)})
#     name {str} -- name of the layer (default: {None})
#
# Returns:
#     [keras layer] -- [output layer]

def trans_conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(2, 2), name=None):

    x = Conv2DTranspose(filters, (num_row, num_col), strides=strides, padding=padding)(x)
    x = BatchNormalization(axis=3, scale=False)(x)

    return x

# our MultiRes block
#
# Arguments:
#     U {int} -- number of filters in a corresponding UNet stage
#     inp {keras layer} -- input layer
#
# Returns:
#     [keras layer] -- [output layer]

def MultiResBlock(U, inp, alpha=1.67):

    W = alpha * U

    shortcut = inp
    shortcut = conv2d_bn(shortcut, int(W*0.167) + int(W*0.333) + int(W*0.5),
                         1, 1, activation=None, padding='same')

    conv3x3 = conv2d_bn(inp, int(W*0.167), 3, 3, activation='relu', padding='same')
    conv5x5 = conv2d_bn(conv3x3, int(W*0.333), 3, 3, activation='relu', padding='same')
    conv7x7 = conv2d_bn(conv5x5, int(W*0.5), 3, 3, activation='relu', padding='same')

    out = concatenate([conv3x3, conv5x5, conv7x7], axis=3)
    out = BatchNormalization(axis=3)(out)

    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)

    return out

# our ResPath
#
# Arguments:
#     filters {int} -- number of filters
#     length {int} -- length of ResPath
#     inp {keras layer} -- input layer
#
# Returns:
#     [keras layer] -- [output layer]



def ResPath(filters, length, inp):

    shortcut = inp
    shortcut = conv2d_bn(shortcut, filters, 1, 1, activation=None, padding='same')

    out = conv2d_bn(inp, filters, 3, 3, activation='relu', padding='same')

    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)

    for i in range(length - 1):

        shortcut = out
        shortcut = conv2d_bn(shortcut, filters, 1, 1, activation=None, padding='same')

        out = conv2d_bn(out, filters, 3, 3, activation='relu', padding='same')

        out = add([shortcut, out])
        out = Activation('relu')(out)
        out = BatchNormalization(axis=3)(out)

    return out



# MultiResUNet
#
# Arguments:
#     height {int} -- height of image
#     width {int} -- width of image
#     n_channels {int} -- number of channels in image
#
# Returns:
#     [keras model] -- MultiResUNet model




def MultiResUnet(height, width, n_channels):

    inputs = Input((height, width, n_channels))

    # downsampling (encoder) path begins here
    mresblock1 = MultiResBlock(32, inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2))(mresblock1)
    mresblock1 = ResPath(32, 4, mresblock1)

    mresblock2 = MultiResBlock(32*2, pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2))(mresblock2)
    mresblock2 = ResPath(32*2, 3, mresblock2)

    mresblock3 = MultiResBlock(32*4, pool2)
    pool3 = MaxPooling2D(pool_size=(2, 2))(mresblock3)
    mresblock3 = ResPath(32*4, 2, mresblock3)

    mresblock4 = MultiResBlock(32*8, pool3)

    # upsampling (decoder) path
    up5 = concatenate([Conv2DTranspose(32*4, (2, 2), strides=(2, 2), padding='same')(mresblock4),
                       mresblock3], axis=3)
    mresblock5 = MultiResBlock(32*8, up5)

    up6 = concatenate([Conv2DTranspose(32*4, (2, 2), strides=(2, 2), padding='same')(mresblock5),
                       mresblock2], axis=3)
    mresblock6 = MultiResBlock(32*4, up6)

    up7 = concatenate([Conv2DTranspose(32*2, (2, 2), strides=(2, 2), padding='same')(mresblock6),
                       mresblock1], axis=3)
    mresblock7 = MultiResBlock(32*2, up7)

    conv8 = conv2d_bn(mresblock7, 1, 1, 1, activation='sigmoid')

    model = Model(inputs=[inputs], outputs=[conv8])

    return model

Now back to my problem of mismatched input/output dimensions in the UNet architecture.

If I choose an input height/width of (128, 128), (256, 256), or (512, 512) and run:

model = MultiResUnet(128, 128, 3)
display(model.summary())

TensorFlow builds the whole architecture without complaint. But if I instead do

model = MultiResUnet(36, 36, 3)
display(model.summary())

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 model = MultiResUnet(36, 36, 3)
      2 display(model.summary())

<ipython-input> in MultiResUnet(height, width, n_channels)
     25
     26     up5 = concatenate([Conv2DTranspose(
---> 27         32*4, (2, 2), strides=(2, 2), padding='same')(mresblock4), mresblock3], axis=3)
     28     mresblock5 = MultiResBlock(32*8, up5)
     29

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py in concatenate(inputs, axis, **kwargs)
    682       A tensor, the concatenation of the inputs alongside axis `axis`.
    683       """
--> 684   return Concatenate(axis=axis, **kwargs)(inputs)
    685
    686

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    694     if all(hasattr(x, 'get_shape') for x in input_list):
    695       input_shapes = nest.map_structure(lambda x: x.get_shape(), inputs)
--> 696       self.build(input_shapes)
    697
    698     # Check input assumptions set after layer building, e.g. input shape.

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/utils/tf_utils.py in wrapper(instance, input_shape)
    146   else:
    147     input_shape = tuple(tensor_shape.TensorShape(input_shape).as_list())
--> 148   output_shape = fn(instance, input_shape)
    149   if output_shape is not None:
    150     if isinstance(output_shape, list):

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py in build(self, input_shape)
    388                        'inputs with matching shapes '
    389                        'except for the concat axis. '
--> 390                        'Got inputs shapes: %s' % (input_shape))
    391
    392   def _merge_function(self, inputs):

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 8, 8, 128), (None, 9, 9, 128)]

Why does the Conv2DTranspose give me an output of

(None, 8, 8, 128)

instead of

(None, 9, 9, 128)?

And why doesn't the Concatenate layer complain when I choose input sizes like (128, 128) or (256, 256), i.e. multiples of 32? So, to generalize the question: how can I make this UNet architecture work for any input size? How do I handle the Conv2DTranspose layer producing an output that is one pixel short in the width/height dimension (which happens when the input size is not a multiple of 32, or is asymmetric), and why doesn't this happen for input sizes that are multiples of 32? And what if I have variable input sizes?

Any help would be greatly appreciated.

Cheers, H

Best Answer

U-Net-family models (such as the MultiResUNet model above) follow an encoder-decoder architecture. The encoder is the downsampling path, which extracts features, while the decoder is the upsampling path. Feature maps from the encoder are passed to the decoder through skip connections. These feature maps are concatenated along the last axis, the 'channel' axis (the features have dimensions [batch_size, height, width, channels]). Now, for tensors to be concatenated along any axis (here the channel axis), their dimensions along all the other axes must match.
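To see why the channel-axis concatenation fails when the spatial axes differ, here is a minimal NumPy sketch (NumPy's concatenate enforces the same rule as the Keras Concatenate layer):

```python
import numpy as np

# Two fake feature maps with shape [batch, height, width, channels]
a = np.zeros((1, 8, 8, 128))   # upsampled decoder features
b = np.zeros((1, 9, 9, 128))   # encoder features from the skip connection

# Concatenating along the channel axis requires all other axes to match.
try:
    np.concatenate([a, b], axis=3)
except ValueError as e:
    print("concatenation failed:", e)

# With matching spatial dimensions, only the channel axis grows.
c = np.concatenate([a, np.zeros((1, 8, 8, 128))], axis=3)
print(c.shape)  # (1, 8, 8, 256)
```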

In the model architecture above, 3 downsampling/max-pooling operations (via MaxPooling2D) are performed along the encoder path. Along the decoder path, 3 upsampling/transposed-convolution operations are performed, intended to restore the image to its full dimensions. For the concatenations (via skip connections) to work, however, the height, width, and batch_size of the downsampled and upsampled feature maps must match at every 'level' of the model. I'll illustrate this with the examples you mentioned in your question:


First case: input dimensions (128, 128, 3): 128 -> 64 -> 32 -> 16 -> 32 -> 64 -> 128

Second case: input dimensions (36, 36, 3): 36 -> 18 -> 9 -> 4 -> 8 -> 16 -> 32

In the second case, once the height and width of the feature map reach 9 along the encoder path, the next downsampling causes a change (loss) in dimension that cannot be recovered by upsampling in the decoder: MaxPooling2D floors 9/2 to 4, and the stride-2 transposed convolution then doubles 4 back to only 8, not 9. The model therefore fails at the Concatenate layer, since tensors of dimensions (None, 8, 8, 128) and (None, 9, 9, 128) cannot be concatenated.
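You can trace these dimensions without building the model at all. A plain-Python sketch of the encoder/decoder arithmetic (pool_size 2 with 'valid' pooling, stride-2 transposed convolution with 'same' padding, as in the model above):

```python
def trace_unet_dims(size, n_levels=3):
    """Follow one spatial dimension through n_levels of pooling and upsampling."""
    dims = [size]
    for _ in range(n_levels):   # MaxPooling2D(pool_size=2): floor division by 2
        size //= 2
        dims.append(size)
    for _ in range(n_levels):   # Conv2DTranspose(strides=2, padding='same'): doubles
        size *= 2
        dims.append(size)
    return dims

print(trace_unet_dims(128))  # [128, 64, 32, 16, 32, 64, 128] -- round trip works
print(trace_unet_dims(36))   # [36, 18, 9, 4, 8, 16, 32]      -- the 9 is lost
```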

In general, for a simple encoder-decoder model (with skip connections) that has 'n' downsampling (MaxPooling2D) layers, the input dimensions must be a multiple of 2^n for the encoder features to be concatenable at the decoder. Here n = 3, so the input must be a multiple of 8 to avoid these dimension-mismatch errors.
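One common workaround (an illustrative sketch, not part of the original answer) is to zero-pad the input up to the next multiple of 2^n before feeding it to the network, then crop the prediction back to the original size afterwards:

```python
import numpy as np

def pad_to_multiple(img, multiple=8):
    """Zero-pad the height/width of an HWC image up to the next multiple."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')
    return padded, (h, w)  # keep the original size so we can crop back

img = np.random.rand(36, 36, 3)
padded, orig = pad_to_multiple(img)
print(padded.shape)  # (40, 40, 3)
# Hypothetical usage: run the model on the padded image, then crop:
# pred = model.predict(padded[None])[0][:orig[0], :orig[1]]
```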

Hope this helps! :)

Regarding "tensorflow - Not understanding the data flow in UNet-like architectures, and having problems with the output of Conv2DTranspose layers", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60063797/
