
python - How to build a custom data generator for Keras/tf.Keras where the X images are augmented and the corresponding Y labels are also images

Reposted. Author: 行者123. Updated: 2023-12-04 15:18:37

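I am working on image binarization with UNet and have a dataset of 150 images and their binarized versions. My idea is to augment the images randomly so that they look different, so I wrote a function that inserts any one of 4-5 types of noise, skew, shear and so on into an image. I could easily use ImageDataGenerator(preprocessing_function=my_aug_function) to augment the images, but the problem is that my y target is also an image. Alternatively, I could use something like: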

    train_dataset = (
        train_dataset.map(
            encode_single_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE
        )
        .batch(batch_size)
        .prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
    )

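But it has two problems:
  • With a larger dataset it will blow up memory, because the data already has to be in memory.
  • This is the crucial part: I need to augment the images on the fly so that it looks as if I had a huge dataset.

Another solution could be to save the augmented images to a directory, grow it to 30-40K images, and then load them. That would be a silly thing to do.

Now the idea is that I can use Sequence as the parent class, but how do I keep augmenting and generating new images together with their corresponding Y binarized images?

I have an idea along the lines of the code below. Can someone help me with the augmentation and with generating the y images? I have my X_DIR and Y_DIR, where the binarized and original images have the same names but are stored in different directories.
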
    import numpy as np
    import tensorflow


    class DataGenerator(tensorflow.keras.utils.Sequence):
        def __init__(self, files_path, labels_path, batch_size=32, shuffle=True, random_state=42):
            'Initialization'
            self.files = files_path
            self.labels = labels_path
            self.batch_size = batch_size
            self.shuffle = shuffle
            self.random_state = random_state
            self.on_epoch_end()

        def on_epoch_end(self):
            'Updates indexes after each epoch'
            # Shuffle the data here

        def __len__(self):
            return int(np.floor(len(self.files) / self.batch_size))

        def __getitem__(self, index):
            # What do I do here?
            pass

        def __data_generation(self, files):
            # I think this is responsible for augmentation, but I have no idea
            # how I should implement it or how it works.
            pass


Best answer

Custom image data generator

Load the directory data into a data frame for CustomDataGenerator

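    import os

    import pandas as pd

    # name_to_idx is not defined in the original answer; it is assumed to be a
    # dict that maps each class sub-directory name to an integer label.


    def data_to_df(data_dir, subset=None, validation_split=None):
        filenames = []
        labels = []

        # Each sub-directory of data_dir holds the images of one class.
        for dataset in os.listdir(data_dir):
            img_list = os.listdir(os.path.join(data_dir, dataset))
            label = name_to_idx[dataset]

            for image in img_list:
                filenames.append(os.path.join(data_dir, dataset, image))
                labels.append(label)

        df = pd.DataFrame({"filenames": filenames, "labels": labels})

        if subset == "train":
            # Rows are grouped by class, so shuffle before splitting; otherwise
            # the validation set would only contain the first classes.
            df = df.sample(frac=1, random_state=0).reset_index(drop=True)
            split_indexes = int(len(df) * validation_split)
            train_df = df[split_indexes:]
            val_df = df[:split_indexes]
            return train_df, val_df

        return df

    train_df, val_df = data_to_df(train_dir, subset="train", validation_split=0.2)
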
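For the layout described in the question, where every image in X_DIR has a binarized counterpart with the same file name in Y_DIR, the same data-frame idea can hold pairs of file paths instead of class labels. The following is only a minimal sketch and not part of the original answer; the function name `paired_paths_to_df` and the column names `x_path` and `y_path` are hypothetical.

```python
import os

import pandas as pd


def paired_paths_to_df(x_dir, y_dir, validation_split=0.0):
    """Pair every file in x_dir with the same-named file in y_dir.

    Hypothetical helper: assumes matching file names in both directories.
    Returns (train_df, val_df); val_df is empty when validation_split is 0.
    """
    names = sorted(os.listdir(x_dir))
    # Keep only the inputs that actually have a target image.
    names = [n for n in names if os.path.isfile(os.path.join(y_dir, n))]

    df = pd.DataFrame({
        "x_path": [os.path.join(x_dir, n) for n in names],
        "y_path": [os.path.join(y_dir, n) for n in names],
    })

    # The first validation_split fraction of the pairs becomes the validation set.
    n_val = int(len(df) * validation_split)
    val_df = df.iloc[:n_val].reset_index(drop=True)
    train_df = df.iloc[n_val:].reset_index(drop=True)
    return train_df, val_df
```

Because the two columns are built from the same list of names, an input can never be matched with the wrong target, even if one directory contains extra files.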
Custom data generator

    import math

    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # preprocess_input is not defined in the original answer; import the one that
    # matches your model from the corresponding tf.keras.applications submodule.


    class CustomDataGenerator(tf.keras.utils.Sequence):

        ''' Custom DataGenerator to load images

        Arguments:
            data_frame = pandas data frame with "filenames" and "labels" columns
            batch_size = number of samples per batch
            img_shape = image shape in (h, w, d) format
            augmentation = apply data augmentation to make the model more robust to overfitting
            num_classes = number of classes
            shuffle = shuffle the data at the end of every epoch

        Output:
            img: tensor of images
            label: output labels for the images
        '''

        def __init__(self, data_frame, batch_size=10, img_shape=None, augmentation=True,
                     num_classes=None, shuffle=True):
            super().__init__()
            self.data_frame = data_frame.reset_index(drop=True)
            self.train_len = len(data_frame)
            self.batch_size = batch_size
            self.img_shape = img_shape
            self.augmentation = augmentation
            self.num_classes = num_classes
            self.shuffle = shuffle
            print(f"Found {self.data_frame.shape[0]} images belonging to {self.num_classes} classes")

        def __len__(self):
            ''' return total number of batches '''
            return math.ceil(self.train_len / self.batch_size)

        def on_epoch_end(self):
            ''' shuffle data after every epoch '''
            # The original answer shuffled inside __len__ as a workaround because
            # on_epoch_end was not being called for its author; this method is the
            # intended hook for per-epoch shuffling.
            if self.shuffle:
                self.data_frame = self.data_frame.sample(frac=1).reset_index(drop=True)

        def __data_augmentation(self, img):
            ''' apply some data augmentation '''
            # img is (h, w, d), so the axes are passed explicitly:
            # random_shift defaults to a channels-first layout.
            img = tf.keras.preprocessing.image.random_shift(
                img, 0.2, 0.3, row_axis=0, col_axis=1, channel_axis=2)
            img = tf.image.random_flip_left_right(img)
            img = tf.image.random_flip_up_down(img)
            return img

        def __get_image(self, file_id):
            """ open the image at path file_id, resize it and apply data augmentation """
            h, w = self.img_shape[:2]
            # np.resize only repeats or truncates the raw values, so PIL is used to
            # rescale the image. convert("RGB") assumes img_shape has d == 3.
            img = Image.open(file_id).convert("RGB").resize((w, h))
            img = np.asarray(img, dtype="float32")
            if self.augmentation:
                img = self.__data_augmentation(img)
            img = preprocess_input(img)

            return img

        def __get_label(self, label_id):
            """ uncomment the line below to convert the label into categorical format """
            #label_id = tf.keras.utils.to_categorical(label_id, self.num_classes)
            return label_id

        def __getitem__(self, idx):
            # .iloc makes the slice positional regardless of the data frame index
            rows = self.data_frame.iloc[idx * self.batch_size:(idx + 1) * self.batch_size]
            batch_x = rows["filenames"]
            batch_y = rows["labels"]
            # read your data here using the batch lists, batch_x and batch_y
            x = [self.__get_image(file_id) for file_id in batch_x]
            y = [self.__get_label(label_id) for label_id in batch_y]

            return tf.convert_to_tensor(x), tf.convert_to_tensor(y)
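
The generator above returns integer class labels. For the question's case, where the Y target is itself an image, the key point is that geometric transforms (flips, shifts, shear) have to be applied identically to X and Y, while intensity changes such as noise should touch only X. The following is a hedged sketch of that idea and not part of the original answer. The names `augment_pair`, `PairedGenerator` and `load_image` are hypothetical, images are assumed to be float arrays in [0, 1] of a fixed (h, w, c) shape, and the noise level 0.05 is arbitrary. The class is written with NumPy only so that it runs without TensorFlow; for use with `model.fit` it would be declared as a subclass of `tf.keras.utils.Sequence`.

```python
import math

import numpy as np


def augment_pair(x, y, rng):
    """Apply the same random flips to x and y, and Gaussian noise to x only."""
    # Geometric transforms must be identical for the image and its target.
    if rng.random() < 0.5:
        x, y = x[:, ::-1], y[:, ::-1]      # horizontal flip
    if rng.random() < 0.5:
        x, y = x[::-1, :], y[::-1, :]      # vertical flip
    # Intensity changes touch only the input; the target stays clean.
    x = np.clip(x + rng.normal(0.0, 0.05, size=x.shape), 0.0, 1.0)
    return np.ascontiguousarray(x), np.ascontiguousarray(y)


class PairedGenerator:  # for model.fit: class PairedGenerator(tf.keras.utils.Sequence)
    """Yields (X batch, Y batch) where Y is also an image.

    load_image is a hypothetical callable: path -> float array (h, w, c) in [0, 1].
    """

    def __init__(self, x_paths, y_paths, load_image, batch_size=8,
                 shuffle=True, augment=True, seed=42):
        super().__init__()
        if len(x_paths) != len(y_paths):
            raise ValueError("x_paths and y_paths must have the same length")
        self.x_paths = list(x_paths)
        self.y_paths = list(y_paths)
        self.load_image = load_image
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.augment = augment
        self.rng = np.random.default_rng(seed)
        self.indexes = np.arange(len(self.x_paths))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.x_paths) / self.batch_size)

    def on_epoch_end(self):
        # Reshuffle the sample order between epochs.
        if self.shuffle:
            self.rng.shuffle(self.indexes)

    def __getitem__(self, index):
        batch = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        xs, ys = [], []
        for i in batch:
            # Images are read from disk only when the batch is requested.
            x = self.load_image(self.x_paths[i])
            y = self.load_image(self.y_paths[i])
            if self.augment:
                x, y = augment_pair(x, y, self.rng)
            xs.append(x)
            ys.append(y)
        return np.stack(xs).astype("float32"), np.stack(ys).astype("float32")
```

Since files are loaded per batch and augmented with fresh random draws on every call, only one batch is in memory at a time and each epoch sees different variants of the same 150 images. With Pillow, `load_image` could for example open the file, convert it to grayscale, resize it to the network's input size and divide by 255; that loader is left out here because the image size and channel count depend on the model.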

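Regarding python - How to build a custom data generator for Keras/tf.Keras where the X images are augmented and the corresponding Y labels are also images, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63827339/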
