
python - TFRecords are 100 times larger than the original size

Reposted. Author: 行者123. Updated: 2023-12-01 00:20:29

I am using dataset_tool.py from the StyleGAN GitHub repository to convert the training images in a local folder into TFRecords. Here is the code:

def create_from_images(tfrecord_dir, image_dir, shuffle):
    print('Loading images from "%s"' % image_dir)
    image_filenames = sorted(glob.glob(os.path.join(image_dir, '*')))
    if len(image_filenames) == 0:
        error('No input images found')

    img = np.asarray(PIL.Image.open(image_filenames[0]))
    resolution = img.shape[0]
    channels = img.shape[2] if img.ndim == 3 else 1
    if img.shape[1] != resolution:
        error('Input images must have the same width and height')
    if resolution != 2 ** int(np.floor(np.log2(resolution))):
        error('Input image resolution must be a power-of-two')
    if channels not in [1, 3]:
        error('Input images must be stored as RGB or grayscale')

    with TFRecordExporter(tfrecord_dir, len(image_filenames)) as tfr:
        order = tfr.choose_shuffled_order() if shuffle else np.arange(len(image_filenames))
        for idx in range(order.size):
            img = np.asarray(PIL.Image.open(image_filenames[order[idx]]))
            if channels == 1:
                img = img[np.newaxis, :, :] # HW => CHW
            else:
                img = img.transpose([2, 0, 1]) # HWC => CHW
            tfr.add_image(img)

def add_image(self, img):
    if self.print_progress and self.cur_images % self.progress_interval == 0:
        print('%d / %d\r' % (self.cur_images, self.expected_images), end='', flush=True)
    if self.shape is None:
        self.shape = img.shape
        self.resolution_log2 = int(np.log2(self.shape[1]))
        assert self.shape[0] in [1, 3]
        assert self.shape[1] == self.shape[2]
        assert self.shape[1] == 2**self.resolution_log2
        tfr_opt = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.NONE)
        for lod in range(self.resolution_log2 - 1):
            tfr_file = self.tfr_prefix + '-r%02d.tfrecords' % (self.resolution_log2 - lod)
            self.tfr_writers.append(tf.python_io.TFRecordWriter(tfr_file, tfr_opt))
    assert img.shape == self.shape
    for lod, tfr_writer in enumerate(self.tfr_writers):
        if lod:
            img = img.astype(np.float32)
            img = (img[:, 0::2, 0::2] + img[:, 0::2, 1::2] + img[:, 1::2, 0::2] + img[:, 1::2, 1::2]) * 0.25
        quant = np.rint(img).clip(0, 255).astype(np.uint8)
        ex = tf.train.Example(features=tf.train.Features(feature={
            'shape': tf.train.Feature(int64_list=tf.train.Int64List(value=quant.shape)),
            'data': tf.train.Feature(bytes_list=tf.train.BytesList(value=[quant.tostring()]))}))
        tfr_writer.write(ex.SerializeToString())
    self.cur_images += 1

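It creates TFRecords files at multiple resolutions, up to the original resolution. The resulting TFRecords are 100 times larger than the folder containing the original files. My original files are black-and-white PNGs of about 2 KB each, and the folder is 120 MB, while the TFRecords I get are 12 GB. I know TFRecords are usually larger than the original files, but not 100 times larger! What could be wrong here?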
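
A rough back-of-the-envelope check of the figures above, as a sketch only. The image resolution is not stated in the question, so the 256x256 grayscale case below is a hypothetical value, and `raw_pyramid_bytes` is an illustrative helper that is not part of dataset_tool.py. It counts only the raw uint8 pixel bytes that add_image writes across all resolution levels and ignores per-record serialization overhead.

```python
def raw_pyramid_bytes(resolution, channels):
    """Raw uint8 pixel bytes written for one image across all levels."""
    resolution_log2 = resolution.bit_length() - 1  # exact for powers of two
    total = 0
    # Same loop bounds as add_image: full resolution down to 4x4
    for lod in range(resolution_log2 - 1):
        side = resolution >> lod
        total += channels * side * side
    return total

# Figures from the question, taken in decimal units
num_images = 120_000_000 // 2_000                   # 60000 images
observed_per_image = 12_000_000_000 // num_images   # 200000 bytes per image

# Hypothetical case: 256x256 grayscale (resolution is an assumption)
estimate = raw_pyramid_bytes(256, 1)                # 87376 bytes
print(num_images, observed_per_image, estimate)
```

Even under this modest assumption, the raw pixel data alone is over 40 times the size of a 2 KB PNG, so a gap of this order is plausible without anything being broken.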

Best answer

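The problem is that you are saving uncompressed images in the record files (add_image stores the raw bytes from quant.tostring()), which takes up far more space than the compressed image files. To avoid this you could write the image files directly into the records. However, since you apply some image processing first, you have to do that processing and then save the resulting image in a compressed format again. You can use a function like the following to convert an image array to its PNG-compressed form: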

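import io
import numpy as np
from PIL import Image

def img2png(image):
    # Assumes image was passed in CHW format
    image = np.moveaxis(image, 0, 2)  # CHW => HWC
    if image.shape[2] == 1:
        # PIL expects a 2-D (HW) array for grayscale images
        image = image[:, :, 0]
    img = Image.fromarray(image)
    with io.BytesIO() as img_bytes:
        img.save(img_bytes, 'PNG')
        return img_bytes.getvalue()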
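
To get a feel for the difference, the sketch below compares the raw byte count of an array with its PNG-encoded size. The test image is a synthetic gradient chosen for illustration, so the ratio it shows says nothing about your actual data. The `img2png` here is a variant of the function above that also squeezes a single grayscale channel before handing the array to PIL.

```python
import io
import numpy as np
from PIL import Image

def img2png(image):
    # Assumes a uint8 image in CHW format
    image = np.moveaxis(image, 0, 2)  # CHW => HWC
    if image.shape[2] == 1:
        image = image[:, :, 0]        # grayscale: PIL wants a 2-D array
    img = Image.fromarray(image)
    with io.BytesIO() as img_bytes:
        img.save(img_bytes, 'PNG')
        return img_bytes.getvalue()

# Synthetic 256x256 grayscale gradient in CHW layout (illustrative only)
row = np.arange(256, dtype=np.uint8)
image = np.tile(row, (256, 1))[np.newaxis, :, :]

raw_bytes = image.tobytes()   # what the original add_image would store
png_bytes = img2png(image)    # what would be stored after compression
print(len(raw_bytes), len(png_bytes))
```

How much is saved depends heavily on image content. Flat or repetitive images compress well, while noisy ones compress poorly.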

In your example, you could save the quant image like this:

ex = tf.train.Example(features=tf.train.Features(feature={
    'shape': tf.train.Feature(int64_list=tf.train.Int64List(value=quant.shape)),
    'data': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img2png(quant)]))}))

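Note that because you are now saving compressed images, you will need to use tf.io.decode_image when parsing the records later. This is the overhead you have to "pay" for the reduced size on disk.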
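
The decode step can be sketched outside TensorFlow as well. The example below uses Pillow rather than tf.io.decode_image, purely to show the round trip and the layout conversion back to CHW. `png2img` is a hypothetical helper name, and inside a tf.data pipeline the TensorFlow decoder would take its place.

```python
import io
import numpy as np
from PIL import Image

def png2img(png_bytes):
    """Decode PNG bytes back into a uint8 array in CHW format."""
    with io.BytesIO(png_bytes) as buf:
        arr = np.asarray(Image.open(buf))  # HW (grayscale) or HWC (RGB)
    if arr.ndim == 2:
        return arr[np.newaxis, :, :]       # HW => CHW
    return np.moveaxis(arr, 2, 0)          # HWC => CHW

# Round trip on a small RGB array in CHW layout
chw = np.arange(3 * 4 * 4, dtype=np.uint8).reshape(3, 4, 4)
hwc = np.ascontiguousarray(np.moveaxis(chw, 0, 2))
with io.BytesIO() as buf:
    Image.fromarray(hwc).save(buf, 'PNG')
    encoded = buf.getvalue()

decoded = png2img(encoded)
print(decoded.shape)  # (3, 4, 4)
```

Because PNG is lossless, the decoded array matches the original exactly. Whatever code reads the records has to perform this decode instead of reinterpreting the 'data' field as raw pixel bytes.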

Regarding "python - TFRecords are 100 times larger than the original size", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58998822/
