
python - Making your own MNIST dataset (in the same format as MNIST)

Reposted. Author: 太空狗. Updated: 2023-10-30 01:11:55

I am trying to create my own version of the MNIST data. I have converted my training and test data into the following files:

test-images-idx3-ubyte.gz
test-labels-idx1-ubyte.gz
train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz

(For anyone interested, I did this using JPG-PNG-to-MNIST-NN-Format, which seems to get me close to my goal.)

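However, this is not quite the same file type and format as the MNIST data (mnist.pkl.gz). I understand that pkl means the data has been pickled, but I don't really understand the pickling process: does the data have to be pickled in a particular order? Could someone provide the code I should use to pickle my data?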
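
On the ordering question: pickle itself imposes no order. It serializes whatever Python object it is given, so the "format" is simply the structure of that object. The sketch below assumes the layout commonly described for the tutorial copy of mnist.pkl.gz, a tuple of three (images, labels) pairs for training, validation and test, with images stored as float32 rows scaled to [0, 1]. That layout is an assumption worth checking by loading the real file. The helper names and the tiny random arrays are hypothetical stand-ins, not real data.

```python
import gzip
import os
import pickle
import tempfile

import numpy as np


def save_pickled_splits(path, train, valid, test):
    """Pickle three (images, labels) pairs as one tuple into a gzip file."""
    with gzip.open(path, 'wb') as f:
        # Protocol 2 is readable from both Python 2 and Python 3
        pickle.dump((train, valid, test), f, protocol=2)


def load_pickled_splits(path):
    """Load the tuple back; it comes out in the order it was written."""
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)


def make_split(n, dim=28 * 28, seed=0):
    """Hypothetical stand-in data: n random images and n random labels."""
    rng = np.random.RandomState(seed)
    images = rng.randint(0, 256, size=(n, dim)).astype(np.float32) / 255.0
    labels = rng.randint(0, 10, size=n).astype(np.int64)
    return images, labels


demo_path = os.path.join(tempfile.mkdtemp(), 'my_mnist.pkl.gz')
save_pickled_splits(demo_path,
                    make_split(6, seed=1),
                    make_split(2, seed=2),
                    make_split(2, seed=3))

train_set, valid_set, test_set = load_pickled_splits(demo_path)
print(train_set[0].shape, train_set[1].shape)
```

Whatever code later reads the file has to unpack the same structure, so the order only matters in the sense that writer and reader must agree on it.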

Best answer

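import gzip
import os

import numpy as np
import six

# Location of the original MNIST files. Unused below: nothing is downloaded.
parent = 'http://yann.lecun.com/exdb/mnist'
# Local gzip-compressed IDX files. Adjust the names to match your own files
# (the question uses test-images-idx3-ubyte.gz and test-labels-idx1-ubyte.gz).
train_images = 'train-images-idx3-ubyte.gz'
train_labels = 'train-labels-idx1-ubyte.gz'
test_images = 't10k-images-idx3-ubyte.gz'
test_labels = 't10k-labels-idx1-ubyte.gz'
# Sample counts and pixels per image for this particular dataset.
# Change these to match your own data.
num_train = 17010
num_test = 3010
dim = 32*32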


def load_mnist(images, labels, num):
    # One flattened row of pixels per image, one label per image
    data = np.zeros((num, dim), dtype=np.uint8)
    target = np.zeros(num, dtype=np.uint8)

    with gzip.open(images, 'rb') as f_images, \
            gzip.open(labels, 'rb') as f_labels:
        # Skip the IDX headers: 16 bytes for images, 8 bytes for labels
        f_images.read(16)
        f_labels.read(8)
        for i in six.moves.range(num):
            target[i] = ord(f_labels.read(1))
            for j in six.moves.range(dim):
                data[i, j] = ord(f_images.read(1))

    return data, target


def download_mnist_data():
    # Despite the name, this only converts the local IDX files to mnist.pkl
    print('Converting training data...')
    data_train, target_train = load_mnist(train_images, train_labels,
                                          num_train)
    print('Done')
    print('Converting test data...')
    data_test, target_test = load_mnist(test_images, test_labels, num_test)
    # Store training rows first, followed by test rows
    mnist = {}
    mnist['data'] = np.append(data_train, data_test, axis=0)
    mnist['target'] = np.append(target_train, target_test, axis=0)

    print('Done')
    print('Save output...')
    with open('mnist.pkl', 'wb') as output:
        # -1 selects the highest pickle protocol available
        six.moves.cPickle.dump(mnist, output, -1)
    print('Done')
    print('Convert completed')


def load_mnist_data():
    # Build mnist.pkl on first use, then load it from disk
    if not os.path.exists('mnist.pkl'):
        download_mnist_data()
    with open('mnist.pkl', 'rb') as mnist_pickle:
        mnist = six.moves.cPickle.load(mnist_pickle)
    return mnist


if __name__ == '__main__':
    download_mnist_data()
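
The code above hard-codes num_train, num_test and dim, and skips the file headers by a fixed byte count. An IDX file already records its own shape in the header, so a possible alternative is to read the dimensions from the file and load all pixels in one call. This is a minimal sketch, not part of the original answer: read_idx and write_idx are hypothetical names, only unsigned-byte data is handled, and write_idx exists here just to build a small demo file.

```python
import gzip
import os
import struct
import tempfile

import numpy as np


def read_idx(path):
    """Read a gzip-compressed IDX file of unsigned bytes into a NumPy array."""
    with gzip.open(path, 'rb') as f:
        # Magic number: two zero bytes, a data-type code, the dimension count
        zero, dtype_code, ndim = struct.unpack('>HBB', f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError('not an unsigned-byte IDX file: %s' % path)
        # One big-endian 32-bit size per dimension, e.g. (count, rows, cols)
        shape = struct.unpack('>' + 'I' * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)


def write_idx(path, array):
    """Hypothetical helper: write a uint8 array as a gzip-compressed IDX file."""
    array = np.ascontiguousarray(array, dtype=np.uint8)
    header = struct.pack('>HBB', 0, 0x08, array.ndim)
    header += struct.pack('>' + 'I' * array.ndim, *array.shape)
    with gzip.open(path, 'wb') as f:
        f.write(header + array.tobytes())


# Build a tiny demo dataset: three 4x4 "images" and their labels
tmp_dir = tempfile.mkdtemp()
img_path = os.path.join(tmp_dir, 'demo-images-idx3-ubyte.gz')
lbl_path = os.path.join(tmp_dir, 'demo-labels-idx1-ubyte.gz')
images = np.arange(3 * 4 * 4, dtype=np.uint8).reshape(3, 4, 4)
labels = np.array([7, 0, 3], dtype=np.uint8)
write_idx(img_path, images)
write_idx(lbl_path, labels)

loaded_images = read_idx(img_path)
loaded_labels = read_idx(lbl_path)
# Flatten each image into one row, matching the (num, dim) layout above
flat = loaded_images.reshape(len(loaded_images), -1)
print(flat.shape, loaded_labels.tolist())
```

Reading the shape from the header avoids a mismatch between the constants and the actual file, and a single frombuffer call is much faster than reading one byte at a time.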

Regarding python - Making your own MNIST dataset (in the same format as MNIST), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46555025/
