Storing Data in TFRecord Format and Reading It with Queues: An Example

Reposted by: qq735679552. Last updated: 2022-09-29 22:32:09

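The official TensorFlow website describes three ways to read data.

1. Preloaded data: define constants or variables in the TensorFlow graph that hold all the data, so the data is embedded directly in the graph. This consumes a lot of memory when the training set is large.

For example: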

 
    import tensorflow as tf

    # The data is stored in the graph itself as constants.
    x1 = tf.constant([0, 1])
    x2 = tf.constant([1, 0])
    y = tf.add(x1, x2)

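2. Feeding: use the feed_dict argument of sess.run() to feed data produced in Python to the backend. The earlier MNIST examples loaded data this way. It also consumes memory, and the data type conversion takes time.

3. Reading from files: a queue manager reads the data directly from files. This takes two steps.

First, write the sample data to a TFRecords binary file.

Then read it back through a queue.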
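For comparison with the file-based approach, method 2 (feeding) can be pictured as a plain Python loop that produces one batch per training step. The sketch below is only an illustration: `iterate_minibatches` and the toy data are hypothetical names, not part of the original code, and the TensorFlow call is shown only in a comment.

```python
def iterate_minibatches(samples, labels, batch_size):
    """Yield consecutive (samples, labels) slices of at most batch_size items."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    for start in range(0, len(samples), batch_size):
        stop = start + batch_size
        yield samples[start:stop], labels[start:stop]


# Toy data standing in for images and labels (made up for this sketch).
toy_samples = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]
toy_labels = [0, 1, 0, 1, 1]

batches = list(iterate_minibatches(toy_samples, toy_labels, batch_size=2))
for xs, ys in batches:
    # In TensorFlow 1.x this is where the batch would be handed to the graph,
    # e.g. sess.run(train_op, feed_dict={x: xs, y: ys}) with placeholders x, y.
    print(len(xs), ys)
```

Every batch passes through Python on its way to the backend, which is where the conversion cost mentioned above comes from.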

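TFRecord is a binary file format that TensorFlow provides for storing data in a uniform way. It makes better use of memory, is easier to copy and move, and needs no separate label file. The code below converts MNIST to the TFRecord format; other data sets can be converted in a similar way.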

 
    # Build an int64 feature.
    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    # Build a bytes (string) feature.
    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def convert_to(data_set, name):
        '''
        Fill a tf.train.Example protocol buffer with the data, serialize the
        protocol buffer to a string, and write it to a TFRecords file with
        tf.python_io.TFRecordWriter.
        '''
        images = data_set.images
        labels = data_set.labels
        num_examples = data_set.num_examples
        if images.shape[0] != num_examples:
            raise ValueError('Images size %d does not match label size %d.'
                             % (images.shape[0], num_examples))
        rows = images.shape[1]   # 28
        cols = images.shape[2]   # 28
        depth = images.shape[3]  # 1: grayscale image

        filename = os.path.join(FLAGS.directory, name + '.tfrecords')
        # Using the line below instead would store all three data sets in one
        # TFRecord file. For large data sets it is better to write several files.
        # filename = "C:/Users/dbsdz/Desktop/TF练习/TFRecord"
        print('Writing', filename)
        writer = tf.python_io.TFRecordWriter(filename)
        for index in range(num_examples):
            # Serialize the image matrix to a byte string
            # (tobytes() replaces the deprecated tostring()).
            image_raw = images[index].tobytes()
            # Fill the protocol buffer: height, width, depth and label are
            # encoded as int64, image_raw as bytes.
            example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64_feature(rows),
                'width': _int64_feature(cols),
                'depth': _int64_feature(depth),
                'label': _int64_feature(int(labels[index])),
                'image_raw': _bytes_feature(image_raw)}))
            writer.write(example.SerializeToString())  # serialize to a string
        writer.close()
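The image is stored as raw bytes, so the bytes alone do not say what shape the image had. The following standard-library sketch uses a made-up 2x3 image to show the round trip and why a shape has to be supplied when decoding. It mimics the idea behind the byte serialization and `tf.decode_raw`; it does not call either.

```python
# A tiny 2x3 single-channel "image" with uint8 pixel values (made-up data).
rows, cols = 2, 3
pixels = [[0, 64, 128],
          [191, 254, 255]]

# Flatten row by row and pack into bytes, one byte per pixel. This mirrors
# what serializing a C-ordered uint8 array to bytes produces.
flat = [value for row in pixels for value in row]
image_raw = bytes(flat)

# Decoding yields a flat sequence only; the shape is not stored in the bytes.
decoded = list(image_raw)

# The shape has to come from elsewhere, e.g. stored height and width fields.
restored = [decoded[r * cols:(r + 1) * cols] for r in range(rows)]

print(len(image_raw))
print(restored == pixels)
```

This is why the record also carries `height`, `width` and `depth`, even though a fixed-size data set such as MNIST could hard-code them.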

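The convert_to function above, called once per split, stores the MNIST data in three TFRecord files: training, validation, and test.

[Figure: the resulting TFRecord files]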
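For larger data sets, the code comment above recommends writing several files rather than one. A minimal sketch of how examples might be spread across shards is shown below. The helper names and the `name-00000-of-00004` naming pattern are assumptions for illustration, not something the original code does.

```python
def shard_filename(name, shard, num_shards):
    """Build a shard file name such as 'train-00001-of-00004.tfrecords'."""
    return "%s-%05d-of-%05d.tfrecords" % (name, shard, num_shards)


def assign_shards(num_examples, num_shards):
    """Map each example index to a shard in round-robin order."""
    shards = [[] for _ in range(num_shards)]
    for index in range(num_examples):
        shards[index % num_shards].append(index)
    return shards


filenames = [shard_filename("train", i, 4) for i in range(4)]
shards = assign_shards(num_examples=10, num_shards=4)

for filename, indices in zip(filenames, shards):
    # One writer per shard would write the examples with these indices.
    print(filename, indices)
```

The resulting list of file names is what would be passed to the file name queue when reading.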

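Reading TFRecord files through a queue takes three steps.

1. Create tensors that read a single example from the binary file.

2. Create tensors that randomly read a mini-batch from the binary file.

3. Pass each batch of tensors to the network as its input nodes.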
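The three steps can be pictured in plain Python before looking at the TensorFlow version. The sketch below is a simulation under stated assumptions: `read_examples`, `shuffle_batches` and the toy records are hypothetical, and the buffer logic only imitates the idea of keeping a minimum number of examples queued so that random draws mix well. It is not how TensorFlow implements its queues.

```python
import random


def read_examples(records):
    """Step 1: yield one (image, label) example at a time."""
    for record in records:
        yield record


def shuffle_batches(examples, batch_size, min_after_dequeue, seed=0):
    """Step 2: draw random mini-batches from a buffer of pending examples."""
    rng = random.Random(seed)
    buffer = []
    batch = []
    for example in examples:
        buffer.append(example)
        # Only dequeue while enough examples remain to mix well.
        while len(buffer) > min_after_dequeue:
            batch.append(buffer.pop(rng.randrange(len(buffer))))
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Input exhausted: drain what is left, still in random order.
    while buffer:
        batch.append(buffer.pop(rng.randrange(len(buffer))))
        if len(batch) == batch_size:
            yield batch
            batch = []
    # A final partial batch is dropped here for simplicity.


records = [("image%d" % i, i % 10) for i in range(20)]
batches = list(shuffle_batches(read_examples(records), batch_size=5,
                               min_after_dequeue=8))

# Step 3: each batch would be passed to the network as its input.
for batch in batches:
    print([label for _, label in batch])
```

A larger `min_after_dequeue` gives better mixing at the cost of more memory and a longer start-up while the buffer fills.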

The TensorFlow code is as follows.

 
    def read_and_decode(filename_queue):  # takes a queue of file names
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        # Parse a single example. To parse several at once, use parse_example.
        features = tf.parse_single_example(
            serialized_example,
            # The keys of the features must be named explicitly.
            features={
                # TensorFlow offers two ways to parse a feature.
                # tf.FixedLenFeature yields a Tensor. tf.VarLenFeature yields
                # a SparseTensor and is meant for sparse data.
                # The formats given here must match the formats used when
                # the data was written.
                'image_raw': tf.FixedLenFeature([], tf.string),  # image: string
                'label': tf.FixedLenFeature([], tf.int64),       # label: int64
            })
        # A BytesList has to be decoded: the scalar string Tensor becomes a
        # one-dimensional uint8 Tensor.
        image = tf.decode_raw(features['image_raw'], tf.uint8)
        image.set_shape([IMAGE_PIXELS])
        # Here: Tensor("input/DecodeRaw:0", shape=(784,), dtype=uint8)
        # After scaling: Tensor("input/sub:0", shape=(784,), dtype=float32)
        image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
        # Cast the label from int64 to int32.
        # Result: Tensor("input/Cast_1:0", shape=(), dtype=int32)
        label = tf.cast(features['label'], tf.int32)
        return image, label

    def inputs(train, batch_size, num_epochs):
        '''
        Arguments:
          train: selects the training data (True) or the validation data (False).
          batch_size: number of examples in each training batch.
          num_epochs: number of passes over the data; 0 or None means train forever.

        Returns: a tuple (images, labels).
          * images: float, shape [batch_size, mnist.IMAGE_PIXELS], range [-0.5, 0.5].
          * labels: int32, shape [batch_size], range [0, mnist.NUM_CLASSES).

        Note that the tf.train.QueueRunner threads must be started with
        tf.train.start_queue_runners().
        '''
        if not num_epochs:
            num_epochs = None
        # File path, i.e. ./MNIST_data/train.tfrecords or
        # ./MNIST_data/validation.tfrecords
        filename = os.path.join(FLAGS.train_dir,
                                TRAIN_FILE if train else VALIDATION_FILE)
        with tf.name_scope('input'):
            # tf.train.string_input_producer returns a FIFOQueue of file names
            # and registers a QueueRunner for it. If there are many samples,
            # split them over several files and pass the list of file names.
            filename_queue = tf.train.string_input_producer(
                [filename], num_epochs=num_epochs)
            image, label = read_and_decode(filename_queue)
            # Shuffle the examples and group them into batches of batch_size.
            # tf.train.shuffle_batch creates a RandomShuffleQueue that is
            # filled by two threads.
            images, sparse_labels = tf.train.shuffle_batch(
                [image, label], batch_size=batch_size, num_threads=2,
                capacity=1000 + 3 * batch_size,
                # Keep part of the queue filled so that there is always
                # enough data to shuffle.
                min_after_dequeue=1000)
            return images, sparse_labels
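Two pieces of arithmetic in the reading code are easy to check by hand: the pixel scaling `value * (1. / 255) - 0.5`, and the queue capacity `1000 + 3 * batch_size`. A small sketch, with hypothetical helper names:

```python
def normalize_pixel(value):
    """Map a uint8 pixel value in [0, 255] to a float in [-0.5, 0.5]."""
    return value * (1.0 / 255) - 0.5


def queue_capacity(batch_size, min_after_dequeue=1000):
    """Capacity as computed in the listing: min_after_dequeue + 3 * batch_size."""
    return min_after_dequeue + 3 * batch_size


print(normalize_pixel(0))    # darkest pixel maps to the lower bound
print(normalize_pixel(255))  # brightest pixel maps to (about) the upper bound
print(queue_capacity(100))   # capacity for the default batch size of 100
```

With the default batch size of 100 the queue holds up to 1300 examples, of which at least 1000 stay queued after each dequeue so that later draws still have enough examples to shuffle.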

Finally, build a three-layer neural network with two fully connected hidden layers and a softmax output layer. The complete code is as follows.

 
    # -*- coding: utf-8 -*-
    """
    Created on Sun Apr 8 11:06:16 2018
    @author: dbsdz
    https://blog.csdn.net/xy2953396112/article/details/54929073
    """
    import tensorflow as tf
    import os
    import time
    import math
    from tensorflow.examples.tutorials.mnist import input_data

    # Not used below: main() reloads the data as uint8.
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    # Basic model parameters as external flags.
    flags = tf.app.flags
    flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
    flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
    flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
    flags.DEFINE_integer('batch_size', 100, 'Batch size. '
                         'Must divide evenly into the dataset sizes.')
    # Must name the same directory that convert_to() writes to
    # (FLAGS.directory), with matching case on case-sensitive file systems.
    flags.DEFINE_string('train_dir', './MNIST_data',
                        'Directory to put the training data.')
    flags.DEFINE_string('directory', './MNIST_data',
                        'Directory to download data files and write the '
                        'converted result')
    flags.DEFINE_integer('validation_size', 5000,
                         'Number of examples to separate from the training '
                         'data for the validation set.')
    flags.DEFINE_integer('num_epochs', 10, 'num_epochs set')
    FLAGS = tf.app.flags.FLAGS

    IMAGE_SIZE = 28
    IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE  # 784 pixels per image
    TRAIN_FILE = "train.tfrecords"
    VALIDATION_FILE = "validation.tfrecords"
        
    # Build an int64 feature.
    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    # Build a bytes (string) feature.
    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def convert_to(data_set, name):
        '''
        Fill a tf.train.Example protocol buffer with the data, serialize the
        protocol buffer to a string, and write it to a TFRecords file with
        tf.python_io.TFRecordWriter.
        '''
        images = data_set.images
        labels = data_set.labels
        num_examples = data_set.num_examples
        if images.shape[0] != num_examples:
            raise ValueError('Images size %d does not match label size %d.'
                             % (images.shape[0], num_examples))
        rows = images.shape[1]   # 28
        cols = images.shape[2]   # 28
        depth = images.shape[3]  # 1: grayscale image

        filename = os.path.join(FLAGS.directory, name + '.tfrecords')
        # Using the line below instead would store all three data sets in one
        # TFRecord file. For large data sets it is better to write several files.
        # filename = "C:/Users/dbsdz/Desktop/TF练习/TFRecord"
        print('Writing', filename)
        writer = tf.python_io.TFRecordWriter(filename)
        for index in range(num_examples):
            # Serialize the image matrix to a byte string
            # (tobytes() replaces the deprecated tostring()).
            image_raw = images[index].tobytes()
            # Fill the protocol buffer: height, width, depth and label are
            # encoded as int64, image_raw as bytes.
            example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64_feature(rows),
                'width': _int64_feature(cols),
                'depth': _int64_feature(depth),
                'label': _int64_feature(int(labels[index])),
                'image_raw': _bytes_feature(image_raw)}))
            writer.write(example.SerializeToString())  # serialize to a string
        writer.close()
        
    NUM_CLASSES = 10  # MNIST has ten digit classes (0-9)

    def inference(images, hidden1_units, hidden2_units):
        # First fully connected hidden layer.
        with tf.name_scope('hidden1'):
            weights = tf.Variable(
                tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                                    stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
                name='weights')
            biases = tf.Variable(tf.zeros([hidden1_units]), name='biases')
            hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
        # Second fully connected hidden layer.
        with tf.name_scope('hidden2'):
            weights = tf.Variable(
                tf.truncated_normal([hidden1_units, hidden2_units],
                                    stddev=1.0 / math.sqrt(float(hidden1_units))),
                name='weights')
            biases = tf.Variable(tf.zeros([hidden2_units]),
                                 name='biases')
            hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
        # Linear output layer: one logit per class.
        with tf.name_scope('softmax_linear'):
            weights = tf.Variable(
                tf.truncated_normal([hidden2_units, NUM_CLASSES],
                                    stddev=1.0 / math.sqrt(float(hidden2_units))),
                name='weights')
            biases = tf.Variable(tf.zeros([NUM_CLASSES]), name='biases')
            logits = tf.matmul(hidden2, weights) + biases
        return logits

    def lossFunction(logits, labels):
        # Mean cross-entropy between the logits and the integer labels.
        labels = tf.to_int64(labels)
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=labels, name='xentropy')
        loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
        return loss

    def training(loss, learning_rate):
        tf.summary.scalar(loss.op.name, loss)
        optimizer = tf.train.GradientDescentOptimizer(learning_rate)
        # Counts the training steps; incremented by minimize().
        global_step = tf.Variable(0, name='global_step', trainable=False)
        train_op = optimizer.minimize(loss, global_step=global_step)
        return train_op
        
    def read_and_decode(filename_queue):  # takes a queue of file names
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        # Parse a single example. To parse several at once, use parse_example.
        features = tf.parse_single_example(
            serialized_example,
            # The keys of the features must be named explicitly.
            features={
                # TensorFlow offers two ways to parse a feature.
                # tf.FixedLenFeature yields a Tensor. tf.VarLenFeature yields
                # a SparseTensor and is meant for sparse data.
                # The formats given here must match the formats used when
                # the data was written.
                'image_raw': tf.FixedLenFeature([], tf.string),  # image: string
                'label': tf.FixedLenFeature([], tf.int64),       # label: int64
            })
        # A BytesList has to be decoded: the scalar string Tensor becomes a
        # one-dimensional uint8 Tensor.
        image = tf.decode_raw(features['image_raw'], tf.uint8)
        image.set_shape([IMAGE_PIXELS])
        # Here: Tensor("input/DecodeRaw:0", shape=(784,), dtype=uint8)
        # After scaling: Tensor("input/sub:0", shape=(784,), dtype=float32)
        image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
        # Cast the label from int64 to int32.
        # Result: Tensor("input/Cast_1:0", shape=(), dtype=int32)
        label = tf.cast(features['label'], tf.int32)
        return image, label

    def inputs(train, batch_size, num_epochs):
        '''
        Arguments:
          train: selects the training data (True) or the validation data (False).
          batch_size: number of examples in each training batch.
          num_epochs: number of passes over the data; 0 or None means train forever.

        Returns: a tuple (images, labels).
          * images: float, shape [batch_size, mnist.IMAGE_PIXELS], range [-0.5, 0.5].
          * labels: int32, shape [batch_size], range [0, mnist.NUM_CLASSES).

        Note that the tf.train.QueueRunner threads must be started with
        tf.train.start_queue_runners().
        '''
        if not num_epochs:
            num_epochs = None
        # File path, i.e. ./MNIST_data/train.tfrecords or
        # ./MNIST_data/validation.tfrecords
        filename = os.path.join(FLAGS.train_dir,
                                TRAIN_FILE if train else VALIDATION_FILE)
        with tf.name_scope('input'):
            # tf.train.string_input_producer returns a FIFOQueue of file names
            # and registers a QueueRunner for it. If there are many samples,
            # split them over several files and pass the list of file names.
            filename_queue = tf.train.string_input_producer(
                [filename], num_epochs=num_epochs)
            image, label = read_and_decode(filename_queue)
            # Shuffle the examples and group them into batches of batch_size.
            # tf.train.shuffle_batch creates a RandomShuffleQueue that is
            # filled by two threads.
            images, sparse_labels = tf.train.shuffle_batch(
                [image, label], batch_size=batch_size, num_threads=2,
                capacity=1000 + 3 * batch_size,
                # Keep part of the queue filled so that there is always
                # enough data to shuffle.
                min_after_dequeue=1000)
            return images, sparse_labels
        
    def run_training():
        with tf.Graph().as_default():
            # Input images and labels. num_epochs is the number of passes
            # over the training data.
            images, labels = inputs(train=True, batch_size=FLAGS.batch_size,
                                    num_epochs=FLAGS.num_epochs)
            # Build a graph that computes predictions from the inference model.
            logits = inference(images, FLAGS.hidden1, FLAGS.hidden2)
            loss = lossFunction(logits, labels)  # define the loss
            # Add to the Graph operations that train the model.
            train_op = training(loss, FLAGS.learning_rate)
            # Initialize the variables. Note that string_input_producer creates
            # an epoch counter internally. It belongs to the
            # tf.GraphKeys.LOCAL_VARIABLES collection and must be initialized
            # separately with tf.local_variables_initializer().
            init_op = tf.group(tf.global_variables_initializer(),
                               tf.local_variables_initializer())
            sess = tf.Session()
            sess.run(init_op)
            # Start input enqueue threads.
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
            try:
                step = 0
                # Loop until the coordinator is asked to stop or the input
                # queue runs out of data.
                while not coord.should_stop():
                    start_time = time.time()
                    _, loss_value = sess.run([train_op, loss])
                    # Print the result every 100 training steps.
                    if step % 100 == 0:
                        duration = time.time() - start_time
                        print('Step %d: loss=%.2f (%.3f sec)'
                              % (step, loss_value, duration))
                    step += 1
            except tf.errors.OutOfRangeError:
                print('Done training for %d epochs,%d steps.'
                      % (FLAGS.num_epochs, step))
            finally:
                coord.request_stop()  # tell the other threads to stop
            coord.join(threads)
            sess.close()

    def main(unused_argv):
        # Load the data.
        data_sets = input_data.read_data_sets(
            FLAGS.directory, dtype=tf.uint8, reshape=False,
            validation_size=FLAGS.validation_size)
        # Convert the data to tf.train.Example records and write TFRecords files.
        convert_to(data_sets.train, 'train')
        convert_to(data_sets.validation, 'validation')
        convert_to(data_sets.test, 'test')
        print('convert finished')
        run_training()

    if __name__ == '__main__':
        tf.app.run()
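The training loop relies on background threads that fill the queues, a coordinator that can ask them to stop, and an out-of-range signal once the requested number of epochs has been read. The pattern can be sketched with the standard library alone. Everything here (`STOP`, `producer`, `run`, the file names) is hypothetical and only imitates the control flow. It is not TensorFlow's implementation.

```python
import queue
import threading

STOP = object()  # sentinel meaning "no more data", like running out of epochs


def put_unless_stopped(out_queue, item, stop_event):
    """Try to enqueue item until it fits or a stop has been requested."""
    while not stop_event.is_set():
        try:
            out_queue.put(item, timeout=0.1)
            return True
        except queue.Full:
            continue
    return False


def producer(filenames, num_epochs, out_queue, stop_event):
    """Enqueue every file name num_epochs times, then signal the end."""
    for _ in range(num_epochs):
        for name in filenames:
            if not put_unless_stopped(out_queue, name, stop_event):
                return
    put_unless_stopped(out_queue, STOP, stop_event)


def run(filenames, num_epochs):
    """Consume items until the producer signals that the data is exhausted."""
    work = queue.Queue(maxsize=4)
    stop_event = threading.Event()
    thread = threading.Thread(target=producer,
                              args=(filenames, num_epochs, work, stop_event))
    thread.start()
    steps = 0
    try:
        while not stop_event.is_set():
            item = work.get()
            if item is STOP:
                break
            steps += 1  # a training step would run here
    finally:
        stop_event.set()  # ask the producer to stop
        thread.join()     # wait for it to finish
    return steps


total_steps = run(["a.tfrecords", "b.tfrecords"], num_epochs=3)
print(total_steps)
```

The `finally` block plays the role of `coord.request_stop()` followed by `coord.join(threads)`: the stop request is made whether the loop ended normally or with an error, and the main thread then waits for the worker.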

The output of a run is shown in the figure.

[Figure: training output]
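The loss value printed every 100 steps is the mean sparse softmax cross-entropy over the batch. A pure-Python sketch of that quantity follows, with hypothetical function names. It illustrates the formula, not TensorFlow's kernel.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example given raw logits and an integer label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


def mean_loss(batch_logits, batch_labels):
    """Average the per-example cross-entropy over a batch."""
    losses = [softmax_cross_entropy(logits, label)
              for logits, label in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Ten equal logits: every class gets probability 1/10, so the loss is ln(10).
uniform = [0.0] * 10
print(round(softmax_cross_entropy(uniform, 3), 4))  # 2.3026
```

Under this formula, a ten-class model whose outputs are close to uniform has a loss near ln(10) ≈ 2.30, which gives a rough reference point for the first printed values.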

Original article: https://blog.csdn.net/dbsdzxq/article/details/79872465
