
python - Converting the TensorFlow tutorial to work with my own data


This is a follow-up to my previous question, Converting from Pandas dataframe to TensorFlow tensor object.

I am now working on the next step and need some more help. I am trying to replace this line of code:

batch = mnist.train.next_batch(100)

with something that works with my own data. I found this answer on Stack Overflow: Where does next_batch in the TensorFlow tutorial batch_xs, batch_ys = mnist.train.next_batch(100) come from? But I don't understand:

1) Why .next_batch() does not work on my tensor. Did I create it incorrectly?

2) How to implement the pseudocode given in the answer to that .next_batch() question.

I currently have two tensor objects, one containing the parameters I want to use to train the model (dataVar_tensor) and one containing the correct results (depth_tensor). I obviously need to keep their relationship so that each set of parameters stays paired with its correct response.
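For reference, the pseudocode in that linked answer boils down to roughly the helper below, which serves shuffled batches from in-memory NumPy arrays while keeping rows and labels paired. This is only a sketch, not the queue-based approach in the accepted answer below; SimpleDataSet, dataVar_array, and depth_array are hypothetical names standing in for NumPy versions of dataVar_tensor and depth_tensor.

import numpy as np

class SimpleDataSet:
    # hands out shuffled (data, labels) batches, similar in spirit to mnist.train.next_batch()
    def __init__(self, data, labels):
        self._data = data        # e.g. float32 array of shape [num_rows, num_features]
        self._labels = labels    # e.g. float32 array of shape [num_rows, num_classes]
        self._num_examples = data.shape[0]
        self._index = 0
        self._order = np.random.permutation(self._num_examples)

    def next_batch(self, batch_size):
        if self._index + batch_size > self._num_examples:
            # reshuffle and start a new pass once the data is exhausted
            self._order = np.random.permutation(self._num_examples)
            self._index = 0
        idx = self._order[self._index:self._index + batch_size]
        self._index += batch_size
        return self._data[idx], self._labels[idx]

# usage, assuming dataVar_array and depth_array are the underlying NumPy arrays:
# train = SimpleDataSet(dataVar_array, depth_array)
# batch_xs, batch_ys = train.next_batch(100)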

Could you please take a moment to help me understand what is going on and how to replace this line of code?

Many thanks

Best Answer

I stripped out the irrelevant parts to preserve the formatting and indentation, so hopefully it is clear now. The following code reads a CSV file in batches of N lines (N is specified by a constant at the top). Each line contains a date (the first cell), followed by a list of floats (480 cells) and a one-hot vector (3 cells). The code then simply prints batches of those dates, floats, and one-hot vectors as it reads them. The place where it prints them is where you would normally actually run your model and feed these in place of the placeholder variables.
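For instance, a single row of such a file might look something like this (made-up values, with the ellipsis standing in for the remaining feature cells):

2017-02-17,0.12,0.56,...,0.03,0,1,0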

Keep in mind that here it reads each line as a string and then converts specific cells in that line into floats, simply because the first cell is easier to read as a string. If all of your data is numeric, just set the defaults to float/int rather than 'a' and remove the code that converts strings to floats. It is not needed otherwise!
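For example, if every column were numeric, the defaults and the decode step in the code below could be simplified to roughly the following (a sketch only, untested):

# all-numeric variant (sketch): give decode_csv float defaults instead of 'a'
rDefaults = [[0.0] for row in range(TD + TS + TL)]

def read_from_csv(filename_queue):
    reader = tf.TextLineReader(skip_header_lines=False)
    _, csv_row = reader.read(filename_queue)
    data = tf.decode_csv(csv_row, record_defaults=rDefaults)  # columns decode directly as floats
    dateLbl = tf.slice(data, [0], [TD])          # no tf.string_to_number needed anywhere
    features = tf.slice(data, [TD], [TS])
    label = tf.slice(data, [TD + TS], [TL])
    return dateLbl, features, label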

I added some comments to clarify what it is doing. Let me know if anything is unclear.

import tensorflow as tf

fileName = 'YOUR_FILE.csv'

try_epochs = 1
batch_size = 3

TD = 1   # this is my date-label for each row, for internal purposes
TS = 480 # this is the list of features, 480 in this case
TL = 3   # this is the one-hot vector of 3 representing the label

# set defaults to something (TF requires defaults for the number of cells you are going to read)
rDefaults = [['a'] for row in range(TD + TS + TL)]

# function that reads the input file, line-by-line
def read_from_csv(filename_queue):
    reader = tf.TextLineReader(skip_header_lines=False)  # i have no header line
    _, csv_row = reader.read(filename_queue)  # read one line
    data = tf.decode_csv(csv_row, record_defaults=rDefaults)  # use defaults for this line (in case of missing data)
    dateLbl = tf.slice(data, [0], [TD])  # first cell is my 'date-label' for internal purposes
    features = tf.string_to_number(tf.slice(data, [TD], [TS]), tf.float32)  # the next 480 cells are the list of features
    label = tf.string_to_number(tf.slice(data, [TD+TS], [TL]), tf.float32)  # the remaining 3 cells are the one-hot label
    return dateLbl, features, label

# function that packs each read line into batches of specified size
def input_pipeline(fName, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        [fName],
        num_epochs=num_epochs,
        shuffle=True)  # this refers to multiple files, not line items within files
    dateLbl, features, label = read_from_csv(filename_queue)
    min_after_dequeue = 10000  # min of where to start loading into memory
    capacity = min_after_dequeue + 3 * batch_size  # max of how much to load into memory
    # this packs the above lines into a batch of the size you specify:
    dateLbl_batch, feature_batch, label_batch = tf.train.shuffle_batch(
        [dateLbl, features, label],
        batch_size=batch_size,
        capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return dateLbl_batch, feature_batch, label_batch

# these are the date label, features, and label:
dateLbl, features, labels = input_pipeline(fileName, batch_size, try_epochs)

with tf.Session() as sess:

    gInit = tf.global_variables_initializer().run()
    lInit = tf.local_variables_initializer().run()

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    try:
        while not coord.should_stop():
            # load date-label, features, and label:
            dateLbl_batch, feature_batch, label_batch = sess.run([dateLbl, features, labels])

            print(dateLbl_batch)
            print(feature_batch)
            print(label_batch)
            print('----------')

    except tf.errors.OutOfRangeError:
        print("Done looping through the file")

    finally:
        coord.request_stop()

    coord.join(threads)
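As noted above, the print statements are the spot where you would normally run your model on each batch. A minimal sketch of what that could look like, using the softmax-regression setup from the MNIST tutorial; x, y_, W, b, cross_entropy, and train_step are hypothetical names, and the model would have to be defined before the initializers are run:

# model definition (sketch) - place this before the session / initializers above
x = tf.placeholder(tf.float32, [None, TS])
y_ = tf.placeholder(tf.float32, [None, TL])
W = tf.Variable(tf.zeros([TS, TL]))
b = tf.Variable(tf.zeros([TL]))
y = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# then, inside the while loop, instead of the print statements:
# sess.run(train_step, feed_dict={x: feature_batch, y_: label_batch})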

Regarding python - Converting the TensorFlow tutorial to work with my own data, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42302498/
