python - Neural network produces NaN values as output and loss after the first epoch


I am trying to set up a neural network with just a few layers that will solve a simple regression problem, which should be f(x) = 0.1x or f(x) = 10x.

All the code is shown below (data generation and the neural network):

  • 4 fully connected layers with ReLU
  • RMSE as the loss function (see the note right after this list)
  • training with gradient descent
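
Note: in the code below the loss is computed with tf.losses.mean_squared_error, which is plain MSE rather than RMSE; a true RMSE could be obtained by wrapping it in tf.sqrt. A minimal sketch, assuming the y placeholder and the out tensor defined further down in the code:

# hypothetical RMSE loss; the code in this question actually minimizes plain MSE
rmse = tf.sqrt(tf.losses.mean_squared_error(labels=y, predictions=out))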

The problem is that after I run it, the output and the loss turn into NaN values:

  • epoch: 0, optimizer: None, loss: inf
  • epoch: 1, optimizer: None, loss: nan

Output layer: [NaN, NaN, NaN, ..., NaN]

I am new to TensorFlow and I am not sure what I might be doing wrong (the next-batch selection, the training step, or the session handling may be implemented poorly).

import tensorflow as tf
import sys
import numpy

# preparing input data -> X
learningTestData = numpy.arange(1427456).reshape(1394,1024)

#preparing output data -> f(X) =0.1X
outputData = numpy.arange(1427456).reshape(1394,1024)

xx = outputData.shape
dd = 0
while dd < xx[0]:
    jj = 0
    while jj < xx[1]:
        outputData[dd, jj] = outputData[dd, jj] / 10
        jj += 1
    dd += 1

#preparing the NN
x = tf.placeholder(tf.float32, shape=[None, 1024])
y = tf.placeholder(tf.float32, shape=[None, 1024])

full1 = tf.contrib.layers.fully_connected(inputs=x, num_outputs=1024, activation_fn=tf.nn.relu)
full1 = tf.layers.batch_normalization(full1)

full2 = tf.contrib.layers.fully_connected(inputs=full1, num_outputs=5000, activation_fn=tf.nn.relu)
full2 = tf.layers.batch_normalization(full2)

full3 = tf.contrib.layers.fully_connected(inputs=full2, num_outputs=2500, activation_fn=tf.nn.relu)
full3 = tf.layers.batch_normalization(full3)

full4 = tf.contrib.layers.fully_connected(inputs=full3, num_outputs=1024, activation_fn=tf.nn.relu)
full4 = tf.layers.batch_normalization(full4)


out = tf.contrib.layers.fully_connected(inputs=full4, num_outputs=1024, activation_fn=None)


epochs = 20
batch_size = 50
learning_rate = 0.001
batchOffset = 0

# Loss (RMSE) and Optimizer
cost = tf.losses.mean_squared_error(labels=y, predictions=out)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)


with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())

    e = 0

    while e < epochs:

        # selecting next batch
        sb = batchOffset
        eb = batchOffset + batch_size
        x_batch = learningTestData[sb:eb, :]
        y_batch = outputData[sb:eb, :]

        # learn
        opt = sess.run(optimizer, feed_dict={x: x_batch, y: y_batch})
        # show RMSE
        c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
        print("epoch: {}, optimizer: {}, loss: {}".format(e, opt, c))

        batchOffset += batch_size
        e += 1
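
As an aside, the nested while loops above that build f(X) = 0.1X can be replaced by a single vectorized NumPy operation. A minimal sketch, assuming the same shapes as in the question (a float dtype is used so the division is not silently truncated to integers, as it is when dividing the int array in place above):

import numpy

# vectorized alternative to the nested while loops (same shapes as in the question)
learningTestData = numpy.arange(1427456, dtype=numpy.float32).reshape(1394, 1024)
outputData = learningTestData / 10.0  # f(X) = 0.1X in one step, no explicit loops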

Best answer

You need to normalize your data, because your gradients, and as a result the cost, are exploding. Try running this code:

learning_rate = 0.00000001
x_batch = learningTestData[:10]
y_batch = outputData[:10]
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    opt = sess.run(optimizer, feed_dict={x: x_batch, y: y_batch})

    c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
    print(c)  # 531492.3

In this case you get a finite value, because the gradients have not yet driven the cost to infinity. Use normalized data, lower the learning rate, or reduce the batch size to make it work.
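
For example, one simple way to apply the suggested normalization is to divide both the inputs and the targets by the maximum input value before feeding them, so the network sees values in [0, 1]. A minimal sketch, reusing the arrays and batch variables from the question:

# scale inputs and targets by the same constant, so the f(X) = 0.1X relation is preserved
scale = learningTestData.max()
x_batch = learningTestData[:10] / scale  # inputs now lie in [0, 1]
y_batch = outputData[:10] / scale        # targets scaled with the same factor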

Regarding "python - Neural network produces NaN values as output and loss after the first epoch", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55696971/
