
python - Why does my linear regression produce nan values instead of learning?

Reposted · Author: 太空宇宙 · Updated: 2023-11-03 14:10:06

I am running the following code:

import tensorflow as tf

# data set
x_data = [10., 20., 30., 40.]
y_data = [20., 40., 60., 80.]

# try to find values for w and b that compute y_data = W * x_data + b
# range is -100 ~ 100
W = tf.Variable(tf.random_uniform([1], -1000., 1000.))
b = tf.Variable(tf.random_uniform([1], -1000., 1000.))

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# my hypothesis
hypothesis = W * X + b

# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# minimize
a = tf.Variable(0.1) # learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost) # goal is minimize cost

# before starting, initialize the variables
init = tf.initialize_all_variables()

# launch
sess = tf.Session()
sess.run(init)

# fit the line
for step in xrange(2001):
    sess.run(train, feed_dict={X: x_data, Y: y_data})
    if step % 100 == 0:
        print step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W), sess.run(b)

print sess.run(hypothesis, feed_dict={X: 5})
print sess.run(hypothesis, feed_dict={X: 2.5})

The output is as follows:

0 1.60368e+10 [ 4612.54003906] [ 406.81304932]
100 nan [ nan] [ nan]
200 nan [ nan] [ nan]
300 nan [ nan] [ nan]
400 nan [ nan] [ nan]
500 nan [ nan] [ nan]
600 nan [ nan] [ nan]
700 nan [ nan] [ nan]
800 nan [ nan] [ nan]
900 nan [ nan] [ nan]
1000 nan [ nan] [ nan]
1100 nan [ nan] [ nan]
1200 nan [ nan] [ nan]
1300 nan [ nan] [ nan]
1400 nan [ nan] [ nan]
1500 nan [ nan] [ nan]
1600 nan [ nan] [ nan]
1700 nan [ nan] [ nan]
1800 nan [ nan] [ nan]
1900 nan [ nan] [ nan]
2000 nan [ nan] [ nan]
[ nan]
[ nan]

I don't understand why the result is nan.

If I change the initial data to this:

x_data = [1., 2., 3., 4.]
y_data = [2., 4., 6., 8.]

then everything works fine. Why is that?

Best Answer

You are overflowing float32 because the learning rate is too high for your problem. Instead of converging, the weight variable (W) oscillates with a larger and larger magnitude on every step of gradient descent.
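To see the divergence concretely, here is a minimal NumPy sketch (not part of the original script) that applies the same mean-squared-error gradient step by hand with learning rate 0.1; the starting values are only illustrative, taken from the first printed step above:

import numpy as np

x = np.array([10., 20., 30., 40.])
y = np.array([20., 40., 60., 80.])
W, b, lr = 4612.5, 406.8, 0.1   # illustrative starting point

for step in range(5):
    err = W * x + b - y              # residuals of the current line
    dW = 2 * np.mean(err * x)        # d(cost)/dW for cost = mean(err**2)
    db = 2 * np.mean(err)            # d(cost)/db
    W -= lr * dW
    b -= lr * db
    print("step %d: W=%.3g, b=%.3g" % (step, W, b))

# |W - 2| is multiplied by roughly |1 - 0.2 * mean(x**2)| = 149 on each step,
# so in the float32 TensorFlow graph the cost overflows to inf and then nan.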

If you change

a = tf.Variable(0.1)

to

a = tf.Variable(0.001)

the weights should converge much better. You may also want to increase the number of iterations (to ~50000).
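In the script above, the relevant changes would look roughly like this (everything else stays the same; the iteration count and print interval are only suggestions):

# smaller learning rate so each gradient step does not overshoot
a = tf.Variable(0.001)
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# more iterations to compensate for the smaller steps
for step in xrange(50001):
    sess.run(train, feed_dict={X: x_data, Y: y_data})
    if step % 1000 == 0:
        print step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W), sess.run(b)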

Choosing a suitable learning rate is often the first challenge when implementing or using a machine learning algorithm. A loss that grows instead of converging toward a minimum is usually a sign that the learning rate is too high.

In your case, this particular line-fitting problem becomes more prone to diverging weights when the training data has a larger magnitude. That is one of the reasons why it is common to normalize data before training, e.g. for neural networks.
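One simple way to do that here (a sketch, not part of the original answer; the scaling factor of 40 is arbitrary) is to rescale the inputs before feeding them, and apply the same scaling at prediction time:

# scale x into roughly [0, 1] before training
x_scale = 40.
x_norm = [x / x_scale for x in x_data]

# train with the normalized inputs
sess.run(train, feed_dict={X: x_norm, Y: y_data})

# scale new inputs with the same factor when predicting
print sess.run(hypothesis, feed_dict={X: 5. / x_scale})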

In addition, the range of your initial weight and bias is very large, which means they can be far from the ideal values and produce a very large loss and very large gradients at the start. Picking a sensible range for initial values is another important thing to get right as you move on to more advanced learning algorithms.
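For example, drawing the initial values from a much smaller interval (the exact bounds below are arbitrary) keeps the starting loss and gradients small:

# initialize W and b close to zero instead of in [-1000, 1000]
W = tf.Variable(tf.random_uniform([1], -1., 1.))
b = tf.Variable(tf.random_uniform([1], -1., 1.))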

Regarding "python - Why does my linear regression produce nan values instead of learning?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39314946/
