
python - Linear regression with TensorFlow


I am trying to understand linear regression... Here is the script I am trying to understand:

'''
A linear regression learning algorithm example using TensorFlow library.
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

from __future__ import print_function

import tensorflow as tf
from numpy import *
import numpy
import matplotlib.pyplot as plt
rng = numpy.random

# Parameters
learning_rate = 0.0001
training_epochs = 1000
display_step = 50

# Training Data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
2.827,3.465,1.65,2.904,2.42,2.94,1.3])

train_X=numpy.asarray(train_X)
train_Y=numpy.asarray(train_Y)
n_samples = train_X.shape[0]


# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)


# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

My question is: what does this part represent?

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

Why are there random floats?

Also, could you show me some math that formally expresses the cost, pred, and optimizer variables?

Best Answer

Let's try to put some intuition and sources together with the tf approach.

General intuition:

The regression presented here is a supervised learning problem. In it, as defined in Russell & Norvig's Artificial Intelligence, the task is:

given a training set (X, y) of m input-output pairs (x1, y1), (x2, y2), ... , (xm, ym), where each output was generated by an unknown function y = f(x), discover a function h that approximates the true function f

For that, the hypothesis function h combines, in some way, each x with the to-be-learned parameters, in order to produce an output that is as close as possible to the corresponding y, and this for the whole dataset. The hope is that the resulting function will be close to f.

But how to learn the parameters? In order to be able to learn, the model has to be able to evaluate itself. Here is where the cost (also called loss, energy, merit...) function comes in: it is a metric function that compares the output of h with the corresponding y, and penalizes big differences.

Now it should be clear what the "learning" process here exactly is: altering the parameters in order to achieve a lower value for the cost function.
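
In symbols (my generic notation, not something defined in the script): if \theta denotes all the parameters and \mathcal{L} the per-example penalty, learning amounts to solving

    \theta^{*} = \arg\min_{\theta} \; \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\bigl(h(x_i; \theta),\, y_i\bigr)

that is, picking the parameters that make the average penalty over the m training pairs as small as possible.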

Linear regression:

The example you posted performs parametric linear regression, optimized with gradient descent based on the mean squared error as the cost function. Which means:
  • Parametric: the set of parameters is fixed. They are held in the exact same memory placeholders throughout the learning process.
  • Linear: the output of h is simply a linear (actually, affine) combination of the input x and the parameters. So if x and w are real-valued vectors of the same dimensionality and b is a real number, it holds that h(x, w, b) = w.transposed()*x + b. Page 107 of the Deep Learning Book brings more quality insights and intuitions on this.
  • Cost function: now this is the interesting part. The mean squared error is a convex function. This means it has a single, global optimum, and furthermore, it can be found directly with the set of normal equations (also explained in the DLB). In your example, the stochastic (and/or minibatch) gradient descent method is used instead: this is the preferred method when optimizing non-convex cost functions (which is the case in more advanced models like neural networks), or when the dataset has a huge dimensionality (also explained in the DLB).
  • Gradient descent: tf deals with this for you, so it is enough to say that GD minimizes the cost function by following its derivative "downwards", in small steps, until reaching a saddle point. If you totally need to know, the exact technique applied by TF is automatic differentiation, a kind of compromise between the numeric and the symbolic approaches. For convex functions like yours this point will be the global optimum, and (as long as your learning rate is not too big) it will always converge to it, so it doesn't matter which values you initialize your Variables with. Random initialization is only necessary in more complex architectures like neural networks. There is some extra code regarding the management of the minibatches, but I won't get into that because it is not the main focus of your question. The formulas right after this list spell all of this out for your script.
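
Here is the math you asked for, written with the script's own names (standard linear-regression formulas; the gradient expressions are derived from the cost the script defines, with \alpha = learning_rate and n = n_samples):

    % pred = tf.add(tf.multiply(X, W), b)
    \mathrm{pred}_i = W x_i + b

    % cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
    J(W, b) = \frac{1}{2n} \sum_{i=1}^{n} \bigl(W x_i + b - y_i\bigr)^2

    % one GradientDescentOptimizer step: move both parameters against the gradient
    W \leftarrow W - \alpha \cdot \frac{1}{n} \sum_{i=1}^{n} \bigl(W x_i + b - y_i\bigr)\, x_i
    \qquad
    b \leftarrow b - \alpha \cdot \frac{1}{n} \sum_{i=1}^{n} \bigl(W x_i + b - y_i\bigr)

Because W and b are initialized at random, the very first pred is random too, but every update pushes them towards the minimizer of J regardless of where they started (this is the convexity argument from above).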

TensorFlow approach:

Deep learning frameworks nowadays work by nesting lots of functions into computational graphs (you may want to take a look at a presentation on DL frameworks that I did some weeks ago). For constructing and running the graph, TensorFlow follows a declarative style, which means that the graph has to be completely defined and compiled first, and only then is it deployed and executed. It is very recommendable to read this short wiki article, if you haven't yet. In this setting, the setup is split into two parts:
  • First, you define your computational Graph, where you put your dataset and parameters in memory placeholders, define the hypothesis and cost functions based on them, and tell tf which optimization technique to apply.
  • Then you run the computation in a Session, and the library will be able to (re)load the data placeholders and perform the optimization. A minimal sketch of this style follows below.
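
A minimal sketch of the define-then-run style (a toy example of mine using the same TF 1.x API as the script; it is not part of the original answer):

    import tensorflow as tf

    # Phase 1: define the graph. Nothing is computed yet; 'total' is only a node.
    a = tf.placeholder(tf.float32)
    b = tf.placeholder(tf.float32)
    total = tf.add(a, b)

    # Phase 2: run the graph in a Session, feeding values into the placeholders.
    with tf.Session() as sess:
        print(sess.run(total, feed_dict={a: 2.0, b: 3.0}))  # prints 5.0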

Coding:

The code of the example follows this approach closely:
  • Define the training data X and labels Y, and prepare placeholders in the Graph for them (fed in via the feed_dict part).
  • Define the 'W' and 'b' holders for the parameters. They have to be Variables because they will be updated during the Session.
  • Define pred (our hypothesis) and cost as explained before.


From this point, the rest of the code should be clearer. Regarding the optimizer, as I said, tf already knows how to deal with it, but you may want to look into gradient descent for more details (again, the DLB is a pretty good reference). As a bonus, a quick numerical check follows below.
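
The quick check mentioned above (my addition, not part of the original script): because this cost is convex, the W and b that the Session converges to should match the closed-form least-squares line, which NumPy computes directly:

    import numpy

    # same training data as in the question's script
    train_X = numpy.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59,
                             2.167, 7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
    train_Y = numpy.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53,
                             1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])

    # a degree-1 polyfit returns [slope, intercept] of the least-squares line;
    # after enough epochs the script's W and b should approach these values.
    slope, intercept = numpy.polyfit(train_X, train_Y, 1)
    print("closed-form W =", slope, " b =", intercept)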

Cheers!
Andres

Code example: gradient descent vs. normal equations

This small snippet generates simple multi-dimensional datasets and tests both approaches. Notice that the normal equations approach doesn't require looping, and brings better results. For small dimensionality (DIMENSIONS < 30k) it is probably the preferred approach:
    from __future__ import absolute_import, division, print_function
    import numpy as np
    import tensorflow as tf

    ####################################################################################################
    ### GLOBALS
    ####################################################################################################
    DIMENSIONS = 5
    f = lambda x: sum(x) # the "true" function: f = 0 + 1*x1 + 1*x2 + 1*x3 ...
    noise = lambda: np.random.normal(0,10) # some noise

    ####################################################################################################
    ### GRADIENT DESCENT APPROACH
    ####################################################################################################
    # dataset globals
    DS_SIZE = 5000
    TRAIN_RATIO = 0.6 # 60% of the dataset is used for training
    _train_size = int(DS_SIZE*TRAIN_RATIO)
    _test_size = DS_SIZE - _train_size
    ALPHA = 1e-8 # learning rate
    LAMBDA = 0.5 # L2 regularization factor
    TRAINING_STEPS = 1000

    # generate the dataset, the labels and split into train/test
    ds = [[np.random.rand()*1000 for d in range(DIMENSIONS)] for _ in range(DS_SIZE)] # synthesize data
    # ds = normalize_data(ds)
    ds = [(x, [f(x)+noise()]) for x in ds] # add labels
    np.random.shuffle(ds)
    train_data, train_labels = zip(*ds[0:_train_size])
    test_data, test_labels = zip(*ds[_train_size:])

    # define the computational graph
    graph = tf.Graph()
    with graph.as_default():
        # declare graph inputs
        x_train = tf.placeholder(tf.float32, shape=(_train_size, DIMENSIONS))
        y_train = tf.placeholder(tf.float32, shape=(_train_size, 1))
        x_test = tf.placeholder(tf.float32, shape=(_test_size, DIMENSIONS))
        y_test = tf.placeholder(tf.float32, shape=(_test_size, 1))
        theta = tf.Variable([[0.0] for _ in range(DIMENSIONS)])
        theta_0 = tf.Variable([[0.0]]) # don't forget the bias term!
        # forward propagation
        train_prediction = tf.matmul(x_train, theta) + theta_0
        test_prediction = tf.matmul(x_test, theta) + theta_0
        # cost function and optimizer
        train_cost = (tf.nn.l2_loss(train_prediction - y_train) + LAMBDA*tf.nn.l2_loss(theta))/float(_train_size)
        optimizer = tf.train.GradientDescentOptimizer(ALPHA).minimize(train_cost)
        # test results
        test_cost = (tf.nn.l2_loss(test_prediction - y_test) + LAMBDA*tf.nn.l2_loss(theta))/float(_test_size)

    # run the computation
    with tf.Session(graph=graph) as s:
        tf.initialize_all_variables().run()
        print("initialized"); print(theta.eval())
        for step in range(TRAINING_STEPS):
            _, train_c, test_c = s.run([optimizer, train_cost, test_cost],
                                       feed_dict={x_train: train_data, y_train: train_labels,
                                                  x_test: test_data, y_test: test_labels})
            if (step%100==0):
                # it should return bias close to zero and parameters all close to 1 (see definition of f)
                print("\nAfter", step, "iterations:")
                #print("   Bias =", theta_0.eval(), ", Weights = ", theta.eval())
                print("   train cost =", train_c); print("   test cost =", test_c)
        PARAMETERS_GRADDESC = tf.concat(0, [theta_0, theta]).eval()
        print("Solution for parameters:\n", PARAMETERS_GRADDESC)

    ####################################################################################################
    ### NORMAL EQUATIONS APPROACH
    ####################################################################################################
    # dataset globals
    DIMENSIONS = 5
    DS_SIZE = 5000
    TRAIN_RATIO = 0.6 # 60% of the dataset is used for training
    _train_size = int(DS_SIZE*TRAIN_RATIO)
    _test_size = DS_SIZE - _train_size
    f = lambda x: sum(x) # the "true" function: f = 0 + 1*x1 + 1*x2 + 1*x3 ...
    noise = lambda: np.random.normal(0,10) # some noise
    # training globals
    LAMBDA = 1e6 # L2 regularization factor

    # generate the dataset, the labels and split into train/test
    ds = [[np.random.rand()*1000 for d in range(DIMENSIONS)] for _ in range(DS_SIZE)]
    ds = [([1]+x, [f(x)+noise()]) for x in ds] # add x[0]=1 dimension and labels
    np.random.shuffle(ds)
    train_data, train_labels = zip(*ds[0:_train_size])
    test_data, test_labels = zip(*ds[_train_size:])

    # define the computational graph
    graph = tf.Graph()
    with graph.as_default():
        # declare graph inputs
        x_train = tf.placeholder(tf.float32, shape=(_train_size, DIMENSIONS+1))
        y_train = tf.placeholder(tf.float32, shape=(_train_size, 1))
        theta = tf.Variable([[0.0] for _ in range(DIMENSIONS+1)]) # implicit bias!
        # optimum
        optimum = tf.matrix_solve_ls(x_train, y_train, LAMBDA, fast=True)

    # run the computation: no loop needed!
    with tf.Session(graph=graph) as s:
        tf.initialize_all_variables().run()
        print("initialized")
        opt = s.run(optimum, feed_dict={x_train: train_data, y_train: train_labels})
        PARAMETERS_NORMEQ = opt
        print("Solution for parameters:\n", PARAMETERS_NORMEQ)

    ####################################################################################################
    ### PREDICTION AND ERROR RATE
    ####################################################################################################

    # generate test dataset
    ds = [[np.random.rand()*1000 for d in range(DIMENSIONS)] for _ in range(DS_SIZE)]
    ds = [([1]+x, [f(x)+noise()]) for x in ds] # add x[0]=1 dimension and labels
    test_data, test_labels = zip(*ds)
    # define hypothesis
    h_gd = lambda x: PARAMETERS_GRADDESC.T.dot(x)
    h_ne = lambda x: PARAMETERS_NORMEQ.T.dot(x)
    # define cost
    mse = lambda pred, lab: ((pred-np.array(lab))**2).sum()/DS_SIZE
    # make predictions!
    predictions_gd = np.array([h_gd(x) for x in test_data])
    predictions_ne = np.array([h_ne(x) for x in test_data])
    # calculate and print total error
    cost_gd = mse(predictions_gd, test_labels)
    cost_ne = mse(predictions_ne, test_labels)
    print("total cost with gradient descent:", cost_gd)
    print("total cost with normal equations:", cost_ne)

Regarding python - linear regression with TensorFlow, the original question can be found on Stack Overflow: https://stackoverflow.com/questions/43170017/
