matlab - What is the error in the iterative implementation of the gradient descent algorithm below?

Reposted by: 行者123. Updated: 2023-11-30 09:45:59

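I tried to implement an iterative (loop-based) version of the gradient descent algorithm, but it does not work correctly. The vectorized implementation of the same algorithm, however, works fine.
Here is the iterative implementation: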

function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)

    % get the number of rows and columns
    nrows = size(X, 1);
    ncols = size(X, 2);

    % initialize the hypothesis vector
    h = zeros(nrows, 1);

    % initialize the temporary theta vector
    theta_temp = zeros(ncols, 1);

    % run gradient descent for the specified number of iterations
    count = 1;

    while count <= iterations

        % calculate the hypothesis values and fill into the vector
        for i = 1 : nrows
            for j = 1 : ncols
                term = theta(j) * X(i, j);
                h(i) = h(i) + term;
            end
        end

        % calculate the gradient
        for j = 1 : ncols
            for i = 1 : nrows
                term = (h(i) - y(i)) * X(i, j);
                theta_temp(j) = theta_temp(j) + term;
            end
        end

        % update the gradient with the factor
        fact = alpha / nrows;

        for i = 1 : ncols
            theta_temp(i) = fact * theta_temp(i);
        end

        % update the theta
        for i = 1 : ncols
            theta(i) = theta(i) - theta_temp(i);
        end

        % update the count
        count = count + 1;
    end
end

Here is the vectorized implementation of the same algorithm:

function [theta, theta_all, J_cost] = gradientDescent(X, y, theta, alpha)

    % set the learning rate
    learn_rate = alpha;

    % set the number of iterations
    n = 1500;

    % number of training examples
    m = length(y);

    % initialize the theta_new vector
    l = length(theta);
    theta_new = zeros(l,1);

    % initialize the cost vector
    J_cost = zeros(n,1);

    % initialize the vector to store all the calculated theta values
    theta_all = zeros(n,2);

    % perform gradient descent for the specified number of iterations
    for i = 1 : n

        % calculate the hypothesis
        hypothesis = X * theta;

        % calculate the error
        err = hypothesis - y;

        % calculate the gradient
        grad = X' * err;

        % calculate the new theta
        theta_new = (learn_rate/m) .* grad;

        % update the old theta
        theta = theta - theta_new;

        % update the cost
        J_cost(i) = computeCost(X, y, theta);

        % store the calculated theta value
        if i < n
            index = i + 1;
            theta_all(index,:) = theta';
        end
    end
end
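The vectorized code calls computeCost, which is not shown in the question. A minimal sketch of what it might look like, assuming the usual half mean squared error cost for linear regression; the function body is an assumption, not the asker's actual code:

```matlab
function J = computeCost(X, y, theta)
    % ASSUMPTION: half mean squared error, J = 1/(2m) * sum((X*theta - y).^2).
    % The question does not show this function; this is only an illustration.

    % number of training examples
    m = length(y);

    % residuals of the linear hypothesis
    err = X * theta - y;

    % half the mean of the squared residuals
    J = sum(err .^ 2) / (2 * m);
end
```

With a cost of this form, J_cost should decrease from one iteration to the next when the learning rate is small enough, which gives a simple sanity check on the update step.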

The dataset is available at the link given in the original question; the file name is ex1data1.txt.

Problem

For an initial theta = [0, 0] (this is a vector!), a learning rate of 0.01, and 1500 iterations, the optimal theta I get is:

theta0 = -3.6303
theta1 = 1.1664

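The above is the output of the vectorized implementation, which I know is implemented correctly (it passed all the test cases on Coursera).

However, when I implement the same algorithm with the iterative approach (the first code listed above), the theta values I get are (alpha = 0.01, iterations = 1500):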

theta0 = -0.20720
theta1 = -0.77392

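This implementation fails the test cases, so I know it is incorrect.

However, I cannot see where I went wrong. The iterative code does the same work and performs the same multiplications as the vectorized code, and when I traced one iteration of both by hand (with pen and paper!) the values matched. Yet the results differ when I run them in Octave.

Any help would be greatly appreciated, especially if you can point out where I went wrong and the exact cause of the failure.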

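Points to consider

According to my tests, the hypothesis computation is correct and both versions give the same result, so the problem is not there.
I printed the gradient vector in both versions and realized that the error lies there, because the outputs are very different!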
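For reference, the update that both versions are meant to perform can be read off from the code, with m = nrows training examples, n = ncols parameters, and a linear hypothesis:

```latex
\theta_j \leftarrow \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x^{(i)}_j ,
\qquad
h_\theta(x^{(i)}) = \sum_{k=1}^{n} \theta_k\, x^{(i)}_k
```

Both sums are meant to start from zero on every iteration, so any accumulator used to build them has to be empty at the start of each pass.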

In addition, here is the code that preprocesses the data:

function [X, y] = fileReader(filename)

    % load the dataset
    dataset = load(filename);

    % get the dimensions of the dataset
    nrows = size(dataset, 1);
    ncols = size(dataset, 2);

    % generate the X matrix from the dataset
    X = dataset(:, 1 : ncols - 1);

    % generate the y vector
    y = dataset(:, ncols);

    % append 1's to the X matrix
    X = [ones(nrows, 1), X];
end

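Best answer

The problem with the first code is that the theta_temp and h vectors are not initialized correctly. The first iteration (when count equals 1) runs correctly, because at that point h and theta_temp have just been set to zero. However, they are temporary accumulators for a single gradient descent iteration, and your code never resets them to zero vectors in later iterations. From iteration 2 onward, the new terms for h(i) and theta_temp(j) are simply added on top of the old values: h still holds the previous hypothesis and theta_temp still holds the previous scaled step. That is why the code does not work.

You need to reset both vectors to zero at the start of every iteration, and then the code works. Here is my version of your first code (note the changes):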

function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)

    % get the number of rows and columns
    nrows = size(X, 1);
    ncols = size(X, 2);

    % run gradient descent for the specified number of iterations
    count = 1;

    while count <= iterations

        % initialize the hypothesis vector
        h = zeros(nrows, 1);

        % initialize the temporary theta vector
        theta_temp = zeros(ncols, 1);


        % calculate the hypothesis values and fill into the vector
        for i = 1 : nrows
            for j = 1 : ncols
                term = theta(j) * X(i, j);
                h(i) = h(i) + term;
            end
        end

        % calculate the gradient
        for j = 1 : ncols
            for i = 1 : nrows
                term = (h(i) - y(i)) * X(i, j);
                theta_temp(j) = theta_temp(j) + term;
            end
        end

        % update the gradient with the factor
        fact = alpha / nrows;

        for i = 1 : ncols
            theta_temp(i) = fact * theta_temp(i);
        end

        % update the theta
        for i = 1 : ncols
            theta(i) = theta(i) - theta_temp(i);
        end

        % update the count
        count = count + 1;
    end
end
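One way to check a corrected loop version is to run it side by side with the vectorized update and compare the resulting theta. The following is a sketch under stated assumptions: the data is a made-up three-point set rather than ex1data1.txt, and alpha = 0.1 and 50 iterations are arbitrary illustrative choices.

```matlab
% Toy data (hypothetical): three points on the line y = 1 + 2*x.
X = [1 1; 1 2; 1 3];
y = [3; 5; 7];
alpha = 0.1;          % illustrative learning rate, not the question's 0.01
iterations = 50;      % illustrative iteration count
m = size(X, 1);
n = size(X, 2);

theta_loop = zeros(n, 1);
theta_vec = zeros(n, 1);

for count = 1 : iterations
    % loop-based step, with both accumulators reset on every iteration
    h = zeros(m, 1);
    grad = zeros(n, 1);
    for i = 1 : m
        for j = 1 : n
            h(i) = h(i) + theta_loop(j) * X(i, j);
        end
    end
    for j = 1 : n
        for i = 1 : m
            grad(j) = grad(j) + (h(i) - y(i)) * X(i, j);
        end
    end
    theta_loop = theta_loop - (alpha / m) * grad;

    % vectorized step, same formula as in the question's gradientDescent
    theta_vec = theta_vec - (alpha / m) * (X' * (X * theta_vec - y));
end

% the two results should agree up to floating-point rounding
max_diff = max(abs(theta_loop - theta_vec));
fprintf('max difference after %d iterations: %g\n', iterations, max_diff);
```

Moving the two zeros(...) lines back above the outer loop reproduces the mismatch described in the question.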

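I ran the code and it gave the theta values you reported for the vectorized implementation. What I wonder, though, is how you concluded that the hypothesis vector was the same in both cases, since the stale hypothesis vector is clearly one of the reasons the first code fails!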
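The point about the hypothesis vector can be seen in isolation. The sketch below uses made-up data and holds theta fixed (both are assumptions for illustration); h is allocated once outside the loop, as in the question's code, and is compared with X * theta after each pass.

```matlab
% Toy data (hypothetical, not ex1data1.txt)
X = [1 1; 1 2; 1 3];
theta = [0.5; 0.5];          % held fixed to isolate the accumulator effect
nrows = size(X, 1);
ncols = size(X, 2);

h = zeros(nrows, 1);         % allocated once, outside the loop, as in the question
gap = zeros(2, 1);           % max |h - X*theta| after each pass

for pass = 1 : 2
    for i = 1 : nrows
        for j = 1 : ncols
            h(i) = h(i) + theta(j) * X(i, j);
        end
    end
    gap(pass) = max(abs(h - X * theta));
    fprintf('pass %d: max |h - X*theta| = %g\n', pass, gap(pass));
end
```

On the first pass h matches X * theta, so a hand trace of a single iteration shows no difference. On the second pass h holds twice the correct value, because the new terms were added to the old ones.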

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/52470153/
