
matlab - Multivariate gradient descent in MATLAB - what is the difference between the two codes?

Reposted. Author: 行者123. Updated: 2023-11-30 09:46:29

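The following function uses gradient descent to find the best "theta" for a regression line. The inputs (X, y) are attached below. My question is: what is the difference between code 1 and code 2? Why does code 2 work while code 1 does not?

Thanks in advance!

GRADIENTDESCENTMULTI performs gradient descent to learn theta. It updates theta by taking num_iters gradient steps with learning rate alpha.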

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
    % Initialize some useful values
    m = length(y); % number of training examples
    n = length(theta);
    J_history = zeros(num_iters, 1);
    costs = zeros(n,1);

    for iter = 1:num_iters
        % code 1 - doesn't work
        for c = 1:n
            for i = 1:m
                costs(c) = costs(c)+(X(i,:)*theta - y(i))*X(i,c);
            end
        end

        % code 2 - does work
        E = X * theta - y;
        for c = 1:n
            costs(c) = sum(E.*X(:,c));
        end

        % update each theta
        for c = 1:n
            theta(c) = theta(c) - alpha*costs(c)/m;
        end
        J_history(iter) = computeCostMulti(X, y, theta);
    end
end

function J = computeCostMulti(X, y, theta)

    for i=1:m
        J = J+(X(i,:)*theta - y(i))^2;
    end
    J = J/(2*m);
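
For reference, the quantities the code above is meant to compute can be written in equation form. This is a sketch in notation that is not part of the original post: x^{(i)} stands for row i of X, and x^{(i)}_c for its c-th entry.

```latex
% Gradient-descent update applied to every component c = 1, ..., n
\theta_c \leftarrow \theta_c - \frac{\alpha}{m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right) x^{(i)}_c

% Cost recorded in J_history after each update
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right)^2
```

The sum over i in the update is the value that costs(c) is expected to hold just before theta is updated, so it has to be recomputed from scratch for the current theta in every iteration.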

Running the code:

alpha = 0.01;
num_iters = 200;

% Init Theta and Run Gradient Descent
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

X is

1.0000 0.1300 -0.2237
1.0000 -0.5042 -0.2237
1.0000 0.5025 -0.2237
1.0000 -0.7357 -1.5378
1.0000 1.2575 1.0904
1.0000 -0.0197 1.0904
1.0000 -0.5872 -0.2237
1.0000 -0.7219 -0.2237
1.0000 -0.7810 -0.2237
1.0000 -0.6376 -0.2237
1.0000 -0.0764 1.0904
1.0000 -0.0009 -0.2237
1.0000 -0.1393 -0.2237
1.0000 3.1173 2.4045
1.0000 -0.9220 -0.2237
1.0000 0.3766 1.0904
1.0000 -0.8565 -1.5378
1.0000 -0.9622 -0.2237
1.0000 0.7655 1.0904
1.0000 1.2965 1.0904
1.0000 -0.2940 -0.2237
1.0000 -0.1418 -1.5378
1.0000 -0.4992 -0.2237
1.0000 -0.0487 1.0904
1.0000 2.3774 -0.2237
1.0000 -1.1334 -0.2237
1.0000 -0.6829 -0.2237
1.0000 0.6610 -0.2237
1.0000 0.2508 -0.2237
1.0000 0.8007 -0.2237
1.0000 -0.2034 -1.5378
1.0000 -1.2592 -2.8519
1.0000 0.0495 1.0904
1.0000 1.4299 -0.2237
1.0000 -0.2387 1.0904
1.0000 -0.7093 -0.2237
1.0000 -0.9584 -0.2237
1.0000 0.1652 1.0904
1.0000 2.7864 1.0904
1.0000 0.2030 1.0904
1.0000 -0.4237 -1.5378
1.0000 0.2986 -0.2237
1.0000 0.7126 1.0904
1.0000 -1.0075 -0.2237
1.0000 -1.4454 -1.5378
1.0000 -0.1871 1.0904
1.0000 -1.0037 -0.2237

y is

399900
329900
369000
232000
539900
299900
314900
198999
212000
242500
239999
347000
329999
699900
259900
449900
299900
199900
499998
599000
252900
255000
242900
259900
573900
249900
464500
469000
475000
299900
349900
169900
314900
579900
285900
249900
229900
345000
549000
287000
368500
329900
314000
299000
179900
299900
239500

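Best answer

I think mine works fine. The main issue is that in code 1 you keep adding to costs(c) but never reset it to zero before the next iteration. The only thing you really need to change is to add something like costs(c) = 0; after for c = 1:n and before for i = 1:m. I did have to change your code slightly to get it to run for me (mainly computeCostMulti), and I changed the plotting to show that both methods give the same result. Overall, here is a working demo with those changes: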

close all; clear; clc;

%% Data
X = [1.0000 0.1300 -0.2237; 1.0000 -0.5042 -0.2237; 1.0000 0.5025 -0.2237; 1.0000 -0.7357 -1.5378;
1.0000 1.2575 1.0904; 1.0000 -0.0197 1.0904; 1.0000 -0.5872 -0.2237; 1.0000 -0.7219 -0.2237;
1.0000 -0.7810 -0.2237; 1.0000 -0.6376 -0.2237; 1.0000 -0.0764 1.0904; 1.0000 -0.0009 -0.2237;
1.0000 -0.1393 -0.2237; 1.0000 3.1173 2.4045; 1.0000 -0.9220 -0.2237; 1.0000 0.3766 1.0904;
1.0000 -0.8565 -1.5378; 1.0000 -0.9622 -0.2237; 1.0000 0.7655 1.0904; 1.0000 1.2965 1.0904;
1.0000 -0.2940 -0.2237; 1.0000 -0.1418 -1.5378; 1.0000 -0.4992 -0.2237; 1.0000 -0.0487 1.0904;
1.0000 2.3774 -0.2237; 1.0000 -1.1334 -0.2237; 1.0000 -0.6829 -0.2237; 1.0000 0.6610 -0.2237;
1.0000 0.2508 -0.2237; 1.0000 0.8007 -0.2237; 1.0000 -0.2034 -1.5378; 1.0000 -1.2592 -2.8519;
1.0000 0.0495 1.0904; 1.0000 1.4299 -0.2237; 1.0000 -0.2387 1.0904; 1.0000 -0.7093 -0.2237;
1.0000 -0.9584 -0.2237; 1.0000 0.1652 1.0904; 1.0000 2.7864 1.0904; 1.0000 0.2030 1.0904;
1.0000 -0.4237 -1.5378; 1.0000 0.2986 -0.2237; 1.0000 0.7126 1.0904; 1.0000 -1.0075 -0.2237;
1.0000 -1.4454 -1.5378; 1.0000 -0.1871 1.0904; 1.0000 -1.0037 -0.2237];
y = [399900 329900 369000 232000 539900 299900 314900 198999 212000 242500 239999 347000 329999,...
699900 259900 449900 299900 199900 499998 599000 252900 255000 242900 259900 573900 249900,...
464500 469000 475000 299900 349900 169900 314900 579900 285900 249900 229900 345000 549000,...
287000 368500 329900 314000 299000 179900 299900 239500]';

alpha = 0.01;
num_iters = 200;

% Init Theta and Run Gradient Descent
theta0 = zeros(3, 1);
[theta_result_1, J_history_1] = gradientDescentMulti(X, y, theta0, alpha, num_iters, 1);
[theta_result_2, J_history_2] = gradientDescentMulti(X, y, theta0, alpha, num_iters, 2);

% Plot the convergence graph for both methods
figure;
x = 1:numel(J_history_1);
subplot(5,1,1:4);
plot(x,J_history_1,x,J_history_2);
xlim([min(x) max(x)]);
set(gca,'XTickLabel','');
ylabel('Cost J');
grid on;

subplot(5,1,5);
stem(x,(J_history_1-J_history_2)./J_history_1,'ko');
xlim([min(x) max(x)]);
xlabel('Number of iterations');
ylabel('frac. \DeltaJ');
grid on;

% Display gradient descent's result
fprintf('Theta computed from gradient descent with method 1: \n');
fprintf(' %f \n', theta_result_1);
fprintf('Theta computed from gradient descent with method 2: \n');
fprintf(' %f \n', theta_result_2);
fprintf('\n');
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters, METHOD)
    % Initialize some useful values
    m = length(y); % number of training examples
    n = length(theta);
    J_history = zeros(num_iters, 1);

    costs = zeros(n,1);
    for iter = 1:num_iters

        if METHOD == 1 % code 1 - does work
            for c = 1:n
                costs(c) = 0;
                for i = 1:m
                    costs(c) = costs(c) + (X(i,:)*theta - y(i)) *X(i,c);
                end
            end
        elseif METHOD == 2 % code 2 - does work
            E = X * theta - y;
            for c = 1:n
                costs(c) = sum(E.*X(:,c));
            end
        else
            error('unknown method');
        end

        % update each theta
        for c = 1:n
            theta(c) = theta(c) - alpha*costs(c)/m;
        end
        J_history(iter) = computeCostMulti(X, y, theta);
    end
end
function J = computeCostMulti(X, y, theta)
    m = length(y); J = 0;
    for mi = 1:m
        J = J + (X(mi,:)*theta - y(mi))^2;
    end
    J = J/(2*m);
end

However, you really only need to add the costs(c) = 0; line.
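
To make the effect of the missing reset concrete, here is a minimal sketch on made-up toy data (not the data set from the question). theta is deliberately held fixed across the two passes so that only the accumulator behaviour shows; the names stale, fresh, and true_grad are illustrative and do not appear in the original code.

```matlab
% Toy data, invented for illustration only
X = [1 0.5; 1 -1.0; 1 2.0];   % first column is the intercept term
y = [2; 0; 5];
theta = [0.1; -0.2];
m = length(y);
n = length(theta);

E = X*theta - y;              % residuals for the (fixed) theta
true_grad = X' * E;           % the sums that code 2 produces

stale = zeros(n, 1);          % never reset, as in the original code 1
fresh = zeros(n, 1);          % reset for every c, as in the fixed code 1

for pass = 1:2                % stands in for two gradient-descent iterations
    for c = 1:n
        fresh(c) = 0;         % the one-line fix
        for i = 1:m
            term = E(i) * X(i,c);
            stale(c) = stale(c) + term;   % carries over the previous pass
            fresh(c) = fresh(c) + term;
        end
    end
end

disp([true_grad, fresh, stale]);   % columns: reference, with reset, without reset
```

After the second pass the column without the reset holds twice the reference sums, while the column with the reset matches them. In the real loop theta changes between iterations, so the un-reset accumulator ends up holding a running total of every earlier gradient, and the step taken no longer corresponds to the gradient at the current theta.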

Also, I recommend always adding a close all; clear; clc; line at the start of your scripts, to make sure they still work when copied and pasted into Stack Overflow.
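
Returning to the two code variants: once the accumulator is reset, the double loop and the sum(E.*X(:,c)) form compute the same sums, and both can be collapsed into a single matrix product. The sketch below checks this on made-up toy data; the variable names grad_loop, grad_vec, and max_diff are hypothetical and not taken from the original post.

```matlab
% Toy data, invented for illustration only
X = [1 0.5; 1 -1.0; 1 2.0];   % first column is the intercept term
y = [2; 0; 5];
theta = [0.1; -0.2];
m = length(y);
n = length(theta);

% Loop form: the accumulator is reset for every column c
grad_loop = zeros(n, 1);
for c = 1:n
    acc = 0;                  % reset before summing over the examples
    for i = 1:m
        acc = acc + (X(i,:)*theta - y(i)) * X(i,c);
    end
    grad_loop(c) = acc;
end

% Vectorized form: the same sums as one matrix-vector product
grad_vec = X' * (X*theta - y);

% Largest absolute difference between the two results
max_diff = max(abs(grad_loop - grad_vec));
fprintf('max difference: %g\n', max_diff);
```

With this form, the per-component update loop could presumably be replaced by theta = theta - alpha * grad_vec / m, which leaves no accumulator to forget to reset.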

For "matlab - Multivariate gradient descent in MATLAB - what is the difference between the two codes?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51848436/
