python - Comparing a self-written gradient descent implementation with SciPy minimize

Reposted by: 行者123. Last updated: 2023-12-01 07:43:14

This is an assignment for a convex optimization class I am taking. The task is as follows:

Implement the gradient descent algorithm with backtracking line search to find the optimal step size. Your implementation will be compared to Python's scipy.optimize.minimize function.

The specific function to minimize is the least squares function. The error between the solution found by the Python library and your implementation must be smaller than 0.001.

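I have implemented it, but the error hovers around 1. I have been looking for ways to improve it and have run into some trouble. Here is the code I wrote: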

Gradient descent + backtracking line search implementation

import numpy as np

# Gradient descent.
def min_gd(fun, x0, grad, args=()):
    alpha = 0.3
    beta = 0.8

    delta_x = -grad(x0, *args)
    t = backtracking_line_search(fun, x0, grad, delta_x, alpha, beta, args)
    x_new = x0 + (t * delta_x)

    if np.linalg.norm(x_new) ** 2 > np.linalg.norm(x0) ** 2:
        return min_gd(fun, x_new, grad, args)
    else:
        return x_new
    
# Line search function returns optimal step size.
def backtracking_line_search(fun, x, grad, delta_x, alpha, beta, args=()):
    t = 1
    derprod = grad(x, *args) @ delta_x

    while fun((x + (t * delta_x)), *args) > fun(x, *args) + (alpha * t * derprod):
        t *= beta

    return t

Other given functions

import numpy as np
from scipy.optimize import minimize
import gd

# Least Squares function
def LeastSquares(x, A, b):
    return np.linalg.norm(A @ x - b) ** 2

# gradient  
def grad_LeastSquares(x, A, b):
    return 2 * ((A.T @ A) @ x - A.T @ b)

The error between the two results is essentially computed using the L2 norm.
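The comparison described above could be sketched roughly as follows. This is only an illustration: the helper name l2_error_vs_scipy, the random test data, and the use of np.linalg.lstsq as a stand-in for "my solution" are assumptions, not part of the assignment's grading code.

```python
import numpy as np
from scipy.optimize import minimize

def least_squares(x, A, b):
    # Squared residual norm ||Ax - b||^2
    return np.linalg.norm(A @ x - b) ** 2

def grad_least_squares(x, A, b):
    # Gradient 2 (A^T A x - A^T b)
    return 2 * ((A.T @ A) @ x - A.T @ b)

def l2_error_vs_scipy(x_mine, x0, A, b):
    """Hypothetical helper: L2 distance between a candidate and SciPy's minimizer."""
    res = minimize(least_squares, x0, args=(A, b), jac=grad_least_squares)
    return np.linalg.norm(res.x - x_mine)

# Assumed test problem: a small, well-conditioned random system.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
x0 = np.zeros(3)

# Stand-in candidate solution (closed-form least squares), for illustration only.
x_candidate = np.linalg.lstsq(A, b, rcond=None)[0]
error = l2_error_vs_scipy(x_candidate, x0, A, b)
```

In the assignment, x_candidate would be replaced by the output of min_gd, and the requirement is that error stays below 0.001.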

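One idea I had is that the tolerance check in my gradient descent function may be flawed. Right now I essentially just check whether the norm of the next iterate is larger than that of the previous one. However, I could not figure out how to improve on this.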
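To see why that check can stop too early, here is a small sketch on a toy problem (an assumption for illustration: f(x) = ||x||^2, whose minimizer is the origin). After one backtracking step the iterate moves toward the origin, so its norm does not grow and the original condition would return immediately, even though the gradient is still far from zero.

```python
import numpy as np

def f(x):
    # Toy objective with minimizer at the origin
    return np.linalg.norm(x) ** 2

def grad_f(x):
    return 2 * x

def backtracking(fun, x, g, delta_x, alpha=0.3, beta=0.8):
    # Shrink t until the sufficient-decrease condition holds
    t = 1.0
    slope = g @ delta_x
    while fun(x + t * delta_x) > fun(x) + alpha * t * slope:
        t *= beta
    return t

x0 = np.array([1.0, 1.0])
g0 = grad_f(x0)
t = backtracking(f, x0, g0, -g0)
x1 = x0 - t * g0  # one gradient descent step

# The original stopping rule recurses only if the norm grew.
norm_grew = np.linalg.norm(x1) ** 2 > np.linalg.norm(x0) ** 2
grad_norm_after_step = np.linalg.norm(grad_f(x1))
```

Here norm_grew is False, so the original min_gd would return x1 after a single step, while grad_norm_after_step shows that x1 is not yet a stationary point.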

Any feedback is welcome.

Edit

In case anyone is curious, here is the final code I wrote to make it work as required:

def min_gd(fun, x0, grad, args=()):
    alpha = 0.3
    beta = 0.8

    delta_x = -grad(x0, *args)
    t = backtracking_line_search(fun, x0, grad, delta_x, alpha, beta, args)
    x_new = x0 + (t * delta_x)
    
    if np.linalg.norm(grad(x_new, *args)) < 0.01:
        return x_new
    else:
        return min_gd(fun, x_new, grad, args)

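I just fixed the conditional statement so that, instead of only comparing the norms of successive iterates, it checks whether the gradient norm falls below a predetermined tolerance level.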
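As a side note, the link between a gradient tolerance and the error in the solution can be written down for this particular objective. The derivation below is a sketch that assumes A has full column rank, so that the minimizer x^\star is unique; the symbols x^\star and \lambda_{\min} are introduced here for illustration.

```latex
% Least squares objective and its gradient
f(x) = \|Ax - b\|_2^2, \qquad \nabla f(x) = 2\,(A^\top A\,x - A^\top b)

% The minimizer satisfies the normal equations
A^\top A\,x^\star = A^\top b
\;\Longrightarrow\;
\nabla f(x) = 2\,A^\top A\,(x - x^\star)

% Hence the distance to the minimizer is bounded by the gradient norm
\|x - x^\star\|_2 \;\le\; \frac{\|\nabla f(x)\|_2}{2\,\lambda_{\min}(A^\top A)}
```

So whether a gradient tolerance of 0.01 guarantees an error below 0.001 depends on the smallest eigenvalue of A^\top A; for a poorly conditioned A, a tighter tolerance may be needed.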

Hope this helps someone in the future.

Best answer

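Your guess about the tolerance check is correct: the norm of the current vector has nothing to do with convergence. The typical criterion is a small gradient, so min_gd should look like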

def min_gd(fun, x0, grad, args=()):
    alpha = 0.3
    beta = 0.8
    eps = 0.001

    x_new = x0
    delta_x = -grad(x0, *args)
    while np.linalg.norm(delta_x) > eps:
        t = backtracking_line_search(fun, x_new, grad, delta_x, alpha, beta, args)
        x_new = x_new + (t * delta_x)
        delta_x = -grad(x_new, *args)

    return x_new

where eps is some small positive tolerance.
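A self-contained variant of the loop above might look like the following sketch. The iteration cap max_iter, the function name min_gd_capped, the default eps of 1e-4, and the random test data are assumptions added for illustration; the cap simply guards against a loop that never reaches the tolerance.

```python
import numpy as np

def least_squares(x, A, b):
    return np.linalg.norm(A @ x - b) ** 2

def grad_least_squares(x, A, b):
    return 2 * ((A.T @ A) @ x - A.T @ b)

def backtracking_line_search(fun, x, g, delta_x, alpha, beta, args=()):
    # Shrink t by beta until the sufficient-decrease condition holds
    t = 1.0
    fx = fun(x, *args)
    slope = g @ delta_x
    while fun(x + t * delta_x, *args) > fx + alpha * t * slope:
        t *= beta
    return t

def min_gd_capped(fun, x0, grad, args=(), eps=1e-4, max_iter=10_000,
                  alpha=0.3, beta=0.8):
    # Iterative gradient descent; stops on a small gradient or after max_iter steps
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x, *args)
        if np.linalg.norm(g) <= eps:
            break
        t = backtracking_line_search(fun, x, g, -g, alpha, beta, args)
        x = x - t * g
    return x

# Assumed test problem, compared against the closed-form least squares solution.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
x_gd = min_gd_capped(least_squares, np.zeros(3), grad_least_squares, args=(A, b))
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
gap = np.linalg.norm(x_gd - x_ref)
```

Using a loop instead of recursion also avoids Python's recursion limit when many iterations are needed.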

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/56586436/
