
Python's scipy.optimize.minimize with SLSQP fails with "Positive directional derivative for linesearch"


I have a least-squares minimization problem subject to inequality constraints that I am trying to solve with scipy.optimize.minimize. There appear to be two options for inequality constraints: COBYLA and SLSQP.

I tried SLSQP first because it allows explicit partial derivatives of the function to be minimized. Depending on the scaling of the problem, it fails with the error:

Positive directional derivative for linesearch    (Exit mode 8)

whenever interval or more general inequality constraints are imposed.

This has been observed before, e.g. here. Manually scaling the function to be minimized (and the associated partial derivatives) seems to cure the problem (a sketch of this workaround follows the example below), but I cannot achieve the same effect by changing ftol in the options.

Overall, this whole situation makes me doubt the robustness of the routine for everyday use. Here is a simplified example:

import numpy as np
import scipy.optimize as sp_optimize

def cost(x, A, y):

    e = y - A.dot(x)
    rss = np.sum(e ** 2)

    return rss

def cost_deriv(x, A, y):

    e = y - A.dot(x)
    deriv0 = -2 * e.dot(A[:,0])
    deriv1 = -2 * e.dot(A[:,1])

    deriv = np.array([deriv0, deriv1])

    return deriv


A = np.ones((10,2)); A[:,0] = np.linspace(-5,5, 10)
x_true = np.array([2, 2/20])
y = A.dot(x_true)
x_guess = x_true / 2

prm_bounds = ((0, 3), (0,1))

cons_SLSQP = ({'type': 'ineq', 'fun' : lambda x: np.array([x[0] - x[1]]),
               'jac' : lambda x: np.array([1.0, -1.0])})

# works correctly
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv, bounds=prm_bounds, method='SLSQP', constraints=cons_SLSQP, options={'disp': True})
print(min_res_SLSQP)

# fails
A = 100 * A
y = A.dot(x_true)
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv, bounds=prm_bounds, method='SLSQP', constraints=cons_SLSQP, options={'disp': True})
print(min_res_SLSQP)

# works if bounds and inequality constraints removed
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv,
                                     method='SLSQP', options={'disp': True})
print(min_res_SLSQP)
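For reference, the manual-scaling workaround mentioned above could look like the following minimal sketch (the choice of scale factor is an assumption of this sketch, not from the original post; dividing the objective and its gradient by any positive constant leaves the minimizer unchanged):

scale = np.sum(y ** 2)  # assumed problem-dependent scale factor; brings the objective near O(1)

def cost_scaled(x, A, y):
    # same least-squares objective, divided by a positive constant
    return cost(x, A, y) / scale

def cost_deriv_scaled(x, A, y):
    # the gradient must be divided by the same constant to stay consistent
    return cost_deriv(x, A, y) / scale

min_res_scaled = sp_optimize.minimize(cost_scaled, x_guess, args=(A, y),
                                      jac=cost_deriv_scaled, bounds=prm_bounds,
                                      method='SLSQP', constraints=cons_SLSQP,
                                      options={'disp': True})
print(min_res_scaled)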

How should ftol be set to avoid failure? More generally, can a similar problem occur with COBYLA? Is COBYLA a better choice for this kind of inequality-constrained least-squares optimization problem?
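Since COBYLA comes up here, it is easy to try through the same minimize interface. A minimal sketch (mine, not from the original post): COBYLA is derivative-free, so jac is ignored, and because it only accepts 'ineq' constraints (bounds support was only added to it in newer SciPy releases), the box bounds are re-expressed as inequalities:

cons_COBYLA = [
    {'type': 'ineq', 'fun': lambda x: x[0] - x[1]},  # x0 >= x1
    {'type': 'ineq', 'fun': lambda x: x[0]},         # x0 >= 0
    {'type': 'ineq', 'fun': lambda x: 3 - x[0]},     # x0 <= 3
    {'type': 'ineq', 'fun': lambda x: x[1]},         # x1 >= 0
    {'type': 'ineq', 'fun': lambda x: 1 - x[1]},     # x1 <= 1
]

min_res_COBYLA = sp_optimize.minimize(cost, x_guess, args=(A, y),
                                      method='COBYLA', constraints=cons_COBYLA,
                                      options={'disp': True})
print(min_res_COBYLA)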

I have since found that using a square root in the cost function improves performance. However, for a non-linear re-parametrization of the problem (simpler, but closer to what I need to do in practice), it fails again. Details below:

import numpy as np
import scipy.optimize as sp_optimize


def cost(x, y, g):

    e = ((y - x[1]) / x[0]) - g

    rss = np.sqrt(np.sum(e ** 2))

    return rss


def cost_deriv(x, y, g):

    e = ((y - x[1]) / x[0]) - g

    factor = 0.5 / np.sqrt(e.dot(e))
    deriv0 = -2 * factor * e.dot(y - x[1]) / (x[0]**2)
    deriv1 = -2 * factor * np.sum(e) / x[0]

    deriv = np.array([deriv0, deriv1])

    return deriv


x_true = np.array([1/300, .1])
N = 20
t = 20 * np.arange(N)
g = 100 * np.cos(2 * np.pi * 1e-3 * (t - t[-1] / 2))
y = g * x_true[0] + x_true[1]

x_guess = x_true / 2
prm_bounds = ((1e-4, 1e-2), (0, .4))

# check derivatives
delta = 1e-9
C0 = cost(x_guess, y, g)
C1 = cost(x_guess + np.array([delta, 0]), y, g)
approx_deriv0 = (C1 - C0) / delta
C1 = cost(x_guess + np.array([0, delta]), y, g)
approx_deriv1 = (C1 - C0) / delta
approx_deriv = np.array([approx_deriv0, approx_deriv1])
deriv = cost_deriv(x_guess, y, g)

# fails
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(y, g), jac=cost_deriv,
                                     bounds=prm_bounds, method='SLSQP', options={'disp': True})
print(min_res_SLSQP)

Best Answer

Instead of minimizing np.sum(e ** 2), minimize sqrt(np.sum(e ** 2)), or better (in terms of computation): np.linalg.norm(e)!

This modification:

  • does not change your solution in regard to x
  • needs post-processing if the original objective is needed (probably not)
  • is much more robust

With this change, all cases work, even using numerical differentiation (I was too lazy to modify the gradient, which would need to reflect this!).
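For completeness, here is a sketch of the matching analytic gradient (not from the original answer): with residual e = y - A.dot(x), the chain rule gives grad ||e|| = -A.T.dot(e) / ||e||, which is undefined at e = 0:

def cost_norm(x, A, y):
    # norm-based objective: same minimizer as the sum of squares
    return np.linalg.norm(y - A.dot(x))

def cost_norm_deriv(x, A, y):
    # chain rule: d||e||/dx = -A.T e / ||e||, with e = y - A x
    e = y - A.dot(x)
    return -A.T.dot(e) / np.linalg.norm(e)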

Example output (the number of func-evals hints at num-diff usage):

Optimization terminated successfully.    (Exit mode 0)
Current function value: 3.815547437029837e-06
Iterations: 16
Function evaluations: 88
Gradient evaluations: 16
fun: 3.815547437029837e-06
jac: array([-6.09663382, -2.48862544])
message: 'Optimization terminated successfully.'
nfev: 88
nit: 16
njev: 16
status: 0
success: True
x: array([ 2.00000037, 0.10000018])
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.0002354577991007501
Iterations: 23
Function evaluations: 114
Gradient evaluations: 23
fun: 0.0002354577991007501
jac: array([ 435.97259208, 288.7483819 ])
message: 'Optimization terminated successfully.'
nfev: 114
nit: 23
njev: 23
status: 0
success: True
x: array([ 1.99999977, 0.10000014])
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.0003392807206384532
Iterations: 21
Function evaluations: 112
Gradient evaluations: 21
fun: 0.0003392807206384532
jac: array([ 996.57340243, 51.19298764])
message: 'Optimization terminated successfully.'
nfev: 112
nit: 21
njev: 21
status: 0
success: True
x: array([ 2.00000008, 0.10000104])

While SLSQP may have some problems, it is still one of the most tested and robust codes available, given its broad range of applications!

I would also expect SLSQP to do much better here than COBYLA, since the latter is based heavily on linearization. (Take this as a guess, though; it is easy to try given the minimize interface!)

Alternative

In general, an interior-point based solver for convex quadratic programming would be the best approach here. But for that you would need to leave scipy. (Or perhaps an SOCP solver would be better... I am not sure.)

cvxpy brings a nice modeling system and a good open-source solver (ECOS; technically a conic solver -> more general and less robust, but it should beat SLSQP).

With cvxpy and ECOS, it looks like:

import numpy as np
import cvxpy as cvx

""" Problem data """
A = np.ones((10,2)); A[:,0] = np.linspace(-5,5, 10)
x_true = np.array([2, 2/20])
y = A.dot(x_true)
x_guess = x_true / 2

prm_bounds = ((0, 3), (0,1))

# problematic case
A = 100 * A
y = A.dot(x_true)

""" Solve """
x = cvx.Variable(len(x_true))
constraints = [x[0] >= x[1]]
for ind, (lb, ub) in enumerate(prm_bounds): # inefficient -> a matrix-based expression would be better!
    constraints.append(x[ind] >= lb)
    constraints.append(x[ind] <= ub)

objective = cvx.Minimize(cvx.norm(A*x - y))
problem = cvx.Problem(objective, constraints)
problem.solve(solver=cvx.ECOS, verbose=False)
print(problem.status)
print(problem.value)
print(x.value.T)

# optimal
# -6.67593652593801e-10
# [[ 2. 0.1]]
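(Note: the snippet above uses the pre-1.0 cvxpy API. In current cvxpy versions, matrix-vector multiplication should be written cvx.norm(A @ x - y), and x.value comes back as a one-dimensional array, so the trailing .T in the last print is unnecessary.)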

Regarding "Python's scipy.optimize.minimize with SLSQP fails with 'Positive directional derivative for linesearch'", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47443122/
