gpt4 book ai didi

scipy's direct fails (almost) immediately on this toy optimization problem(在这个玩具优化问题上,Scipy的直接(几乎)立即失败了)

转载 作者:bug小助手 更新时间:2023-10-24 23:21:39 33 4
gpt4 key购买 nike



Consider the following simple MWE:

请考虑以下简单的MWE:


import numpy as np
from scipy.optimize import direct
def score(x):
parity_in_range = len([v for v in x if 4 <= v <= 6])%3
main_score = np.max(np.abs(np.diff(x)))
return main_score + parity_in_range
length = 20
bounds = [(0,10)] * length
result = direct(score, locally_biased=False, bounds=bounds, maxiter=10000, maxfun=10000)
print(result)

An optimal solution is to make all the parameters equal and not between 4 and 6. E.g. all 3s. This gives a function value of 0. The optimization works with varying degrees of success with the different optimizers of scipy but it fails almost instantly with direct. It gives:

最佳解决方案是使所有参数相等,而不是介于4和6之间。例如,全部为3。这给出的函数值为0。对于不同的Scipy优化器,优化的成功程度各不相同,但使用DIRECT几乎立即失败。它提供了:


 message: The volume of the hyperrectangle containing the lowest function value found is below vol_tol=1e-16
success: True
status: 4
fun: 2.0
x: [ 5.000e+00 5.000e+00 ... 5.000e+00 5.000e+00]
nit: 2
nfev: 157

I am not sure that it should report success but the real problem is that it gives up after 157 function evaluations with that warning.

我不确定它是否应该报告成功,但真正的问题是,它在157次函数评估后放弃了,并发出了警告。


Is there any way to get direct to optimize this function?

有什么方法可以直接优化这个函数吗?


更多回答
优秀答案推荐

The termination parameter vol_tol implicitly depends on the number of dimensions of the problem. The search space is divided up into a series of hyperrectangles, with the smallest middle hyperrectangle having a size of (1/3)^n, where n is the number of dimensions.

终止参数VOL_TOL隐含地取决于问题的维度数。搜索空间被分成一系列超矩形,其中最小的中间超矩形的大小为(1/3)^n,其中n是维度的数目。


With n=20, this means that the innermost cube will have a volume of 2.8e-10. If that innermost cube's midpoint happens to be the lowest point, then that cube will be subdivided again. Since vol_tol defaults to 1e-16, this means that the algorithm will exit after only two iterations.

当n=20时,这意味着最内侧的立方体的体积将为2.8e-10。如果最里面的立方体的中点恰好是最低点,那么该立方体将再次细分。由于Vol_tol默认为1e-16,这意味着算法将仅在两次迭代后退出。


If you don't want vol_tol to cause DIRECT to exit early, you can set vol_tol to zero:

如果不希望VOL_TOL导致DIRECT提前退出,可以将VOL_TOL设置为零:


result = direct(score, locally_biased=False, bounds=bounds, maxiter=10000, maxfun=10000, vol_tol=0)

Running this, it finds a better solution, though still not an optimal one:

运行它,它找到了一个更好的解决方案,尽管仍然不是最优的解决方案:


 message: Number of function evaluations done is larger than maxfun=10000
success: False
status: 1
fun: 1.1111111111111112
x: [ 3.889e+00 3.889e+00 ... 5.000e+00 5.000e+00]
nit: 12
nfev: 12021

Of course, you could also solve this problem by making the function simpler, e.g. making the parity_in_range objective leaky.

当然,您也可以通过使函数更简单来解决此问题,例如,使Parity_in_Range目标发生泄漏。


Altering the objective function to be continuous


It's frequently easier to optimize an objective function if that function is continuous.

如果目标函数是连续的,那么通常更容易对该函数进行优化。


In the following graph, the blue line represents the existing parity_in_range function for each value, ignoring the mod 3.

在下图中,蓝线表示每个值的现有parity_in_range函数,忽略mod 3。


The orange line represents a new function, which slopes down toward the edge, giving the optimizer a hint that there is a lower value in that direction.

橙色线表示一个新函数,该函数向下倾斜到边缘,提示优化器在该方向上存在较低的值。


objective function graph


First, define the primitives that make up this curve. I'm using a sigmoid function as a continuous approximation of the step function.

首先,定义组成这条曲线的基本体。我使用Sigmoid函数作为阶跃函数的连续近似值。


def sigmoid(x):
return 1 / (1 + np.exp(-x))

Next, we need to be able to shift the center of this function around, and make the sigmoid function curve up and down faster.

下一步,我们需要能够移动这个函数的中心,并使Sigmoid函数更快地上下曲线。


def sigmoid_at_center(x, center, strength=1):
return sigmoid((x - center) * strength)

Next, define the parity function as a sigmoid centered at 4, minus a sigmoid centered at 6. The strength parameter is set to 10.

接下来,将奇偶函数定义为以4为中心的Sigmoid,减去以6为中心的Sigmoid。强度参数设置为10。


def get_leaky_parity(x):
return sigmoid_at_center(x, 4, 10) - sigmoid_at_center(x, 6, 10)

Finally, define the score function in terms of this function.

最后,根据该函数定义得分函数。


def score(x):
parity_in_range = get_leaky_parity(x).sum()
main_score = np.max(np.abs(np.diff(x)))
return main_score + parity_in_range

You can then use the following code to use DIRECT to optimize this. I found that local bias made it able to solve it much faster.

然后,您可以使用以下代码使用DIRECT对其进行优化。我发现,局部偏见使它能够更快地解决问题。


result = direct(score, locally_biased=True, bounds=bounds, vol_tol=0, len_tol=0.001)

With this change to the objective function, it's able to solve this problem in up to 96 dimensions.

通过对目标函数的这种改变,它能够在高达96个维度上解决这个问题。


Sources used: Lipschitzian Optimization
Without the Lipschitz Constant

使用的资料来源:不含Lipschitz常数的Lipschitzian最优化


更多回答

What does it mean to make it leaky?

让它漏水是什么意思?

@Simd I mean changing it so that instead of instantly going from 0 to 1 in that range, it goes up and down more slowly. The effect of this is that an optimizer will know to move toward the edge. Here's a plot of how the objective function looks for each variable: i.imgur.com/Fdf5YoU.png I made this function from two sigmoid curves. Using this plus local bias allows it to solve this problem pretty much instantly. I can add code for this if it's useful to you.

@SIMD我的意思是改变它,让它不是立即在那个范围内从0到1,而是上升和下降得更慢。这样做的效果是,优化器将知道向边缘移动。这是每个变量的目标函数的曲线图:i.imgur.com/Fdf5YoU.png我用两条Sigmoid曲线制作了这个函数。利用这种外加本地偏见,它几乎可以立即解决这个问题。如果对您有用,我可以为它添加代码。

Yes please. That would be really great, thank you.

好的有劳了。那就太好了,谢谢你。

@Simd I've added a section on changing the objective function to make it easier to optimize.

@SIMD我增加了一节关于更改目标函数,以使其更容易优化。

Would this ever find an optimal solution where the number of parameters between 4 and 6 is exactly 3?

这会不会找到4到6之间的参数正好是3的最优解呢?

33 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com