
machine-learning - Should I avoid combining L2 regularization with RMSProp?


Should I avoid combining L2 regularization with RMSprop and NAG?

Will the L2 regularization term interfere with the gradient algorithm (RMSprop)?

Best regards,

Best Answer

It seems this question (2017) has since been addressed by later work (2018).

Plain adaptive gradient methods (RMSProp, Adagrad, Adam, etc.) do not pair well with L2 regularization, as the sketch below illustrates.
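To make the interaction concrete, here is a minimal NumPy sketch (a hypothetical illustration, not code from the paper) of a single RMSProp step in which the L2 penalty (lambda/2)·||w||² is folded into the gradient, the way most frameworks apply an L2/"weight_decay" term inside the optimizer:

```python
import numpy as np

# Hypothetical single RMSProp step with the L2 penalty folded into the gradient.
rng = np.random.default_rng(0)
w = rng.normal(size=5)          # weights
g = rng.normal(size=5)          # gradient of the data loss w.r.t. w
v = np.ones_like(w)             # RMSProp running average of squared gradients

lr, rho, eps, lam = 1e-3, 0.9, 1e-8, 1e-2   # illustrative hyperparameters

g_total = g + lam * w                        # L2 term enters the gradient...
v = rho * v + (1 - rho) * g_total**2
w = w - lr * g_total / (np.sqrt(v) + eps)    # ...and is divided by sqrt(v):
# weights with large historical gradients (large v) receive *less* effective decay.
```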

Paper link [https://arxiv.org/pdf/1711.05101.pdf] and some excerpts:

In this paper, we show that a major factor of the poor generalization of the most popular adaptive gradient method, Adam, is due to the fact that L2 regularization is not nearly as effective for it as for SGD.

L2 regularization and weight decay are not identical. Contrary to common belief, the two techniques are not equivalent. For SGD, they can be made equivalent by a reparameterization of the weight decay factor based on the learning rate; this is not the case for Adam. In particular, when combined with adaptive gradients, L2 regularization leads to weights with large gradients being regularized less than they would be when using weight decay.
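For comparison, here is a hypothetical sketch of the decoupled alternative ("weight decay" in the paper's sense): the adaptive step uses only the data gradient, and the decay is applied directly to the weights, outside the adaptive scaling. The paper's experiments focus on Adam and SGD, so applying the same idea to RMSProp here is an assumption for illustration:

```python
import numpy as np

# Hypothetical decoupled weight-decay step for RMSProp.
rng = np.random.default_rng(0)
w = rng.normal(size=5)          # weights
g = rng.normal(size=5)          # gradient of the data loss w.r.t. w
v = np.ones_like(w)             # RMSProp running average of squared gradients

lr, rho, eps, lam = 1e-3, 0.9, 1e-8, 1e-2   # illustrative hyperparameters

v = rho * v + (1 - rho) * g**2
w = w - lr * g / (np.sqrt(v) + eps) - lr * lam * w
# Every weight is shrunk by the same factor lr * lam, independent of v.
```

This decoupled form is what optimizers such as AdamW implement; whether an RMSProp variant behaves the same way is not evaluated in the paper, so treat the sketch above as an assumption rather than a recommendation from the authors.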

Regarding "machine-learning - Should I avoid combining L2 regularization with RMSProp?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42415319/
