
python - Backpropagation with momentum

Reposted. Author: 太空狗. Updated: 2023-10-29 21:38:39

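I am following this tutorial to implement the backpropagation algorithm. However, I am stuck on adding momentum to it.

Without momentum, this is the code for the weight update method: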

def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']

And below is my implementation with momentum:

def updateWeights(network, row, l_rate, momentum=0.5):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i-1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                previous_weight = neuron['weights'][j]
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j] + momentum * previous_weight
            previous_weight = neuron['weights'][-1]
            neuron['weights'][-1] += l_rate * neuron['delta'] + momentum * previous_weight

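This gives me a math overflow error, because the weights grow exponentially over several epochs until they become too large. I believe my previous_weight logic is wrong for the update.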

Best answer

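I'll give you a hint. In your implementation you multiply momentum by previous_weight, which is the current value of the weight itself, another parameter of the network at the same step, rather than the previous update. Every call therefore scales each weight by roughly (1 + momentum), so the weights blow up very quickly.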
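To make the blow-up concrete, here is a minimal sketch that isolates the faulty `momentum * previous_weight` term and ignores the gradient term entirely. The helper name `grow` and the starting weight of 1.0 are assumptions chosen for illustration; they are not part of the original code.

```python
def grow(weight, momentum=0.5, steps=10):
    """Repeatedly apply only the faulty `momentum * previous_weight` term.

    Hypothetical helper for illustration; the gradient term is left out.
    """
    history = [weight]
    for _ in range(steps):
        previous_weight = weight
        # Equivalent to: weight *= (1 + momentum)
        weight += momentum * previous_weight
        history.append(weight)
    return history


history = grow(1.0)
print(history[-1])  # the weight after 10 steps equals 1.5 ** 10
```

With momentum=0.5 the weight is multiplied by 1.5 on every update, so over thousands of row updates per epoch it eventually exceeds the floating-point range.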

What you should do instead is remember the whole update vector, l_rate * neuron['delta'] * inputs[j], from the previous backpropagation step and add it in. It might look like this:

velocity[j] = l_rate * neuron['delta'] * inputs[j] + momentum * velocity[j]
neuron['weights'][j] += velocity[j]

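... where velocity is an array defined in a scope wider than updateWeights and initialized with zeros. Since it is indexed per weight, it needs one entry for every weight in network (including each bias), not just one per layer. See this post for details.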
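Putting the hint together, one possible sketch of the full update is shown below. It assumes the same network layout as the question (a list of layers, each a list of neuron dicts with 'weights', 'output' and 'delta'). Storing the velocity under a 'velocity' key on each neuron, and the helper name `init_velocity`, are assumptions made here for convenience; the answer only requires that the velocity persists between calls and starts at zero.

```python
def init_velocity(network):
    """Attach a zero-filled velocity list to every neuron, one entry per weight.

    The 'velocity' key is a hypothetical addition, not part of the tutorial.
    """
    for layer in network:
        for neuron in layer:
            neuron['velocity'] = [0.0] * len(neuron['weights'])


def update_weights_momentum(network, row, l_rate, momentum=0.5):
    """Weight update with momentum, keeping the question's `+=` sign convention."""
    for i in range(len(network)):
        inputs = row[:-1]  # the last element of row is the class label
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            velocity = neuron['velocity']
            for j in range(len(inputs)):
                # New step = current gradient step + fraction of the previous step
                velocity[j] = l_rate * neuron['delta'] * inputs[j] + momentum * velocity[j]
                neuron['weights'][j] += velocity[j]
            # The bias is stored as the last weight and has an implicit input of 1
            velocity[-1] = l_rate * neuron['delta'] + momentum * velocity[-1]
            neuron['weights'][-1] += velocity[-1]
```

Call `init_velocity(network)` once after the network is created and before training starts; calling it again inside the training loop would reset the accumulated momentum.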

Regarding python - backpropagation with momentum, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47211478/
