
python - Why does my neural network's cost keep increasing?

Reposted. Author: 行者123. Updated: 2023-11-28 18:19:00

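I have implemented a neural network to predict the XOR gate. It has 1 input layer with 2 nodes, 1 hidden layer with 2 nodes, and 1 output layer with 1 node. No matter what I try, my cost keeps increasing. I have tried setting the learning rate to smaller values, but that only makes the cost increase more slowly. Any tips are appreciated.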

import numpy as np

train_data = np.array([[0,0],[0,1],[1,0],[1,1]]).T
labels = np.array([[0,1,1,0]])

def sigmoid(z,deriv = False):
    sig = 1/(1+np.exp(-z))
    if deriv == True:
        return np.multiply(sig,1-sig)
    return sig

w1 = np.random.randn(2,2)*0.01
b1 = np.zeros((2,1))

w2 = np.random.randn(1,2)*0.01
b2 = np.zeros((1,1))

iterations = 1000
lr = 0.1

for i in range(1000):

    z1 = np.dot(w1,train_data) + b1
    a1 = sigmoid(z1)

    z2 = np.dot(w2,a1) + b2
    al = sigmoid(z2) #forward_prop

    cost = np.dot(labels,np.log(al).T) + np.dot(1-labels,np.log(1-al).T)
    cost = cost*(-1/4)
    cost = np.squeeze(cost)#calcost

    dal = (-1/4) * (np.divide(labels,al) + np.divide(1-labels,1-al))
    dz2 = np.multiply(dal,sigmoid(z2,deriv = True))
    dw2 = np.dot(dz2,a1.T)
    db2 = np.sum(dz2,axis=1,keepdims = True)

    da1 = np.dot(w2.T,dz2)
    dz1 = np.multiply(da1,sigmoid(z1,deriv = True))
    dw1 = np.dot(dz1,train_data.T)
    db1 = np.sum(dz1,axis=1,keepdims = True) #backprop

    w1 = w1 - lr*dw1
    w2 = w2 - lr*dw2
    b1 = b1 - lr*db1
    b2 = b2 - lr*db2 #update params

    print(cost,'------',str(i))

Best answer

The main error is in the backpropagation through the cross-entropy (I recommend these notes for checking it). The correct formula is:

dal = -labels / al + (1 - labels) / (1 - al)
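
One way to check this formula is to compare it against a finite-difference estimate of the per-example cross-entropy. The snippet below is a minimal sketch and not part of the original answer; the helper names `cross_entropy`, `analytic_grad`, and `question_grad` and the sample values are made up for illustration. The 1/4 averaging factor is left out so that only the sign pattern is compared.

```python
import numpy as np

def cross_entropy(a, y):
    # Per-example binary cross-entropy (no averaging over the batch).
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

def analytic_grad(a, y):
    # Corrected derivative of the loss with respect to the activation a.
    return -y / a + (1 - y) / (1 - a)

def question_grad(a, y):
    # Sign pattern used in the question (both terms negated), for comparison.
    return -(y / a + (1 - y) / (1 - a))

# Hypothetical activations and labels, chosen only for this check.
a = np.array([0.2, 0.7, 0.4, 0.9])
y = np.array([0.0, 1.0, 1.0, 0.0])

# Central finite difference of the loss with respect to a.
eps = 1e-6
numeric = (cross_entropy(a + eps, y) - cross_entropy(a - eps, y)) / (2 * eps)

print(np.allclose(numeric, analytic_grad(a, y), atol=1e-5))   # True
print(np.allclose(numeric, question_grad(a, y), atol=1e-5))   # False
```

The two formulas agree where the label is 1 and differ in sign where the label is 0, so with the question's version the update for those examples moves the output in the wrong direction.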

I also simplified the code a little. Here is a complete working version:

import numpy as np

train_data = np.array([[0,0], [0,1], [1,0], [1,1]]).T
# Note: these targets encode OR; the XOR targets from the question are [0, 1, 1, 0].
labels = np.array([0, 1, 1, 1])

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w1 = np.random.randn(2,2) * 0.001
b1 = np.zeros((2,1))

w2 = np.random.randn(1,2) * 0.001
b2 = np.zeros((1,1))

lr = 0.1
for i in range(1000):
    # forward pass
    z1 = np.dot(w1, train_data) + b1
    a1 = sigmoid(z1)

    z2 = np.dot(w2, a1) + b2
    a2 = sigmoid(z2)

    # mean binary cross-entropy over the 4 examples
    cost = -np.mean(labels * np.log(a2) + (1 - labels) * np.log(1 - a2))

    # derivative of the cross-entropy with respect to the output activation
    da2 = (a2 - labels) / (a2 * (1 - a2))  # version #1
    # da2 = -labels / a2 + (1 - labels) / (1 - a2)  # version #2

    # backward pass through the output layer
    dz2 = np.multiply(da2, a2 * (1 - a2))
    dw2 = np.dot(dz2, a1.T)
    db2 = np.sum(dz2, axis=1, keepdims=True)

    # backward pass through the hidden layer
    da1 = np.dot(w2.T, dz2)
    dz1 = np.multiply(da1, a1 * (1 - a1))
    dw1 = np.dot(dz1, train_data.T)
    db1 = np.sum(dz1, axis=1, keepdims=True)

    # gradient descent update
    w1 = w1 - lr*dw1
    w2 = w2 - lr*dw2
    b1 = b1 - lr*db1
    b2 = b2 - lr*db2

    print(i, cost)
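
The two `da2` variants in the listing are algebraically the same, and multiplying either one by the sigmoid derivative `a2 * (1 - a2)` collapses `dz2` to `a2 - labels`. The following is a small sketch to confirm this numerically; the sample values are invented for illustration and are not taken from a training run.

```python
import numpy as np

# Hypothetical activations and labels, used only for this check.
a2 = np.array([[0.3, 0.6, 0.8, 0.1]])
labels = np.array([0, 1, 1, 1])

da2_v1 = (a2 - labels) / (a2 * (1 - a2))         # version #1
da2_v2 = -labels / a2 + (1 - labels) / (1 - a2)  # version #2

# Chain rule through the sigmoid output.
dz2 = da2_v1 * a2 * (1 - a2)

print(np.allclose(da2_v1, da2_v2))    # True
print(np.allclose(dz2, a2 - labels))  # True
```

Computing `dz2 = a2 - labels` directly also avoids dividing by `a2 * (1 - a2)`, which becomes very small when the output saturates near 0 or 1.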

Regarding "python - Why does my neural network's cost keep increasing?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46494984/
