
python - Neural Network MNIST


I've been studying neural networks for a while and wrote an implementation in Python with numpy. I built a very simple example with XOR and it worked well, so I thought I'd take it a step further and try the MNIST database.

Here is my problem. I'm using a network with 784 inputs, 30 hidden neurons, and 10 output neurons. The activation function of the hidden layer only ever spits out ones, so the network essentially stops learning. The math I'm doing is correct, the same implementation works fine on the XOR example, and I'm reading the MNIST set correctly, so I can't see where the problem is.

import pickle
import gzip

import numpy as np

def load_data():
    f = gzip.open('mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = pickle.load(f, encoding="latin1")
    f.close()
    return (training_data, validation_data, test_data)

def transform_output(num):
    arr = np.zeros(10)
    arr[num] = 1.0
    return arr

def out2(arr):
    return arr.argmax()


data = load_data()
training_data = data[0]
training_input = np.array(training_data[0])
training_output = [transform_output(y) for y in training_data[1]]

batch_size = 10

batch_count = int(np.ceil(len(training_input) / batch_size))

input_batches = np.array_split(training_input, batch_count)
output_batches = np.array_split(training_output, batch_count)

# Sigmoid function
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Derivative of the sigmoid, expressed in terms of its output value
def derivatives_sigmoid(x):
    return x * (1.0 - x)

# Variable initialization
epoch = 1                                     # number of training iterations
lr = 2.0                                      # learning rate
inputlayer_neurons = len(training_input[0])   # number of features in the data set
hiddenlayer_neurons = 30                      # number of hidden-layer neurons

output_neurons = len(training_output[0])      # number of neurons in the output layer

# Weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    for batch in range(batch_count):

        X = input_batches[batch]
        y = output_batches[batch]

        zh1 = np.dot(X, wh)
        zh = zh1 + bh

        # data -> hidden neurons -> activations
        ah = sigmoid(zh)

        zo1 = np.dot(ah, wout)
        zo = zo1 + bout

        output = sigmoid(zo)

        # data -> output neurons -> error
        E = y - output

        print("debugging")
        print("X")
        print(X)
        print("WH")
        print(wh)
        print("zh1")
        print(zh1)
        print("bh")
        print(bh)
        print("zh")
        print(zh)
        print("ah")
        print(ah)
        print("wout")
        print(wout)
        print("zo1")
        print(zo1)
        print("bout")
        print(bout)
        print("zo")
        print(zo)
        print("out")
        print(output)
        print("y")
        print(y)
        print("error")
        print(E)

        # data -> output neurons -> slope
        slope_out = derivatives_sigmoid(output)

        # data -> output neurons -> change of error
        d_out = E * slope_out

        # data -> hidden neurons -> error: backpropagate the output deltas
        # through the transposed hidden->output weights
        error_hidden = d_out.dot(wout.T)

        # data -> hidden neurons -> slope
        slope_h = derivatives_sigmoid(ah)

        # data -> hidden neurons -> change of error
        d_hidden = error_hidden * slope_h

        # weight and bias updates: hidden->output first, then input->hidden
        wout = wout + ah.T.dot(d_out) * lr
        bout = bout + np.sum(d_out, axis=0, keepdims=True) * lr

        wh = wh + X.T.dot(d_hidden) * lr
        bh = bh + np.sum(d_hidden, axis=0, keepdims=True) * lr
# testing results
X = np.array(data[1][0][0:10])
zh1 = np.dot(X, wh)
zh = zh1 + bh

# data -> hidden neurons -> activations
ah = sigmoid(zh)

zo1 = np.dot(ah, wout)
zo = zo1 + bout

output = sigmoid(zo)
print([out2(y) for y in output])
print(data[1][1][0:10])

So overall, the network's output is the same for every input, and training it with different batch sizes, different learning rates, and 100 epochs didn't help.
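To make the saturation concrete, here is a minimal stand-alone repro (seeded random stand-in data, not the actual MNIST pixels): with weights drawn uniformly from [0, 1) and 784 non-negative inputs, every hidden pre-activation is a sum of hundreds of positive terms.

import numpy as np

rng = np.random.default_rng(0)           # seeded stand-in data, not real MNIST pixels
x = rng.random(784)                      # one fake "image": 784 values in [0, 1)
w = rng.random((784, 30))                # uniform [0, 1) weights, as in the code above
z = x @ w                                # each entry sums ~784 positive terms
print(z.min(), z.max())                  # roughly 180-215: far out on sigmoid's flat tail
print((1.0 / (1.0 + np.exp(-z))).min())  # 1.0 to machine precision for every hidden unit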

Best Answer

The difference between the XOR and the MNIST problem is the number of classes: XOR is a binary classification, while MNIST has 10 classes.
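(A single sigmoid output suffices in the binary case because it is exactly a two-class softmax in disguise; a quick check with a made-up logit:)

import numpy as np

z = 1.3                                             # arbitrary made-up logit
sig = 1.0 / (1.0 + np.exp(-z))                      # sigmoid(z)
soft = np.exp([z, 0.0]) / np.sum(np.exp([z, 0.0]))  # softmax over the two classes [z, 0]
print(sig, soft[0])                                 # both ~0.7858: identical probabilities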

The error E you compute works for XOR because the sigmoid function can be used in the binary case. When there are more than 2 classes, you have to use the softmax function, which is a generalization of the sigmoid, together with the cross-entropy loss. Take a look at this question to see the difference. You have correctly translated y into a one-hot encoding, but output does not contain a predicted probability distribution: it actually holds a vector of 10 values, each very close to 1.0. That's why the network doesn't learn.
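A minimal sketch of what that change could look like against the question's variable names (the softmax helper and the toy batch below are illustrative assumptions, not part of the original code): with a softmax output and cross-entropy loss, the output-layer delta simplifies so the derivatives_sigmoid factor on the output disappears.

import numpy as np

def softmax(z):
    # shift each row by its max before exponentiating to avoid overflow
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# toy batch: 2 samples, 10 classes (stand-ins for zo and y in the question)
zo = np.array([[0.5, 2.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
y = np.zeros((2, 10))
y[0, 1] = 1.0                       # first sample: class 1
y[1, 3] = 1.0                       # second sample: class 3

output = softmax(zo)                # rows sum to 1: a real probability distribution
print(output.sum(axis=1))           # [1. 1.]

# cross-entropy loss, averaged over the batch
loss = -np.sum(y * np.log(output)) / len(y)

# the gradient of the loss w.r.t. zo is (output - y); with the question's
# "w = w + ... * lr" update convention the delta is therefore:
d_out = y - output                  # replaces E * derivatives_sigmoid(output)

The max-shift inside softmax matters here because pre-activations as large as the ones in the question would overflow np.exp without it.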

About python - Neural Network MNIST, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46653114/
