python - Unknown error in a neural network. Is it because matrices are not commutative?

Reposted by: 行者123. Updated: 2023-11-30 09:35:36


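I am having trouble with my first neural network, and I cannot find the source of the error.

Problem

Following the book "Make Your Own Neural Network" by Tariq Rashid, I tried to implement handwriting recognition with a neural network that classifies an image and determines which digit from 0 to 9 is written in it.

After training the network, testing shows a match of about 99% for every digit, which is obviously wrong.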

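Suspicion

In the book, the author handles the NN matrices slightly differently from me. For example, he multiplies the input-hidden weights by the inputs, whereas I multiply the inputs by the input-hidden weights.

Here is an illustration of how I do the matrix multiplication when querying the NN (feedforward):

I know that the matrix dot product is not commutative, but I cannot see where I made a mistake there.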

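Should I take a different approach, that is, transpose all the matrices and multiply them in the opposite order?
Is there a de facto standard for the dimensions of the input and output matrices, that is, should they have shape 1×n or n×1?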
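
As a minimal sketch of the two orientations being asked about (the layer sizes and variable names here are hypothetical, not from the question's code): multiplying a 1×n row vector by an (inputs, hidden) weight matrix gives the same numbers as multiplying the transposed weight matrix by an n×1 column vector. Only the shape of the result is transposed.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3  # tiny hypothetical layer sizes

# Row-vector convention (as in the question): x is 1 x n, W is (inputs, hidden).
W = rng.normal(0.0, n_in ** -0.5, (n_in, n_hidden))
x_row = rng.random((1, n_in))
hidden_row = np.dot(x_row, W)      # shape (1, n_hidden)

# Column-vector convention (weights first, as the question describes the book).
x_col = x_row.T                    # shape (n_in, 1)
hidden_col = np.dot(W.T, x_col)    # shape (n_hidden, 1)

# Same numbers, only transposed.
print(hidden_row.shape, hidden_col.shape)
print(np.allclose(hidden_row, hidden_col.T))
```

Both layouts appear in practice. What matters is that every product in the forward and backward pass uses one layout consistently.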

If this is the wrong approach, it would surely show up in the backpropagation step when training with gradient descent.
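
The same check can be sketched for the weight update, again with hypothetical sizes and random stand-in values rather than real activations. In the row-vector layout the update is the hidden output transposed times the delta. In the column-vector layout it is the delta times the hidden output transposed. The two results are transposes of each other, so the orientation by itself should not break gradient descent.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_out = 3, 2  # hypothetical sizes

hidden_out = rng.random((1, n_hidden))  # row vectors, as in the question
final_out = rng.random((1, n_out))
errors = rng.random((1, n_out))

# Error scaled by the sigmoid derivative, out * (1 - out).
delta = errors * final_out * (1.0 - final_out)

# Row-vector form: matches weights stored as (hidden, output).
update_row = np.dot(hidden_out.T, delta)   # shape (n_hidden, n_out)

# Column-vector form: same quantities transposed, delta first.
update_col = np.dot(delta.T, hidden_out)   # shape (n_out, n_hidden)

print(np.allclose(update_row, update_col.T))
```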

Source code

import numpy as np
import matplotlib.pyplot
from matplotlib.pyplot import imshow
import scipy.special as scipy
from PIL import Image

class NeuralNetwork(object):
    def __init__(self):
        self.input_neuron_count = 28*28 # One for each pixel, 28*28 = 784 in total.
        self.hidden_neuron_count = 100 # Arbitrary.
        self.output_neuron_count = 10 # One for each digit from 0 to 9.
        self.learning_rate = 0.1 # Arbitrary.

        # Sampling the weights from a normal probability distribution
        # centered around zero and with standard deviation 
        # that is related to the number of incoming links into a node,
        # 1/√(number of incoming links).
        generate_random_weight_matrix = lambda input_neuron_count, output_neuron_count: ( 
            np.random.normal(0.0,  pow(input_neuron_count, -0.5), (input_neuron_count, output_neuron_count))
        )

        self.input_x_hidden_weights = generate_random_weight_matrix(self.input_neuron_count, self.hidden_neuron_count)
        self.hidden_x_output_weights = generate_random_weight_matrix(self.hidden_neuron_count, self.output_neuron_count)

        self.activation_function = lambda value: scipy.expit(value) # Sigmoid function

    def train(self, input_array, target_array):
        inputs = np.array(input_array, ndmin=2)
        targets = np.array(target_array, ndmin=2)

        hidden_layer_input = np.dot(inputs, self.input_x_hidden_weights)
        hidden_layer_output = self.activation_function(hidden_layer_input)

        output_layer_input = np.dot(hidden_layer_output, self.hidden_x_output_weights)
        output_layer_output = self.activation_function(output_layer_input)

        output_errors = targets - output_layer_output
        self.hidden_x_output_weights += self.learning_rate * np.dot(hidden_layer_output.T, (output_errors * output_layer_output * (1 - output_layer_output)))

        hidden_errors = np.dot(output_errors, self.hidden_x_output_weights.T)
        self.input_x_hidden_weights += self.learning_rate * np.dot(inputs.T, (hidden_errors * hidden_layer_output * (1 - hidden_layer_output)))

    def query(self, input_array):
        inputs = np.array(input_array, ndmin=2)

        hidden_layer_input = np.dot(inputs, self.input_x_hidden_weights)
        hidden_layer_output = self.activation_function(hidden_layer_input)

        output_layer_input = np.dot(hidden_layer_output, self.hidden_x_output_weights)
        output_layer_output = self.activation_function(output_layer_input)

        return output_layer_output

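Reproduction (training and testing)

The original source of the training and test data is The MNIST Database. I used the CSV version downloaded from the book author's web page, The MNIST Dataset of Handwritten Digits.

Here is the code I have used so far for training and testing: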

def prepare_data(handwritten_digit_array):
    return ((handwritten_digit_array / 255.0 * 0.99) + 0.0001).flatten()

def create_target(digit_target):
    target = np.zeros(10) + 0.01
    target[digit_target] = target[digit_target] + 0.98
    return target

# Training
neural_network = NeuralNetwork()
training_data_file = open('mnist_train.csv', 'r')
training_data = training_data_file.readlines()
training_data_file.close()

for data in training_data:
    handwritten_digit_raw = data.split(',')
    # np.asfarray was removed in NumPy 2.0; np.asarray with dtype=float is equivalent.
    handwritten_digit_array = np.asarray(handwritten_digit_raw[1:], dtype=float).reshape((28, 28))
    handwritten_digit_target = int(handwritten_digit_raw[0])
    neural_network.train(prepare_data(handwritten_digit_array), create_target(handwritten_digit_target))

# Testing
test_data_file = open('mnist_test_10.csv', 'r')
test_data = test_data_file.readlines()
test_data_file.close()

for data in test_data:
    handwritten_digit_raw = data.split(',')
    handwritten_digit_array = np.asarray(handwritten_digit_raw[1:], dtype=float).reshape((28, 28))
    handwritten_digit_target = int(handwritten_digit_raw[0])
    # Scale the test pixels the same way as the training pixels before querying.
    output = neural_network.query(prepare_data(handwritten_digit_array))
    print('target', handwritten_digit_target)
    print('output', output)

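Best answer

This is one of those facepalm moments. The neural network was working as expected all along. What I have now noticed is that I was not reading the test results carefully and misread the numbers, which were printed in scientific notation.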
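
A small sketch of how such a misreading can be avoided. The output vector below is made up for illustration and is not taken from the original run. NumPy may print small values in scientific notation, and `np.printoptions(suppress=True)` switches to fixed-point output. Taking the index of the largest value with `np.argmax` removes the need to compare the numbers by eye.

```python
import numpy as np

# Hypothetical output vector for one test image (not from the original run).
output = np.array([[3.2e-04, 9.91e-01, 1.7e-03, 4.5e-05, 2.0e-03,
                    8.8e-04, 6.1e-04, 9.3e-03, 1.2e-02, 5.4e-03]])

# Default printing may use scientific notation, which is easy to misread.
print(output)

# Fixed-point printing makes the magnitudes comparable at a glance.
with np.printoptions(suppress=True, precision=4):
    print(output)

# The predicted digit is the index of the largest activation.
predicted_digit = int(np.argmax(output))
print('predicted', predicted_digit)
```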

Measured on the 10,000 test records of the MNIST database, the neural network has an accuracy of 94.01%.
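
The answer does not show how the accuracy was computed. One plausible way, sketched here with a hypothetical helper named `accuracy` and made-up demo values, is to count how often the index of the largest output equals the label.

```python
import numpy as np

def accuracy(outputs, labels):
    """Fraction of samples whose largest output matches the label."""
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predictions = np.argmax(outputs, axis=1)
    return float(np.mean(predictions == labels))

# Hypothetical outputs for three samples over three classes.
demo_outputs = [[0.9, 0.05, 0.05],
                [0.1, 0.2, 0.7],
                [0.6, 0.3, 0.1]]
demo_labels = [0, 2, 1]
print(accuracy(demo_outputs, demo_labels))
```

With the network from the question, the outputs could be collected by stacking the 1×10 results of `query` for each test row (for example with `np.vstack`) and passing them in together with the integer targets.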

Regarding "python - Unknown error in a neural network. Is it because matrices are not commutative?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43701542/
