python - Gradient descent ANN - What is MATLAB doing that I'm not?

Reposted by: 行者123 Updated: 2023-11-30 09:21:46

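I'm trying to recreate a simple MLP artificial neural network in Python using gradient descent backpropagation. My goal is to reproduce the accuracy that MATLAB's ANN achieves, but I'm not even close. I'm using the same parameters as MATLAB: the same number of hidden nodes (20), 1000 epochs, a learning rate (alpha) of 0.01, and the same data (obviously). Yet my code makes no progress in improving its results, whereas MATLAB reaches about 98% accuracy.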

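I have tried debugging through MATLAB to see what it is doing, without much luck. I believe MATLAB scales the input data to between 0 and 1 and adds a bias to the input, both of which I have done in my Python code.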
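For reference, here is a minimal sketch of those two preprocessing steps. It assumes the scaling is applied per feature (column) rather than over the whole array; whether MATLAB does exactly this is an assumption, not something established in this thread. The function names and the toy data are hypothetical.

```python
import numpy as np


def minmax_scale_columns(X):
    """Scale each column of X to the range [0, 1] independently."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Avoid division by zero for constant columns.
    col_range[col_range == 0] = 1.0
    return (X - col_min) / col_range


def add_bias_column(X):
    """Prepend a column of ones so the first weight row acts as the bias."""
    return np.hstack((np.ones((X.shape[0], 1)), X))


# Hypothetical toy data: two features on very different scales.
X_demo = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])
X_scaled = minmax_scale_columns(X_demo)
X_biased = add_bias_column(X_scaled)
```

With per-column scaling, every feature spans the full 0 to 1 range. Subtracting one global minimum and dividing by one global maximum instead leaves features with small raw values squeezed close to 0.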

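What is MATLAB doing that produces such good results? Or, more likely, what am I doing wrong in my Python code that produces such poor results? All I can think of is poor weight initialisation, reading the data in incorrectly, manipulating the data incorrectly for processing, or an incorrect/poor activation function (I have tried tanh too, with the same result).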

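My attempt is below, based on code I found online and tweaked slightly to read in my data, with the MATLAB script (just 11 lines of code) below that. At the bottom is a link to the datasets I use (which I also obtained through MATLAB):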

Thanks for any help.

Main.py

import numpy as np
import Process
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
import warnings


def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))


def sigmoid_prime(x):
    return sigmoid(x)*(1.0-sigmoid(x))


class NeuralNetwork:

    def __init__(self, layers):

        self.activation = sigmoid
        self.activation_prime = sigmoid_prime

        # Set weights
        self.weights = []
        # layers = [2,2,1]
        # range of weight values (-1,1)
        # input and hidden layers - random((2+1, 2+1)) : 3 x 3
        for i in range(1, len(layers) - 1):
            r = 2*np.random.random((layers[i-1] + 1, layers[i] + 1)) - 1
            self.weights.append(r)
        # output layer - random((2+1, 1)) : 3 x 1
        r = 2*np.random.random((layers[i] + 1, layers[i+1])) - 1
        self.weights.append(r)

    def fit(self, X, y, learning_rate, epochs):
        # Add column of ones to X
        # This is to add the bias unit to the input layer
        ones = np.atleast_2d(np.ones(X.shape[0]))
        X = np.concatenate((ones.T, X), axis=1)

        for k in range(epochs):

            i = np.random.randint(X.shape[0])
            a = [X[i]]

            for l in range(len(self.weights)):
                dot_value = np.dot(a[l], self.weights[l])
                activation = self.activation(dot_value)
                a.append(activation)
            # output layer
            error = y[i] - a[-1]
            deltas = [error * self.activation_prime(a[-1])]

            # we need to begin at the second to last layer
            # (a layer before the output layer)
            for l in range(len(a) - 2, 0, -1):
                deltas.append(deltas[-1].dot(self.weights[l].T)*self.activation_prime(a[l]))

            # reverse
            # [level3(output)->level2(hidden)]  => [level2(hidden)->level3(output)]
            deltas.reverse()

            # backpropagation
            # 1. Multiply its output delta and input activation
            #    to get the gradient of the weight.
            # 2. Subtract a ratio (percentage) of the gradient from the weight.
            for i in range(len(self.weights)):
                layer = np.atleast_2d(a[i])
                delta = np.atleast_2d(deltas[i])
                self.weights[i] += learning_rate * layer.T.dot(delta)

    def predict(self, x):
        a = np.concatenate((np.ones(1).T, np.array(x)))
        for l in range(0, len(self.weights)):
            a = self.activation(np.dot(a, self.weights[l]))
        return a

# Create neural net, 13 inputs, 20 hidden nodes, 3 outputs
nn = NeuralNetwork([13, 20, 3])
data = Process.readdata('wine')
# Split data out into input and output
X = data[0]
y = data[1]
# Normalise input data between 0 and 1.
X -= X.min()
X /= X.max()

# Split data into training and test sets (15% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)

# Create binary output form
y_ = LabelBinarizer().fit_transform(y_train)

# Train data
lrate = 0.01
epoch = 1000
nn.fit(X_train, y_, lrate, epoch)

# Test data
err = []
for e in X_test:
    # Create array of output data (argmax to get classification)
    err.append(np.argmax(nn.predict(e)))

# Hide warnings. UndefinedMetricWarning thrown when confusion matrix returns 0 in any one of the classifiers.
warnings.filterwarnings('ignore')
# Produce confusion matrix and classification report
print(confusion_matrix(y_test, err))
print(classification_report(y_test, err))

# Plot actual and predicted data
plt.figure(figsize=(10, 8))
target, = plt.plot(y_test, color='b', linestyle='-', lw=1, label='Target')
estimated, = plt.plot(err, color='r', linestyle='--', lw=3, label='Estimated')
plt.legend(handles=[target, estimated])
plt.xlabel('# Samples')
plt.ylabel('Classification Value')
plt.grid()
plt.show()

Process.py

import csv
import numpy as np


# Add constant column of 1's
def addones(arrayvar):
    return np.hstack((np.ones((arrayvar.shape[0], 1)), arrayvar))


def readdata(loc):
    # Open file and calculate the number of columns and the number of rows. The number of rows has a +1 as the 'next'
    # call used for num_cols has already passed over the first row.
    with open(loc + '.input.csv') as f:
        file = csv.reader(f, delimiter=',', skipinitialspace=True)
        num_cols = len(next(file))
        num_rows = len(list(file))+1

    # Create a zero'd array based on the number of column and rows previously found.
    x = np.zeros((num_rows, num_cols))
    y = np.zeros(num_rows)

    # INPUT #
    # Loop through the input file and put each row into a new row of 'samples'
    with open(loc + '.input.csv', newline='') as csvfile:
        file = csv.reader(csvfile, delimiter=',')
        count = 0
        for row in file:
            x[count] = row
            count += 1

    # OUTPUT #
    # Do the same and loop through the output file.
    with open(loc + '.output.csv', newline='') as csvfile:
        file = csv.reader(csvfile, delimiter=',')
        count = 0
        for row in file:
            y[count] = row[0]
            count += 1

    # Set data type
    # np.float and np.int were removed in NumPy 1.24; the built-in types are equivalent here
    x = np.array(x).astype(float)
    y = np.array(y).astype(int)

    return x, y

MATLAB script

%% LOAD DATA 
[x1,t1] = wine_dataset;

%% SET UP NN 
net = patternnet(20); 
net.trainFcn = 'traingd'; 
net.layers{2}.transferFcn = 'logsig'; 
net.derivFcn = 'logsig';

%% TRAIN AND TEST
[net,tr] = train(net,x1,t1);

The data files can be downloaded here: input output

Best answer

I think you are confusing the terms epoch and step. Having trained for one epoch usually means having run through all of the data once.

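For example: if you have 10,000 samples, then after one epoch you have put all 10,000 samples through your model (disregarding random sampling of the samples), taking one step (updating your weights) each time.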
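The arithmetic behind that distinction can be sketched as follows. The helper `total_steps` and its `batch_size` parameter are hypothetical names introduced for illustration only. The question's `fit` loop draws one random sample per iteration, so it corresponds to `batch_size=1`.

```python
def total_steps(num_samples, epochs, batch_size=1):
    """Number of weight updates when every epoch visits all samples once."""
    steps_per_epoch = -(-num_samples // batch_size)  # ceiling division
    return epochs * steps_per_epoch


# With 10,000 samples and one sample per update, one epoch is 10,000 steps.
one_epoch = total_steps(10_000, epochs=1)

# 1000 epochs is then 10,000,000 steps, whereas a loop that treats its
# counter as the number of single-sample updates takes only 1000 steps.
thousand_epochs = total_steps(10_000, epochs=1000)
```

Seen this way, a loop that runs 1000 times and updates on one sample per pass covers only a small fraction of what 1000 epochs means in this sense.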

Fix: run the network for longer:

nn.fit(X_train, y_, lrate, epoch*len(X))
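The fix above keeps drawing samples at random and simply takes more steps. An alternative is to make each epoch an explicit pass over every sample in a freshly shuffled order. The sketch below shows that idea as an assumption, not as the answer's method. The names `run_epochs` and `step_fn` are hypothetical, and the counting function only stands in for a real single-sample weight update.

```python
import numpy as np


def run_epochs(step_fn, num_samples, epochs, seed=0):
    """Call step_fn(i) once per sample per epoch, reshuffling every epoch."""
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(num_samples):
            step_fn(int(i))


# Hypothetical stand-in for a single-sample weight update: count the visits.
visits = np.zeros(5, dtype=int)


def count_visit(i):
    visits[i] += 1


run_epochs(count_visit, num_samples=5, epochs=3)
```

Each sample is visited exactly once per epoch here, whereas sampling with `np.random.randint` may visit some samples several times and others not at all within the same number of steps.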

Bonus: MATLAB's documentation translates epochs to (iterations) here, which is misleading, but comments on it here, which is basically what I wrote above.

Regarding "python - Gradient descent ANN - What is MATLAB doing that I'm not?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34098558/
