Python机器学习数字识别-6ren

Python机器学习数字识别

转载作者：行者123 更新时间：2023-11-30 09:33:46

我正在按照此站点上的代码进行操作:

https://blog.luisfred.com.br/reconhecimento-de-escrita-manual-com-redes-neurais-convolucionais/

以下是该网站遍历的代码:

from keras. datasets import mnist
from keras. models import Sequential
from keras. layers import Dense
from keras. layers import Dropout
from keras. layers import Flatten
import numpy as np
from matplotlib import pyplot as plt
from keras. layers . convolutional import Conv2D
from keras. layers . convolutional import MaxPooling2D
from keras. utils import np_utils
from keras import backend as K
K . set_image_dim_ordering ( 'th' )
import cv2
import matplotlib. pyplot as plt
#% inline matplotlib # If you are using Jupyter, it will be useful for plotting graphics or figures inside cells

#Divided the data into subsets of training and testing.
( X_train , y_train ) , ( X_test , y_test ) = mnist. load_data ( )
# Since we are working in gray scale we can
# set the depth to the value 1.
X_train = X_train . reshape ( X_train . shape [ 0 ] , 1 , 28 , 28 ) . astype ( 'float32' )
X_test = X_test . reshape ( X_test . shape [ 0 ] , 1 , 28 , 28 ) . astype ( 'float32' )
# We normalize our data according to the
# gray scale. The floating point values are in the range [0,1], instead of [.255]
X_train = X_train / 255
X_test = X_test / 255
# Converts y_train and y_test, which are class vectors, to a binary class array (one-hot vectors)
y_train = np_utils. to_categorical ( y_train )
y_test = np_utils. to_categorical ( y_test )
# Number of digit types found in MNIST. In this case, the value is 10, corresponding to (0,1,2,3,4,5,6,7,8,9).
num_classes = y_test. shape [ 1 ]


def deeper_cnn_model ( ) :
    model = Sequential ( )
    # Convolution2D will be our input layer. We can observe that it has
    # 30 feature maps with size of 5 × 5 and an activation function of type ReLU.
    model.add ( Conv2D ( 30 , ( 5 , 5 ) , input_shape = ( 1 , 28 , 28 ) , activation = 'relu' ) )
    # The MaxPooling2D layer will be our second layer where we will have a sample window of size 2 x 2
    model.add ( MaxPooling2D ( pool_size = ( 2 , 2 ) ) )
    # A new convolutional layer, with 15 feature maps of size 3 × 3, and activation function ReLU
    model.add ( Conv2D ( 15 , ( 3 , 3 ) , activation = 'relu' ) )
    # A new subsampling with a 2x2 dimension pooling.
    model.add ( MaxPooling2D ( pool_size = ( 2 , 2 ) ) )

    # We include a dropout with a 20% probability (you can try other values)
    model.add ( Dropout ( 0.2 ) )
    # We need to convert the output of the convolutional layer, so that it can be used as input to the densely connected layer that is next.
    # What this does is "flatten / flatten" the structure of the output of the convolutional layers, creating a single long vector of features
    # that will be used by the Fully Connected layer.
    model.add ( Flatten ( ) )
    # Fully connected layer with 128 neurons.
    model.add ( Dense ( 128 , activation = 'relu' ) )
    # Followed by a new fully connected layer with 64 neurons
    model.add ( Dense ( 64 , activation = 'relu' ) )

    # Followed by a new fully connected layer with 32 neurons
    model.add ( Dense ( 32 , activation = 'relu' ) )
    # The output layer has the number of neurons compatible with the
    # number of classes to be obtained. Notice that we are using a softmax activation function,
    model.add ( Dense ( num_classes, activation = 'softmax' , name = 'preds' ) )
    # Configure the entire training process of the neural network
    model.compile ( loss = 'categorical_crossentropy' , optimizer = 'adam' , metrics = [ 'accuracy' ] )

    return model


model = deeper_cnn_model ( )
model.summary ( )
model.fit ( X_train , y_train, validation_data = ( X_test , y_test ) , epochs = 10 , batch_size = 200 )
scores = model. evaluate ( X_test , y_test, verbose = 0 )
print ( "\ nacc:% .2f %%" % (scores [1] * 100))


###enhance to check multiple numbers after the training is done

img_pred = cv2. imread ( 'five.JPG' ,   0 )

plt.imshow(img_pred, cmap='gray')
# forces the image to have the input dimensions equal to those used in the training data (28x28)
if img_pred. shape != [ 28 , 28 ] :
    img2 = cv2. resize ( img_pred, ( 28 , 28 ) )
    img_pred = img2. reshape ( 28 , 28 , - 1 ) ;
else :
    img_pred = img_pred. reshape ( 28 , 28 , - 1 ) ;

# here also we inform the value for the depth = 1, number of rows and columns, which correspond 28x28 of the image.
img_pred = img_pred. reshape ( 1 , 1 , 28 , 28 )
pred = model. predict_classes ( img_pred )
pred_proba = model. predict_proba ( img_pred )
pred_proba = "% .2f %%" % (pred_proba [0] [pred] * 100)
print ( pred [ 0 ] , "with probability of" , pred_proba )

最后，我尝试对我绘制和导入的数字 5 进行预测(我也尝试过其他手绘数字，但结果同样不佳):

img_pred = cv2. imread ( 'five.JPG' ,   0 )

plt.imshow(img_pred, cmap='gray')
# forces the image to have the input dimensions equal to those used in the training data (28x28)
if img_pred. shape != [ 28 , 28 ] :
    img2 = cv2. resize ( img_pred, ( 28 , 28 ) )
    img_pred = img2. reshape ( 28 , 28 , - 1 ) ;
else :
    img_pred = img_pred. reshape ( 28 , 28 , - 1 ) ;

# here also we inform the value for the depth = 1, number of rows and columns, which correspond 28x28 of the image.
img_pred = img_pred. reshape ( 1 , 1 , 28 , 28 )
pred = model. predict_classes ( img_pred )
pred_proba = model. predict_proba ( img_pred )
pred_proba = "% .2f %%" % (pred_proba [0] [pred] * 100)
print ( pred [ 0 ] , "with probability of" , pred_proba )

这是 5.jpg:

hand drawn five image

但是当我输入自己的数字时，模型预测错误。对于为什么会这样有什么想法吗？我承认我是机器学习新手，刚刚开始涉足它。我的想法是图像的居中或图像的标准化可能已关闭？非常感谢任何帮助!

编辑1:

MNIST 测试编号将如下所示:

white numbers black backgrounds

最佳答案

您似乎有两个问题，正如您怀疑的那样，这两个问题与数据的预处理有关。

第一个是你的图像相对于训练数据是倒置的:

使用 img_pred = cv2 读取 .jpg 的一个 channel 后。 imread ( ' Five.JPG' , 0 )，背景像素接近白色，值在 215-238 附近。
如果查看 X_train 中的训练数据，背景像素全部为零，数字为白色或接近白色(上部 210-255)。

尝试在 X_train 中的一些选择旁边绘制图像，您会看到它们是倒置的。

另一个问题是 cv2.resize() 中的默认插值不会保留数据的缩放比例。调整数据大小后，最小值会跳至 60，而不是 0。比较 img.pred.min() 和 img.pred.max() 的值> 重新调整步骤之前和之后。

您可以使用如下函数反转和缩放数据，使其看起来更像 MNIST 输入数据:

 def mnist_bytescale(image):
    # Use float for rescaling
    img_temp = image.astype(np.float32)
    #Re-zero the data
    img_temp -= img_temp.min()
    #Re-scale and invert
    img_temp /= (img_temp.max()-img_temp.min())
    img_temp *= 255
    return 255 - img_temp.astype('uint')

这将翻转您的数据，并将其从 0 线性缩放到 255，就像网络正在训练的数据一样。但是，如果您绘制 mnist_bytescale(img_pred)，您会注意到大多数像素中的背景级别仍然不完全为 0，因为原始图像的背景级别不是恒定的(可能是由于 JPEG 压缩) 。)如果您的网络仍然存在此翻转和缩放数据的问题，您可以尝试使用 np.clip将背景水平清零，看看是否有帮助。

关于Python机器学习数字识别，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/49741671/

文章推荐： python - 全局步长不从 0 开始

文章推荐： java - 更改Android中首字母的字体

javascript - 正则表达式匹配字符/数字/数字/数字
fiddle :http://jsfiddle.net/rtucgv74/ 我正在尝试将第一个字符与 3 位数字匹配。所以下面的代码应该提醒f234。但反而返回 null ？源代码: var reg
asp正则表达式匹配数字$数字$数字$
复制代码代码如下: Dim strOk,strNo strOk = "12312321$12
c#数字/数字/字符串模式的正则表达式
我想找 {a number} / { a number } / {a string}模式。我可以得到number / number工作，但是当我添加 / string它不是。我试图找到的例子: 15
java - 数字.数字.数字的模式表达式？
我，我正在做一个模式正则表达式来检查字符串是否是: 数字.数字.数字，如下所示: 1.1.1 0.20.2 58.55541.5221 在java中我使用这个: private static Patt
python - 检查字符串是否包含python中的数字/数字/数字
我有一个字符串，我需要检查它是否在字符串的末尾包含一个数字/数字，并且需要将该数字/数字递增到字符串末尾 +1 我会得到下面的字符串 string2 = suppose_name_1 string3
java - (数字/数字)的正则表达式
我正在寻找一个正则表达式 (数字/数字)，如(1/2) 数字必须是 1-3 位数字。我使用 Java。我认为我的问题比正则表达式更深。我无法让这个工作 String s ="(1/15)";
typescript [数字，数字]与数字[]
谁能帮我理解为什么我在使用以下代码时会出现类型错误: function sumOfTwoNumbersInArray(a: [number, number]) { return a[0] +
google-apps-script - Apps 脚本错误 : Cannot find method getRange(number, 数字、数字、数字)
我看到有些人过去也遇到过类似的问题，但他们似乎只是不同，所以解决方案也有所不同。所以这里是: 我正在尝试在 Google Apps 脚本中返回工作表的已知尺寸范围，如下所示: var myRange
Python - 如何将此模式(数字/数字)与正则表达式匹配？
我试图了解python中的正则表达式模块。我试图让我的程序从用户输入的一行文本中匹配以下模式: 8-13 之间的数字“/” 0-15 之间的数字例如:8/2、11/13、10/9 等。我想出的模式
java - 如何将扫描仪输入拆分为(数字)(带空格的字符串)(数字)
简单地说，我当前正在开发的程序要求我拆分扫描仪输入(例如:2 个火腿和奶酪 5.5)。它应该读取杂货订单并将其分成三个数组。我应该使用 string.split 并能够将此输入分成三部分，而不管中间字
c++ - (数字)和(-数字)的含义
(number) & (-number) 是什么意思？我已经搜索过了，但无法找到含义我想在 for 循环中使用 i & (-i)，例如: for (i = 0; i 110000 .对于i没有高于
javascript - 数字 = parseInt(数字);需要从 rel 属性中获取非数字
需要将图像ID设置为数字 var number = $(this).attr('rel'); number = parseInt(number); $('#carousel .slid
typescript - Typescript 可以确保数组具有重复的类型模式吗？例如[字符串，数字，字符串，数字，....(永远)]
我有一个函数，我想确保它接受一个字符串，后跟一个数字。并且可选地，更多的字符串数字对。就像一个元组，但“无限”次: const fn = (...args: [string, number] | [s
javascript - html 输入类型更改=数字 "available"值。还将更改另一个输入类型=数字 "Total"
我想复制“可用”输入数字的更改并将其添加或减去到“总计”中如果此人将“可用”更改为“3”，则“总计”将变为“9”。如果用户将“可用”更改为“5”，则“总计”将变为“11”。 $('#id1').b
r - 如何在 R 中的(字符/数字)和(字符/数字)类型之间进行换行
我有一个与 R 中的断线相关的简单问题。我正在尝试粘贴，但在获取(字符/数字)之间的断线时遇到问题。请注意，这些值包含在向量中(V1=81,V2=55,V3=25)我已经尝试过这段代码: cat(p
c++ - 数字 xor K - K = 数字 + K xor K，为什么？
很难说出这里问的是什么。这个问题是含糊的、模糊的、不完整的、过于宽泛的或修辞性的，无法以目前的形式得到合理的回答。如需帮助澄清此问题以便重新打开它，visit the help center 。已关
angular - typescript 错误 "Argument of type ' 数字[ ]' is not assignable to parameter of type ' 数字'”
我在 Typescript 中收到以下错误: Argument of type 'number[]' is not assignable to parameter of type 'number' 我
JavaScript 数字
在本教程中，您将通过示例了解JavaScript 数字。在JavaScript中，数字是基本数据类型。例如， const a = 3; const b = 3.13; 与其他一些编程语言不同
JavaScript 数字
我在 MDN Reintroduction to JavaScript 上阅读JavaScript 数字只是浮点精度类型，JavaScript 中没有整数。然而 JavaScript 有两个函数，pa
Excel编程自动完成部分输入(数字)
我们在 Excel 中管理库存。我知道这有点过时，但我们正在发展商业公司，我们所有的钱都被困在业务上，没有钱投资 IT。所以我想知道我可以用Excel自动完成产品编号的方式进行编程吗？这是一个产品

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

Python机器学习数字识别