49. Handwritten Character Recognition Neural Network#
49.1. Introduction#
In the previous experiment, we introduced the principle of artificial neural networks starting from perceptrons and elaborated in detail on the process of backpropagation in neural networks through theoretical derivations. In this challenge, we will combine the implementation method of artificial neural networks provided by scikit-learn to complete handwritten character recognition.
49.2. Key Points#
- Artificial Neural Network
- Handwritten Character Recognition
49.3. Overview of Handwritten Character Datasets#
In this challenge, we will use the handwritten character dataset DIGITS. The full name of this dataset is Pen-Based Recognition of Handwritten Digits Data Set, which is sourced from the UCI Open Datasets website.
The dataset contains digital matrices converted from 1,797 handwritten character images of the digits 0 to 9, and the target values are 0 to 9. For convenience, here we directly use the `load_digits` method provided by scikit-learn to load this dataset.
import numpy as np
from sklearn import datasets
digits = datasets.load_digits()
The loaded DIGITS dataset contains three attributes:
| Attribute | Description |
| --- | --- |
| `images` | 8x8 matrices recording the pixel grayscale values of each handwritten character image |
| `data` | The 8x8 matrix of `images` flattened into a 1x64 row vector |
| `target` | The digit represented by each of the 1,797 images |
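As a quick sanity check (an addition for illustration, not part of the original text), we can print the shapes of these three attributes to confirm the structure described above:
# 1,797 images, each stored as an 8x8 grayscale matrix
print(digits.images.shape)  # (1797, 8, 8)
# The same images flattened into 64-dimensional row vectors
print(digits.data.shape)  # (1797, 64)
# One digit label (0 - 9) per image
print(digits.target.shape)  # (1797,)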
Next, we output the first handwritten character for viewing.
The digit corresponding to the first character image:
digits.target[0]
0
The grayscale value matrix corresponding to the first character image:
digits.images[0]
array([[ 0., 0., 5., 13., 9., 1., 0., 0.],
[ 0., 0., 13., 15., 10., 15., 5., 0.],
[ 0., 3., 15., 2., 0., 11., 8., 0.],
[ 0., 4., 12., 0., 0., 8., 8., 0.],
[ 0., 5., 8., 0., 0., 9., 8., 0.],
[ 0., 4., 11., 0., 1., 12., 7., 0.],
[ 0., 2., 14., 5., 10., 12., 0., 0.],
[ 0., 0., 6., 13., 10., 0., 0., 0.]])
Flatten the matrix into a row vector:
digits.data[0]
array([ 0., 0., 5., 13., 9., 1., 0., 0., 0., 0., 13., 15., 10.,
15., 5., 0., 0., 3., 15., 2., 0., 11., 8., 0., 0., 4.,
12., 0., 0., 8., 8., 0., 0., 5., 8., 0., 0., 9., 8.,
0., 0., 4., 11., 0., 1., 12., 7., 0., 0., 2., 14., 5.,
10., 12., 0., 0., 0., 0., 6., 13., 10., 0., 0., 0.])
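To confirm the relationship between `images` and `data` (a small verification added here for illustration), we can check that flattening the first matrix reproduces the row vector exactly:
# data[0] should equal images[0] reshaped into a length-64 vector
np.array_equal(digits.images[0].ravel(), digits.data[0])  # True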
Raw numbers are not very intuitive, so based on the grayscale value matrix, we can use Matplotlib to draw the grayscale image corresponding to the character.
from matplotlib import pyplot as plt
%matplotlib inline
image1 = digits.images[0]
plt.imshow(image1, cmap=plt.cm.gray_r)
It can be clearly seen from the above image that it is the handwritten character 0.
{exercise-start}
:label: chapter06_03_1
Challenge: Use the \(1 \times 5\) subplot style to draw the images of the first 5 handwritten characters in the Digits dataset.
{exercise-end}
### Code start ### (3 - 5 lines of code)
### Code end ###
{solution-start} chapter06_03_1
:class: dropdown
### Code start ### (3 - 5 lines of code)
fig, axes = plt.subplots(1, 5, figsize=(12, 4))
for i, image in enumerate(digits.images[:5]):
    axes[i].imshow(image, cmap=plt.cm.gray_r)
### Code end ###
{solution-end}
Expected output
Next, we need to randomly split the dataset into a training set and a test set for future use.
{exercise-start}
:label: chapter06_03_2
Challenge: Use `train_test_split()` to split the dataset into 80% (training set) and 20% (test set).
Requirement: Define the training set features, training set targets, test set features, and test set targets as `X_train`, `y_train`, `X_test`, and `y_test` respectively, and set the random seed to 30.
{exercise-end}
### Code start ### (≈ 2 lines of code)
### Code end ###
{solution-start} chapter06_03_2
:class: dropdown
### Code start ### (≈ 2 lines of code)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.2, random_state=30)
### Code end ###
{solution-end}
Run the test
len(X_train), len(y_train), len(X_test), len(y_test), np.mean(y_test[5:13])
Expected output
(1437, 1437, 360, 360, 3.75)
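As an optional sanity check (an addition for illustration), we can also look at the label distribution of the training split, which should be roughly balanced across the ten digits:
# Number of training samples for each digit 0 - 9
print(np.bincount(y_train))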
49.4. Building an Artificial Neural Network with scikit-learn#
The `MLPClassifier()` class in scikit-learn implements a multi-layer neural network structure with the backpropagation algorithm.
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(100, ), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
This class has many parameters. We introduce the main parameters as follows:
- hidden_layer_sizes: Defines the number of hidden layers and the number of neurons they contain. (20, 20) represents 2 hidden layers with 20 neurons each.
- activation: The activation function, with options including identity (linear), logistic, tanh, and relu.
- solver: The solution method, with options including lbfgs (quasi-Newton method), sgd (stochastic gradient descent), and adam (an improved sgd). adam works well on relatively large datasets (thousands of samples); for small datasets, lbfgs converges faster and gives good results.
- alpha: The regularization term parameter.
- learning_rate: The learning rate adjustment strategy, with options including constant (unchanged), invscaling (gradually decreasing), and adaptive.
- learning_rate_init: The initial learning rate, used to update the weights during stochastic gradient descent.
- max_iter: The maximum number of iterations.
- shuffle: Whether to reshuffle the samples at each iteration.
- random_state: The random number seed.
- tol: The tolerance for the optimization. When the decrease in loss between two iterations is less than this tolerance, the model is considered to have converged and training stops.
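To see how these parameters fit together (a minimal sketch added for illustration; the specific values here are arbitrary choices, not the challenge settings), the snippet below instantiates a small network on the training split created above:
from sklearn.neural_network import MLPClassifier

# Example only: 2 hidden layers of 20 neurons each, tanh activation,
# solved with lbfgs, which converges quickly on small datasets like DIGITS
demo_model = MLPClassifier(hidden_layer_sizes=(20, 20), activation='tanh',
                           solver='lbfgs', random_state=1)
demo_model.fit(X_train, y_train)  # train on the 80% split from above
demo_model.score(X_test, y_test)  # mean accuracy on the test set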
Next, we are going to build a neural network prediction model using `MLPClassifier()`.
Exercise 49.1
Challenge: Use `MLPClassifier()` to build a neural network structure, train a handwritten character recognition model, and finally obtain the prediction accuracy on the test set.
Requirements:
- The neural network structure contains 2 hidden layers, with 100 and 50 neurons respectively.
- Use relu as the activation function.
- Use the stochastic gradient descent (SGD) method for solution.
- The learning rate is 0.02 and remains unchanged during the learning process.
- The maximum number of iterations is 100.
- Set the random number seed to 1.
- Use default values for the remaining parameters.
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
def mpl():
    """
    Parameters: None
    Returns:
    model -- The artificial neural network model
    score -- The prediction accuracy on the test set
    """
    ### Code start ### (≈ 2 lines of code)
    model = None
    score = None
    ### Code end ###
    return model, score
Solution to Exercise 49.1
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
def mpl():
    """
    Parameters: None
    Returns:
    model -- The artificial neural network model
    score -- The prediction accuracy on the test set
    """
    ### Code start ### (≈ 2 lines of code)
    model = MLPClassifier(
        hidden_layer_sizes=(100, 50),
        activation='relu',
        solver='sgd',
        learning_rate_init=0.02,
        learning_rate='constant',
        max_iter=100,
        random_state=1,
    )
    model.fit(X_train, y_train)
    score = accuracy_score(y_test, model.predict(X_test))
    ### Code end ###
    return model, score
Run the test
mpl()[1]
Expected output
> 0.95
A neural network trained with the above parameters generally reaches an accuracy of about 98%. We can also output some attributes of the `model`, such as the number of iterations, and plot how the loss changes during training.
# Plot the loss curve during training
model = mpl()[0]
plt.plot(model.loss_curve_)
# Output the number of iterations at which the model converged
model.n_iter_
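Beyond `n_iter_` and `loss_curve_`, a fitted `MLPClassifier` exposes several other attributes. The short sketch below (an illustrative addition) prints a few of them for the model trained above:
# Total number of layers: input + 2 hidden + output = 4
print(model.n_layers_)
# Final value of the training loss
print(model.loss_)
# Shapes of the weight matrices between consecutive layers
print([coef.shape for coef in model.coefs_])  # [(64, 100), (100, 50), (50, 10)]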