Convolutional neural network study notes (using handwritten digit recognition as an example)

1, Key concepts

1. Difference between convolutional neural network and artificial neural network

A traditional artificial neural network has only an input layer, hidden layers and an output layer; the number of hidden layers depends on the task. Its pipeline is feature -> value, and the features are selected by hand. A convolutional neural network extends this multilayer network with a feature-learning part: partially connected convolution layers and pooling layers are placed in front of the original fully connected layers. Its pipeline is signal -> feature -> value, and the network learns the features itself.

2. Basic composition of convolutional neural network

Convolution layer: a convolution is the inner product of a local patch of the image with a convolution kernel, and it extracts local features of the image. A convolution kernel, also called a filter, is a group of neurons whose weights are the same at every position of the image (the weights themselves are learned during training); each filter extracts one specific feature. The extracted features are generally called a feature map. A convolution layer is made up of multiple filters stacked together.

Each filter has its own set of weights. When a filter slides to a position, it multiplies the local data by its weights element by element, sums the products, and adds a bias to get the filter's output at that position.
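The sliding computation described above can be sketched in plain NumPy. This is only an illustration, assuming a single-channel image, stride 1 and no padding; the function name conv2d_single and the sample values are hypothetical and are not part of the original code. Like the Conv2D layer in Keras, the sketch does not flip the kernel.

```python
import numpy as np

def conv2d_single(image, kernel, bias=0.0):
    """Slide one filter over a 2-D image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # inner product of the local patch and the filter weights, plus the bias
            feature_map[i, j] = np.sum(patch * kernel) + bias
    return feature_map

# hypothetical 4x4 "image" and 2x2 filter
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
result = conv2d_single(image, kernel, bias=0.5)
print(result.shape)  # a 4x4 input and a 2x2 filter give a 3x3 feature map
print(result)
```

A real convolution layer runs many such filters in parallel and stacks their feature maps along the channel axis.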

Activation function: the convolution layer is followed by an activation function, and the output of the layer is the result after the activation function has been applied.

Pooling layer: the pooling layer shrinks the feature maps, which reduces the number of parameters in the final fully connected layers. Common pooling methods are average pooling, which takes the mean of each image region as the pooled value of that region, and max pooling, which takes the maximum of each image region as the pooled value of that region. An activation function can also be applied after the pooling layer.
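The two pooling methods can be illustrated with a short NumPy sketch. It assumes non-overlapping windows (stride equal to the window size) on a single feature map; the function name pool2d and the sample matrix are hypothetical.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride = size)."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # crop any leftover border, then group the pixels into size x size blocks
    cropped = feature_map[:out_h * size, :out_w * size]
    blocks = cropped.reshape(out_h, size, out_w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

fm = np.array([[1.0, 2.0, 5.0, 6.0],
               [3.0, 4.0, 7.0, 8.0],
               [9.0, 10.0, 13.0, 14.0],
               [11.0, 12.0, 15.0, 16.0]])
max_pooled = pool2d(fm, mode="max")      # block maxima: 4, 8, 12, 16
avg_pooled = pool2d(fm, mode="average")  # block means: 2.5, 6.5, 10.5, 14.5
print(max_pooled)
print(avg_pooled)
```

Either way a 4x4 feature map becomes 2x2, so the layers that follow see a quarter of the values.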

The number of convolution layers and pooling layers is chosen according to the task.

Fully connected layer: the fully connected layers sit at the tail of the convolutional neural network. They are connected in the same way as in a traditional artificial neural network and act as the "classifier".

3. Stochastic Gradient Descent (SGD) optimization method with Momentum parameter
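
As a rough sketch of what the momentum parameter does: the optimizer keeps a running "velocity" that accumulates past gradients, and the weights move along that velocity rather than along the current gradient alone. The update below follows the rule given in the Keras SGD documentation (velocity = momentum * velocity - learning_rate * gradient, then w = w + velocity). The helper name sgd_momentum_step and the toy function f(w) = w**2 are hypothetical; the default values match the lr = 0.01 and momentum = 0.9 used in the code further down.

```python
def sgd_momentum_step(w, grad, velocity, learning_rate=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single weight."""
    # the velocity remembers a decaying sum of past gradients
    velocity = momentum * velocity - learning_rate * grad
    # the weight moves along the velocity
    w = w + velocity
    return w, velocity

# toy problem: minimize f(w) = w**2, whose gradient is 2 * w
w, v = 5.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2.0 * w, v)
print(w)  # ends up close to the minimum at 0
```

With momentum = 0 this reduces to plain gradient descent.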

2, Building a convolutional neural network with TensorFlow and Keras

The code is as follows:

from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import utils
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD

#load train & test dataset
def load_dataset():
    (X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
    #reshape dataset to have a single channel
    X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
    X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))

    #one hot representation
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)

    return X_train, y_train, X_test, y_test

#scale pixels
def pre_pixels(train, test):
    #convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')

    # normalize inputs from 0-255 to 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0

    return train_norm, test_norm

#define a model
def define_model():
    model = Sequential()
    #add a convolutional layer
    model.add(Conv2D(8, (3,3), activation = 'relu', kernel_initializer = 'he_uniform', input_shape = (28,28,1)))
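    #add a pooling layer (this line was missing from the listing; a 2x2 window is assumed)
    model.add(MaxPooling2D((2, 2)))
    #output size of each layer = (input - kernel + 2 * padding) / stride + 1
    #flatten the feature maps into a vector for the dense layers
    model.add(Flatten())
    #add a hidden layer
    model.add(Dense(120, activation = 'relu', kernel_initializer = 'he_uniform'))
    #add an output layer
    model.add(Dense(10, activation = 'softmax'))

    #compile the model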
    opt = SGD(lr = 0.01, momentum = 0.9)
    model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    return model

#evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds = 5):
    scores, histories = list(), list()
    #prepare cross validation
    kfold = KFold(n_folds, shuffle = True, random_state = 1)

    for train_ix, test_ix in kfold.split(dataX):
        model = define_model()
        train_x, train_y, test_x, test_y = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        history = model.fit(train_x, train_y, epochs = 10, batch_size = 60, validation_data = (test_x, test_y), verbose = 0)
        #evaluate model
        _, acc = model.evaluate(test_x, test_y, verbose = 0)
        print('> acc: %.3f' % (acc * 100.0))
        #store scores and training histories
        scores.append(acc)
        histories.append(history)
    print('scores', scores)
    print('histories,len', len(histories))
    return scores, histories

#plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        #plot loss in the upper panel
        pyplot.subplot(2, 1, 1)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color = 'blue', label = 'train')
        pyplot.plot(histories[i].history['val_loss'], color = 'orange', label = 'test')
        pyplot.legend(['train', 'test'], loc = 'upper right')

        #plot accuracy in the lower panel
        pyplot.subplot(2, 1, 2)
        pyplot.title('Classification Accuracy')
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
        pyplot.legend(['train', 'test'], loc='upper right')
    #display the figure once all folds have been drawn
    pyplot.show()

#summarize performance of the model
def summarize_performance(scores):
    #print summary
    print('Accuracy: mean = %.3f std = %.3f, n=%d' % (mean(scores) * 100, std(scores) * 100, len(scores)))

#run the test harness for evaluating a model
def run_mymodel_test():
    #load dataset
    X_train, y_train, X_test, y_test = load_dataset()
    #scale pixels
    X_train, X_test = pre_pixels(X_train, X_test)
    #evaluate a model
    scores, histories = evaluate_model(X_train, y_train)
    # plot diagnostic learning curves
    summarize_diagnostics(histories)
    # summarize performance of the model
    summarize_performance(scores)

#Main program entry
run_mymodel_test()
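
The output-size formula quoted in the comment inside define_model() can be checked with a few lines of arithmetic. This is a sketch, assuming the 3x3 convolution has no padding and stride 1 (the Conv2D defaults) and that the pooling layer uses a 2x2 window with stride 2, which is an assumption; the helper name output_size is hypothetical.

```python
def output_size(input_size, kernel, padding=0, stride=1):
    """(input - kernel + 2 * padding) / stride + 1, using integer division."""
    return (input_size - kernel + 2 * padding) // stride + 1

conv_out = output_size(28, kernel=3)                  # 28x28 input, 3x3 filter
pool_out = output_size(conv_out, kernel=2, stride=2)  # assumed 2x2 pooling
flat = pool_out * pool_out * 8                        # 8 filters in the Conv2D layer
print(conv_out, pool_out, flat)                       # 26 13 1352
```

Under these assumptions the first dense layer receives a vector of 1352 values per image.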

Operation results:

3, Problems encountered and solutions

When I first ran this code, the following error was reported:

tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

The solution suggested online is to add the following code at the beginning of the program:

physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)

After I added it, the error changed to:

ValueError: Memory growth cannot differ between GPU devices

The solution suggested online for this error is to add the following code at the beginning of the program:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
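
For context, this ValueError typically appears when several GPUs are visible and memory growth has been enabled on only one of them, as physical_devices[0] does above. Hiding the other GPUs with CUDA_VISIBLE_DEVICES is one way around it. Another, sketched below, is to enable memory growth on every visible GPU. The helper name enable_memory_growth_on_all_gpus is hypothetical, and the sketch assumes a TensorFlow version that provides tf.config.experimental, as the earlier snippet does. It only addresses the ValueError, not the original cuDNN error.

```python
def enable_memory_growth_on_all_gpus():
    """Turn on memory growth for every visible GPU; returns how many were found."""
    # imported inside the function so the helper can be defined without loading TensorFlow
    import tensorflow as tf
    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        # the setting must be the same on all GPUs, so apply it to each one
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)

if __name__ == "__main__":
    # call this at the very start of the program, before any model is built
    print("GPUs configured:", enable_memory_growth_on_all_gpus())
```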

After running with this setting, the error above disappeared, but the original error came back. I had gone in a circle and returned to the starting point.

I realized this approach would not work; after all, everyone's versions and configuration may be different. I went back to read the full error output and, sure enough, found this line:

Loaded runtime CuDNN library: 7.1.4 but source was compiled with: 7.6.0.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

Is the cuDNN version incompatible? I looked for a way to upgrade cuDNN and found one, but I did not use it, because I also found this post: [cudnn error resolution] loaded runtime cudnn Library: 7.0.5, but source was compiled with: 7.2.1. Fortunately I had not rushed to upgrade cuDNN first; if the upgrade had not solved the problem, there would have been no end to it. The main fix described there is to lower the TensorFlow version, so I simply switched to a virtual environment with TensorFlow 1.12.0 and left cuDNN as it was. That solved the problem.

Tags: Python Deep Learning Convolutional Neural Networks

Posted on Thu, 14 Oct 2021 18:42:42 -0400 by flyankur