Contents
1. Training the CNN convolutional neural network
5. Build CNN convolutional neural network
5-1. First layer: the first convolution layer
5-2. Second layer: the second convolution layer
5-4. The third layer: the first fully connected layer
5-5. Layer 4: the second fully connected layer (output layer)
2. Recognizing your own handwritten digits (images)
3. Load your own handwritten digit image and set its size
5. Invert to white-on-black and normalize
6. Convert to four-dimensional data
Basic theory
First layer: convolution layer.
Second layer: convolution layer.
Third layer: fully connected layer.
Fourth layer: output layer.
The original handwritten digit images are 28 × 28 and black and white, so the number of channels is 1 and the input shape is 28 × 28 × 1. For a color image the number of channels would be 3.
The network structure is a 4-layer convolutional neural network (when counting layers, only layers with weights count as a layer; a pooling layer is not counted separately, and here the pooling is grouped with its convolution layer).
Convolving with multiple filters is equivalent to extracting multiple feature maps at the same time.
The more feature maps there are, the more features the convolutional network extracts. If the number of feature maps is set too low, underfitting is likely; if it is set too high, overfitting is likely, so it needs to be set to an appropriate value.
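As a quick check of the sizes used later: with 'same' padding and stride 1 a convolution keeps the spatial size, and each 2 × 2 max pool with stride 2 halves it, so the maps go 28 → 14 → 7. A minimal sketch of that arithmetic (plain Python, no TensorFlow needed):

import math

size = 28
for _ in range(2):               # two convolution + pooling blocks
    size = math.ceil(size / 1)   # 'same' convolution with stride 1: size unchanged
    size = math.ceil(size / 2)   # 2 x 2 max pooling with stride 2: size halved
print(size)                      # 7 -> the flattened vector has 7 * 7 * 64 = 3136 values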
1. Training the CNN convolutional neural network
1. Load data
# 1. Load data
mnist = tf.keras.datasets.mnist
(train_data, train_target), (test_data, test_target) = mnist.load_data()
2. Change data dimension
Note: in TensorFlow, the data must be reshaped into a 4-dimensional format before convolution. The four dimensions are: number of samples, image height, image width, and number of channels.
# 2. Change data dimension
# Note: TensorFlow expects 4-dimensional input for convolution:
# (number of samples, image height, image width, number of channels)
train_data = train_data.reshape(-1, 28, 28, 1)
test_data = test_data.reshape(-1, 28, 28, 1)
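A quick sanity check of the resulting shapes (MNIST has 60,000 training and 10,000 test images):

print(train_data.shape)  # (60000, 28, 28, 1)
print(test_data.shape)   # (10000, 28, 28, 1)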
3. Normalization
# 3. Normalization (helps speed up training)
train_data = train_data / 255.0
test_data = test_data / 255.0
4. One-hot encoding
# 4. One-hot encoding (10 classes)
train_target = tf.keras.utils.to_categorical(train_target, num_classes=10)
test_target = tf.keras.utils.to_categorical(test_target, num_classes=10)
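For example, the label 3 becomes a length-10 vector with a 1 at index 3:

print(tf.keras.utils.to_categorical([3], num_classes=10))
# [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]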
5. Build CNN convolutional neural network
model = Sequential()
5-1. First layer: the first convolution layer
The first layer: convolution layer + pooling layer.
# 5-1. First layer: convolution layer + pooling layer
# First convolution layer
model.add(Convolution2D(input_shape=(28, 28, 1),  # input shape
                        filters=32,               # number of filters
                        kernel_size=5,            # convolution kernel size
                        strides=1,                # stride
                        padding='same',           # 'same' padding
                        activation='relu'))       # activation function
# First pooling layer (max pooling)
model.add(MaxPooling2D(pool_size=2,      # pooling window size
                       strides=2,        # stride
                       padding='same'))  # padding method
5-2. Second layer: second convolution layer
# 5-2. Second layer: convolution layer + pooling layer
# Second convolution layer
model.add(Convolution2D(64, 5, strides=1, padding='same', activation='relu'))
# 64: number of filters, 5: convolution kernel size
# Second pooling layer
model.add(MaxPooling2D(2, 2, 'same'))
5-3. Flattening
Flatten reshapes each sample's 7 × 7 × 64 feature maps into a single vector of length 7 * 7 * 64 = 3136; for a batch of 64 images, the data shape changes from (64, 7, 7, 64) to (64, 3136):
# 5-3. Flattening: (batch, 7, 7, 64) -> (batch, 7 * 7 * 64)
model.add(Flatten())
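A quick check of the flattened size (assuming the layers above have already been added to model):

print(model.output_shape)  # (None, 3136), because 7 * 7 * 64 = 3136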
5-4. The third layer: the first fully connected layer
# 5-4. The third layer: the first fully connected layer
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))
5-5. Layer 4: the second fully connected layer (output layer)
# 5-5. Layer 4: the second fully connected layer (output layer)
model.add(Dense(10, activation='softmax'))  # 10 output neurons, one per digit class
6. Compile
Set the optimizer, loss function, and evaluation metric.
# 6. Compile
model.compile(optimizer=Adam(learning_rate=1e-4),  # optimizer (Adam); 'learning_rate' replaces the older 'lr' argument
              loss='categorical_crossentropy',     # loss function (cross-entropy)
              metrics=['accuracy'])                # evaluation metric
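As a side note, if the labels had been left as integers (skipping the one-hot step), the sparse variant of the loss could be used instead; a minimal sketch:

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='sparse_categorical_crossentropy',  # expects integer labels instead of one-hot vectors
              metrics=['accuracy'])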
7. Training
# 7. Training
model.fit(train_data, train_target, batch_size=64, epochs=10,
          validation_data=(test_data, test_target))
8. Save model
# 8. Save model
model.save('mnist.h5')
Training output (938 steps per epoch: 60,000 training images / batch size of 64 ≈ 938):
Epoch 1/10
938/938 [==============================] - 142s 151ms/step - loss: 0.3319 - accuracy: 0.9055 - val_loss: 0.0895 - val_accuracy: 0.9728
Epoch 2/10
938/938 [==============================] - 158s 169ms/step - loss: 0.0911 - accuracy: 0.9721 - val_loss: 0.0515 - val_accuracy: 0.9830
Epoch 3/10
938/938 [==============================] - 146s 156ms/step - loss: 0.0629 - accuracy: 0.9807 - val_loss: 0.0389 - val_accuracy: 0.9874
Epoch 4/10
938/938 [==============================] - 120s 128ms/step - loss: 0.0498 - accuracy: 0.9848 - val_loss: 0.0337 - val_accuracy: 0.9889
Epoch 5/10
938/938 [==============================] - 119s 127ms/step - loss: 0.0424 - accuracy: 0.9869 - val_loss: 0.0273 - val_accuracy: 0.9898
Epoch 6/10
938/938 [==============================] - 129s 138ms/step - loss: 0.0338 - accuracy: 0.9897 - val_loss: 0.0270 - val_accuracy: 0.9907
Epoch 7/10
938/938 [==============================] - 124s 133ms/step - loss: 0.0302 - accuracy: 0.9904 - val_loss: 0.0234 - val_accuracy: 0.9917
Epoch 8/10
938/938 [==============================] - 132s 140ms/step - loss: 0.0264 - accuracy: 0.9916 - val_loss: 0.0240 - val_accuracy: 0.9913
Epoch 9/10
938/938 [==============================] - 139s 148ms/step - loss: 0.0233 - accuracy: 0.9926 - val_loss: 0.0235 - val_accuracy: 0.9919
Epoch 10/10
938/938 [==============================] - 139s 148ms/step - loss: 0.0208 - accuracy: 0.9937 - val_loss: 0.0215 - val_accuracy: 0.9924
After 10 epochs the validation accuracy exceeds 99%, which is quite good.
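To confirm the final accuracy, the saved model can also be evaluated on the test set directly (a minimal sketch, assuming test_data and test_target are still preprocessed as above):

from tensorflow.keras.models import load_model

model = load_model('mnist.h5')
loss, acc = model.evaluate(test_data, test_target, verbose=0)
print(f'test loss: {loss:.4f}, test accuracy: {acc:.4f}')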
Full training code:
# Handwritten digit recognition -- CNN training
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Convolution2D, MaxPooling2D, Flatten
from tensorflow.keras.optimizers import Adam

# 1. Load data
mnist = tf.keras.datasets.mnist
(train_data, train_target), (test_data, test_target) = mnist.load_data()

# 2. Change data dimension
# TensorFlow expects 4-dimensional input for convolution:
# (number of samples, image height, image width, number of channels)
train_data = train_data.reshape(-1, 28, 28, 1)
test_data = test_data.reshape(-1, 28, 28, 1)

# 3. Normalization (helps speed up training)
train_data = train_data / 255.0
test_data = test_data / 255.0

# 4. One-hot encoding (10 classes)
train_target = tf.keras.utils.to_categorical(train_target, num_classes=10)
test_target = tf.keras.utils.to_categorical(test_target, num_classes=10)

# 5. Build the CNN
model = Sequential()

# 5-1. Layer 1: convolution layer + pooling layer
model.add(Convolution2D(input_shape=(28, 28, 1), filters=32, kernel_size=5,
                        strides=1, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))

# 5-2. Layer 2: convolution layer + pooling layer
model.add(Convolution2D(64, 5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling2D(2, 2, 'same'))

# 5-3. Flattening: (batch, 7, 7, 64) -> (batch, 3136)
model.add(Flatten())

# 5-4. Layer 3: the first fully connected layer
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))

# 5-5. Layer 4: the second fully connected layer (output layer)
model.add(Dense(10, activation='softmax'))

# 6. Compile: optimizer (Adam), loss (cross-entropy), metric (accuracy)
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# 7. Training
model.fit(train_data, train_target, batch_size=64, epochs=10,
          validation_data=(test_data, test_target))

# 8. Save model
model.save('mnist.h5')
2. Recognizing your own handwritten digits (images)
1. Load data
# 1. Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Example image from the dataset:
2. Load the trained model
# 2. Load the trained model
model = load_model('mnist.h5')
3. Load your own handwritten digit image and set its size
# 3. Load your own handwritten digit image and set its size
img = Image.open('6.jpg')
# Resize to match the dataset images (28 x 28)
img = img.resize((28, 28))
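If the source photo is much larger than 28 × 28, a smoother resampling filter can preserve the stroke better; a small optional variant (Image.LANCZOS is Pillow's high-quality filter; newer Pillow versions also expose it as Image.Resampling.LANCZOS):

img = img.resize((28, 28), Image.LANCZOS)  # downscale with Lanczos resampling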
4. Convert to grayscale
# 4. Convert to grayscale
gray = np.array(img.convert('L'))  # convert('L'): convert to a grayscale image
The grayscale image has a dark digit on a light background, which is the opposite of the MNIST images (white digits on a black background), so we invert it:
5. Invert to white-on-black and normalize
After preprocessing, the MNIST data are white digits on a black background with values between 0 and 1, so the grayscale image is inverted and divided by 255.
# 5. Invert to white-on-black and normalize
gray_inv = (255 - gray) / 255.0
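If the photo background is not pure white, the inverted image may keep faint gray noise that the model never saw during training; one optional clean-up is to zero out low values (the 0.5 cutoff here is an arbitrary assumption, adjust it for your image):

gray_inv = np.where(gray_inv > 0.5, gray_inv, 0.0)  # suppress faint background noise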
6. Convert to four-dimensional data
The CNN expects four-dimensional input for prediction: (batch size, height, width, channels).
# 6. Convert to four-dimensional data (required for CNN prediction)
image = gray_inv.reshape((1, 28, 28, 1))
7. Predict
# 7. Predict
prediction = model.predict(image)            # class probabilities
prediction = np.argmax(prediction, axis=1)   # index of the maximum = predicted digit
print('Prediction result:', prediction)
8. Display image
# 8. Display
# Set up the plt figure
f, ax = plt.subplots(2, 2, figsize=(5, 5))
# Dataset image
ax[0][0].set_title('train_model')
ax[0][0].axis('off')
ax[0][0].imshow(x_train[18], 'gray')
# Original image
ax[0][1].set_title('img')
ax[0][1].axis('off')
ax[0][1].imshow(img, 'gray')
# Grayscale image (dark digit on a light background)
ax[1][0].set_title('gray')
ax[1][0].axis('off')
ax[1][0].imshow(gray, 'gray')
# Inverted image (white digit on a black background) with the predicted digit
ax[1][1].set_title(f'predict:{prediction}')
ax[1][1].axis('off')
ax[1][1].imshow(gray_inv, 'gray')
plt.show()
Result: the figure shows the dataset sample, the original image, the grayscale image, and the inverted image labeled with the predicted digit.
Full prediction code:
# Recognize your own handwritten digits (image prediction)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.keras.models import load_model
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np

# 1. Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# 2. Load the trained model
model = load_model('mnist.h5')

# 3. Load your own handwritten digit image and set its size
img = Image.open('5.jpg')
# Resize to match the dataset images (28 x 28)
img = img.resize((28, 28))

# 4. Convert to grayscale
gray = np.array(img.convert('L'))  # convert('L'): convert to a grayscale image

# 5. Invert to white-on-black and normalize
gray_inv = (255 - gray) / 255.0

# 6. Convert to four-dimensional data (required for CNN prediction)
image = gray_inv.reshape((1, 28, 28, 1))

# 7. Predict
prediction = model.predict(image)            # class probabilities
prediction = np.argmax(prediction, axis=1)   # index of the maximum = predicted digit
print('Prediction result:', prediction)

# 8. Display
f, ax = plt.subplots(2, 2, figsize=(5, 5))
# Dataset image
ax[0][0].set_title('train_model')
ax[0][0].axis('off')
ax[0][0].imshow(x_train[18], 'gray')
# Original image
ax[0][1].set_title('img')
ax[0][1].axis('off')
ax[0][1].imshow(img, 'gray')
# Grayscale image (dark digit on a light background)
ax[1][0].set_title('gray')
ax[1][0].axis('off')
ax[1][0].imshow(gray, 'gray')
# Inverted image (white digit on a black background) with the predicted digit
ax[1][1].set_title(f'predict:{prediction}')
ax[1][1].axis('off')
ax[1][1].imshow(gray_inv, 'gray')
plt.show()