100 cases of deep learning - convolutional neural network (VGG-16): recognizing the One Piece Straw Hat crew | day 6

1, Preliminary work

This article implements character recognition for One Piece.

🚀 My environment:

  • Language: Python 3.6.5
  • Editor: Jupyter Notebook
  • Deep learning environment: tensorflow 2.4.1
  • Graphics card (GPU): NVIDIA GeForce RTX 3080

🚀 From column: 100 cases of deep learning

If you are a deep learning beginner, take a look at the column I wrote specifically for you: Deep Learning for Beginners

  1. Deep Learning for Beginners, Chapter 1: configuring a deep learning environment
  2. Deep Learning for Beginners, Chapter 2: using the editor - Jupyter Notebook
  3. Deep Learning for Beginners, Chapter 3: a first taste of deep learning
  4. Deep Learning for Beginners, Chapter 4: configuring a PyTorch environment

🚀 Previous highlights - convolutional neural networks:

  1. 100 cases of deep learning - convolutional neural network (CNN): MNIST handwritten digit recognition | day 1
  2. 100 cases of deep learning - convolutional neural network (CNN): color image classification | day 2
  3. 100 cases of deep learning - convolutional neural network (CNN): clothing image classification | day 3
  4. 100 cases of deep learning - convolutional neural network (CNN): flower recognition | day 4
  5. 100 cases of deep learning - convolutional neural network (CNN): weather recognition | day 5
  6. 100 cases of deep learning - convolutional neural network (VGG-16): recognizing the One Piece Straw Hat crew | day 6
  7. 100 cases of deep learning - convolutional neural network (VGG-19): recognizing characters from Ling Cage | day 7
  8. 100 cases of deep learning - convolutional neural network (ResNet-50): bird recognition | day 8
  9. 100 cases of deep learning - convolutional neural network (AlexNet): a step-by-step tutorial | day 11
  10. 100 cases of deep learning - convolutional neural network (CNN): CAPTCHA recognition | day 12
  11. 100 cases of deep learning - convolutional neural network (Inception V3): sign language recognition | day 13
  12. 100 cases of deep learning - convolutional neural network (Inception-ResNet-v2): traffic sign recognition | day 14
  13. 100 cases of deep learning - convolutional neural network (CNN): license plate recognition | day 15
  14. 100 cases of deep learning - convolutional neural network (CNN): recognizing Ash's team from Pokémon | day 16
  15. 100 cases of deep learning - convolutional neural network (CNN): attention detection | day 17
  16. 100 cases of deep learning - convolutional neural network (VGG-16): cat and dog recognition | day 21
  17. 100 cases of deep learning - convolutional neural network (LeNet-5): the "Hello World" of deep learning | day 22
  18. 100 cases of deep learning - convolutional neural network (CNN): 3D medical image recognition | day 23
  19. 100 cases of deep learning - convolutional neural network (Xception): animal recognition | day 24

🚀 Previous highlights - recurrent neural networks:

  1. 100 cases of deep learning - recurrent neural network (RNN): stock prediction | day 9
  2. 100 cases of deep learning - recurrent neural network (LSTM): stock prediction | day 10

🚀 Previous highlights - generative adversarial networks:

  1. 100 cases of deep learning - generative adversarial network (GAN): handwritten digit generation | day 18
  2. 100 cases of deep learning - generative adversarial network (DCGAN): handwritten digit generation | day 19
  3. 100 cases of deep learning - generative adversarial network (DCGAN): generating anime girl portraits | day 20

1. Set up the GPU

If you are using a CPU, you can skip this step.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # Allocate GPU memory on demand rather than all at once
    tf.config.set_visible_devices([gpus[0]], "GPU")          # Use only the first GPU
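To confirm which devices TensorFlow actually sees, an optional sanity check (note that memory growth must be set before the GPU is first used):

print(gpus)                                   # physical GPUs detected
print(tf.config.list_logical_devices("GPU"))  # logical devices after the settings above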

2. Import data

import matplotlib.pyplot as plt
import os,PIL

# Set the NumPy random seed so that results are as reproducible as possible
import numpy as np
np.random.seed(1)

# Set the TensorFlow random seed so that results are as reproducible as possible
import tensorflow as tf
tf.random.set_seed(1)

from tensorflow import keras
from tensorflow.keras import layers,models

import pathlib
data_dir = r"D:\jupyter notebook\DL-100-days\datasets\hzw_photos"  # raw string so the backslashes are not treated as escape sequences

data_dir = pathlib.Path(data_dir)

3. View the data

The dataset contains 7 characters: Luffy, Zoro, Nami, Usopp, Chopper, Sanji, and Robin.

folder     meaning            quantity
lufei      Monkey D. Luffy    117 images
suolong    Roronoa Zoro       90 images
namei      Nami               84 images
wusuopu    Usopp              77 images
qiaoba     Tony Tony Chopper  102 images
shanzhi    Sanji              47 images
luobin     Nico Robin         105 images

image_count = len(list(data_dir.glob('*/*.png')))

print("The total number of pictures is:",image_count)
Total number of pictures: 621
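To double-check the per-class counts in the table above, a quick sketch (assuming each character has its own sub-folder of .png files, as described):

for folder in sorted(data_dir.glob('*')):
    print(folder.name, len(list(folder.glob('*.png'))))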

2, Data preprocessing

1. Load data

Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.

batch_size = 32
img_height = 224
img_width = 224
"""
about image_dataset_from_directory()Please refer to the following article for details: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 621 files belonging to 7 classes.
Using 497 files for training.
"""
about image_dataset_from_directory()Please refer to the following article for details: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 621 files belonging to 7 classes.
Using 124 files for validation.

We can output the dataset's labels through the class_names attribute; the labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)
['lufei', 'luobin', 'namei', 'qiaoba', 'shanzhi', 'suolong', 'wusuopu']

2. Visualize the data

plt.figure(figsize=(10, 5))  # The width of the figure is 10 and the height is 5

for images, labels in train_ds.take(1):
    for i in range(8):
        
        ax = plt.subplot(2, 4, i + 1)  

        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        
        plt.axis("off")

plt.imshow(images[1].numpy().astype("uint8"))  # Display the second image of the batch again on its own
<matplotlib.image.AxesImage at 0x2adcea36ee0>

3. Recheck the data

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
(32, 224, 224, 3)
(32,)
  • image_batch is a tensor of shape (32, 224, 224, 3): a batch of 32 images of shape 224x224x3 (the last dimension is the RGB color channels).
  • labels_batch is a tensor of shape (32,); these labels correspond to the 32 images.

4. Configure the dataset

  • shuffle(): shuffles the data. For a detailed description of this function, see: https://zhuanlan.zhihu.com/p/42417456
  • prefetch(): prefetches data to speed up execution. For details, see my previous two articles, which explain it.
  • cache(): caches the dataset in memory to speed up execution
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

5. Normalization

normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)

normalization_train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(val_ds))
first_image = image_batch[0]

# View normalized data
print(np.min(first_image), np.max(first_image))
0.0 0.9928046
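Note that normalization_train_ds is created above but never used afterwards: training later runs on the un-normalized train_ds, while val_ds has been normalized. To apply the normalization consistently, a sketch (this is not what the original run did, and retraining would change the logs below):

# Overwrite train_ds so the model actually trains on normalized images
train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))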

3, Build the VGG-16 network

You can choose between the official model and the self-built model below: enable one and comment out the other. Both are the genuine VGG-16 architecture.

Analysis of VGG's advantages and disadvantages:

  • Advantages of VGG

The structure of VGG is very simple: the whole network uses the same convolution kernel size (3x3) and max-pooling size (2x2).

  • Disadvantages of VGG

1) Training takes a long time, and tuning the parameters is difficult. 2) The storage footprint is large, which is bad for deployment; for example, the VGG-16 weight file is over 500 MB, making it ill-suited to embedded systems.
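That 500+ MB figure is easy to sanity-check from the parameter count reported by model.summary() further below (a rough back-of-the-envelope sketch):

params = 138_357_544             # total parameters from model.summary()
size_mib = params * 4 / 1024**2  # float32 weights, 4 bytes each
print(f"{size_mib:.1f} MiB")     # ≈ 527.8 MiB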

1. Official model (packaged)

Calling the official model will be covered in later articles; here I will focus on building VGG-16 by hand.

# model = keras.applications.VGG16()
# model.summary()
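As a rough sketch (not used in this article's run), the official call can also take a custom input shape and class count; the argument names follow tf.keras.applications.VGG16:

# model = keras.applications.VGG16(weights=None,
#                                  input_shape=(img_height, img_width, 3),
#                                  classes=len(class_names))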

2. Self-built model

from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def VGG16(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3,3), activation='relu', padding='same',name='block1_conv1')(input_tensor)
    x = Conv2D(64, (3,3), activation='relu', padding='same',name='block1_conv2')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block1_pool')(x)
    # 2nd block
    x = Conv2D(128, (3,3), activation='relu', padding='same',name='block2_conv1')(x)
    x = Conv2D(128, (3,3), activation='relu', padding='same',name='block2_conv2')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block2_pool')(x)
    # 3rd block
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv1')(x)
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv2')(x)
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block3_pool')(x)
    # 4th block
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv1')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv2')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block4_pool')(x)
    # 5th block
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv1')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv2')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block5_pool')(x)
    # fully connected layers
    x = Flatten()(x)
    x = Dense(4096, activation='relu',  name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)

    model = Model(input_tensor, output_tensor)
    return model

model = VGG16(1000, (img_width, img_height, 3))  # 1000 output classes, kept to mirror the original VGG-16 (see the note at the end)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________

3. Network structure diagram

For relevant knowledge about convolution, please refer to the article: https://mtyjkh.blog.csdn.net/article/details/114278995

Structure description:

  • 13 convolutional layers, named blockX_convX
  • 3 fully connected layers, named fc1, fc2, and predictions
  • 5 max-pooling layers, named blockX_pool

VGG-16 contains 16 weight layers (13 convolutional layers and 3 fully connected layers), which is why it is called VGG-16.
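These counts can be verified directly on the model built above, for example:

from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense

print(sum(isinstance(l, Conv2D) for l in model.layers))        # 13 convolutional layers
print(sum(isinstance(l, MaxPooling2D) for l in model.layers))  # 5 pooling layers
print(sum(isinstance(l, Dense) for l in model.layers))         # 3 fully connected layers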

4, Compile

Before the model is ready for training, a few more settings are needed. The following are added in the model's compile step:

  • Loss function (loss): measures how accurate the model is during training.
  • Optimizer: determines how the model is updated based on the data it sees and its loss function.
  • Metrics: used to monitor the training and validation steps. The example below uses accuracy, i.e. the fraction of images that are classified correctly.
# Set optimizer
opt = tf.keras.optimizers.Adam(learning_rate=1e-4)

model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
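One caveat: the model's final Dense layer already applies softmax, so strictly speaking the loss should be built with from_logits=False (or the softmax dropped and from_logits=True kept). The original run uses the combination above; a consistent alternative would be:

model.compile(optimizer=opt,
              # from_logits=False because the network already outputs softmax probabilities
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])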

5, Train the model

epochs = 20

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
Epoch 1/20
16/16 [==============================] - 14s 461ms/step - loss: 4.5842 - accuracy: 0.1349 - val_loss: 6.8389 - val_accuracy: 0.1129
Epoch 2/20
16/16 [==============================] - 2s 146ms/step - loss: 2.1046 - accuracy: 0.1398 - val_loss: 6.7905 - val_accuracy: 0.2016
Epoch 3/20
16/16 [==============================] - 2s 144ms/step - loss: 1.7885 - accuracy: 0.3531 - val_loss: 6.7892 - val_accuracy: 0.2903
Epoch 4/20
16/16 [==============================] - 2s 145ms/step - loss: 1.2015 - accuracy: 0.6135 - val_loss: 6.7582 - val_accuracy: 0.2742
Epoch 5/20
16/16 [==============================] - 2s 148ms/step - loss: 1.1831 - accuracy: 0.6108 - val_loss: 6.7520 - val_accuracy: 0.4113
Epoch 6/20
16/16 [==============================] - 2s 143ms/step - loss: 0.5140 - accuracy: 0.8326 - val_loss: 6.7102 - val_accuracy: 0.5806
Epoch 7/20
16/16 [==============================] - 2s 150ms/step - loss: 0.2451 - accuracy: 0.9165 - val_loss: 6.6918 - val_accuracy: 0.7823
Epoch 8/20
16/16 [==============================] - 2s 147ms/step - loss: 0.2156 - accuracy: 0.9328 - val_loss: 6.7188 - val_accuracy: 0.4113
Epoch 9/20
16/16 [==============================] - 2s 143ms/step - loss: 0.1940 - accuracy: 0.9513 - val_loss: 6.6639 - val_accuracy: 0.5968
Epoch 10/20
16/16 [==============================] - 2s 143ms/step - loss: 0.0767 - accuracy: 0.9812 - val_loss: 6.6101 - val_accuracy: 0.7419
Epoch 11/20
16/16 [==============================] - 2s 146ms/step - loss: 0.0245 - accuracy: 0.9894 - val_loss: 6.5526 - val_accuracy: 0.8226
Epoch 12/20
16/16 [==============================] - 2s 149ms/step - loss: 0.0387 - accuracy: 0.9861 - val_loss: 6.5636 - val_accuracy: 0.6210
Epoch 13/20
16/16 [==============================] - 2s 152ms/step - loss: 0.2146 - accuracy: 0.9289 - val_loss: 6.7039 - val_accuracy: 0.4839
Epoch 14/20
16/16 [==============================] - 2s 152ms/step - loss: 0.2566 - accuracy: 0.9087 - val_loss: 6.6852 - val_accuracy: 0.6532
Epoch 15/20
16/16 [==============================] - 2s 149ms/step - loss: 0.0579 - accuracy: 0.9840 - val_loss: 6.5971 - val_accuracy: 0.6935
Epoch 16/20
16/16 [==============================] - 2s 152ms/step - loss: 0.0414 - accuracy: 0.9866 - val_loss: 6.6049 - val_accuracy: 0.7581
Epoch 17/20
16/16 [==============================] - 2s 146ms/step - loss: 0.0907 - accuracy: 0.9689 - val_loss: 6.6476 - val_accuracy: 0.6452
Epoch 18/20
16/16 [==============================] - 2s 147ms/step - loss: 0.0929 - accuracy: 0.9685 - val_loss: 6.6590 - val_accuracy: 0.7903
Epoch 19/20
16/16 [==============================] - 2s 146ms/step - loss: 0.0364 - accuracy: 0.9935 - val_loss: 6.5915 - val_accuracy: 0.6290
Epoch 20/20
16/16 [==============================] - 2s 151ms/step - loss: 0.1081 - accuracy: 0.9662 - val_loss: 6.6541 - val_accuracy: 0.6613

6, Model evaluation

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
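Beyond the accuracy/loss curves, a final numeric score on the validation set can be obtained with model.evaluate (a quick check, not part of the original run):

val_loss, val_acc = model.evaluate(val_ds)
print(f"Validation accuracy: {val_acc:.2%}")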

To stay faithful to the original VGG-16, this article leaves the model parameters unchanged (including the 1000-way output layer). In practice, you can adjust the relevant parameters, for example setting the output layer to the actual number of classes, to fit your own data and improve the classification results.

Other highlights:

100 cases of deep learning column: [portal]

If you found this article helpful, remember to follow, like, and bookmark it.
