100 cases of deep learning - convolutional neural network (VGG-16) cat and dog recognition | day 21

This article is included in 🔥 GitHub https://github.com/kzbkzb/Python-AI, which also collects Python and deep learning materials and my series of articles.

Updates have been a little slow recently, and many readers have sent me reminders in the background. Sorry about that. I recently entered an object detection competition and time is tight. During this period I also plan to adjust my approach: split up the content involved in object detection and work those pieces into subsequent blog posts. That way I can keep up with both my current work and the article update schedule. When the time is right, I will start writing about object detection.

In this article, I drop the model.fit() training method used previously and use the model.train_on_batch() method instead. A comparison of the two methods:

  • model.fit(): very simple to use and very friendly to beginners.
  • model.train_on_batch(): a lower level of encapsulation, which leaves room for more custom tricks.
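
To make the comparison concrete, here is a minimal sketch of a hand-written batch loop. The toy model, the random data and the names toy_model, x and y are assumptions made up for illustration; they are not the VGG-16 setup used later in this article.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 32 samples, 4 features, 2 classes.
rng = np.random.default_rng(0)
x = rng.random((32, 4)).astype("float32")
y = rng.integers(0, 2, size=32)

# A deliberately tiny model so the loop runs in a moment.
toy_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
toy_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

batch_size = 8
for start in range(0, len(x), batch_size):
    # One gradient update per call; with one compiled metric the
    # return value is [loss, accuracy].
    loss, acc = toy_model.train_on_batch(x[start:start + batch_size],
                                         y[start:start + batch_size])
    print(f"batch {start // batch_size}: loss={loss:.4f} accuracy={acc:.4f}")
```

With model.fit() the whole loop is hidden inside one call. Here every batch passes through your own code, so you can log, change the learning rate or skip batches at any step.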

In addition, I introduce a progress bar display, which makes it easier to follow the training process and print the various metrics as they change.

🚀 My environment:

  • Language environment: Python 3.6.5
  • Compiler: Jupyter Notebook
  • Deep learning environment: tensorflow 2.4.1
  • Graphics card (GPU): NVIDIA GeForce RTX 3080
  • Data and code: 📌 [portal]

🚀 This article is selected from the column: 100 cases of deep learning

🚀 A must-read for deep learning newcomers: Beginner's Introduction to Deep Learning

  1. Beginner's introduction to deep learning | Chapter 1: configuring the deep learning environment
  2. Beginner's introduction to deep learning | Chapter 2: using the compiler - Jupyter Notebook
  3. Beginner's introduction to deep learning | Chapter 3: a first experience of deep learning

🚀 Previous Highlights - convolutional neural network:

  1. 100 cases of deep learning - convolutional neural network (CNN) to realize MNIST handwritten digit recognition | day 1
  2. 100 cases of deep learning - convolutional neural network (CNN) color picture classification | day 2
  3. 100 cases of deep learning - convolutional neural network (CNN) garment image classification | day 3
  4. 100 cases of deep learning - convolutional neural network (CNN) flower recognition | day 4
  5. 100 cases of deep learning - convolutional neural network (CNN) weather recognition | day 5
  6. 100 cases of deep learning - convolutional neural network (VGG-16) to identify the pirate king straw hat group | day 6
  7. 100 cases of deep learning - convolutional neural network (VGG-19) to identify the characters in the spirit cage | day 7
  8. 100 cases of deep learning - convolutional neural network (ResNet-50) bird recognition | day 8
  9. 100 cases of deep learning - convolutional neural network (AlexNet) hand-in-hand teaching | day 11
  10. 100 cases of deep learning - convolutional neural network (CNN) identification verification code | day 12
  11. 100 cases of deep learning - convolutional neural network (Inception V3) recognition of sign language | day 13
  12. 100 cases of deep learning - convolutional neural network (Inception-ResNet-v2) recognition of traffic signs | day 14
  13. 100 cases of deep learning - convolutional neural network (CNN) for license plate recognition | day 15
  14. 100 cases of deep learning - convolutional neural network (CNN) to identify the Magic Baby Xiaozhi group | day 16
  15. 100 cases of deep learning - convolutional neural network (CNN) attention detection | day 17
  16. 100 cases of deep learning - convolutional neural network (LeNet-5), the "Hello World" of deep learning | day 22
  17. 100 cases of deep learning - convolutional neural network (CNN) 3D medical image recognition | day 23

🚀 Previous Highlights - recurrent neural network:

  1. 100 cases of deep learning - recurrent neural network (RNN) to achieve stock prediction | day 9
  2. 100 cases of deep learning - recurrent neural network (LSTM) to realize stock prediction | day 10

🚀 Previous Highlights - generative adversarial network:

  1. 100 cases of deep learning - generative adversarial network (GAN) handwritten digit generation | day 18
  2. 100 cases of deep learning - generative adversarial network (DCGAN) handwritten digit generation | day 19
  3. 100 cases of deep learning - generative adversarial network (DCGAN) generation of anime girls | day 20

1, Preliminary work

1. Set GPU

If you are using a CPU, you can comment out this part of the code.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # Allocate GPU memory on demand instead of reserving it all up front
    tf.config.set_visible_devices([gpus[0]],"GPU")

# Print the graphics card information and confirm that the GPU is available
print(gpus)
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

2. Import data

import matplotlib.pyplot as plt
# Support Chinese
plt.rcParams['font.sans-serif'] = ['SimHei']  # Used to display Chinese labels normally
plt.rcParams['axes.unicode_minus'] = False  # Used to display negative signs normally

import os,PIL

# Set random seeds so that the results can be reproduced as much as possible
import numpy as np
np.random.seed(1)

# Set the TensorFlow random seed as well
import tensorflow as tf
tf.random.set_seed(1)

#Hide warning
import warnings
warnings.filterwarnings('ignore')

import pathlib
data_dir = "./data/train"
# data_dir = "D:/jupyter notebook/DL-100-days/datasets/017_Eye_dataset"

data_dir = pathlib.Path(data_dir)

3. View data

image_count = len(list(data_dir.glob('*/*')))

print("The total number of pictures is:",image_count)
The total number of pictures is: 3400

2, Data preprocessing

1. Load data

Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.

batch_size = 8
img_height = 224
img_width = 224

Readers on TensorFlow 2.2.0 may encounter the error module 'tensorflow.keras.preprocessing' has no attribute 'image_dataset_from_directory'. Upgrading TensorFlow fixes it.

"""
about image_dataset_from_directory()Please refer to the following article for details: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 3400 files belonging to 2 classes.
Using 2720 files for training.
"""
about image_dataset_from_directory()Please refer to the following article for details: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 3400 files belonging to 2 classes.
Using 680 files for validation.
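
The file counts printed above, and the 340 and 85 steps that show up later in the progress bars, can be checked with a few lines of arithmetic. This is only a sanity-check sketch of the numbers, not the code Keras runs internally.

```python
import math

image_count = 3400        # total pictures found on disk
validation_split = 0.2    # fraction held out for validation
batch_size = 8

val_count = int(image_count * validation_split)
train_count = image_count - val_count

# The last batch may be smaller, hence the ceiling division.
train_batches = math.ceil(train_count / batch_size)
val_batches = math.ceil(val_count / batch_size)

print(train_count, val_count, train_batches, val_batches)
```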

We can print the dataset's labels through class_names. The labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)
['cat', 'dog']
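
As a small illustration of that ordering rule, the sketch below sorts a hypothetical list of subdirectory names and derives the integer label for each class. The subdirs list is an assumption standing in for whatever order the file system returns.

```python
# Hypothetical directory listing; the on-disk order does not matter.
subdirs = ["dog", "cat"]

# Sorting the names gives the class order, and the position is the label.
class_names = sorted(subdirs)
label_of = {name: index for index, name in enumerate(class_names)}

print(class_names)
print(label_of)
```

So a label of 0 in labels_batch means cat and a label of 1 means dog.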

2. Recheck the data

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
(8, 224, 224, 3)
(8,)
  • image_batch is a tensor of shape (8, 224, 224, 3): a batch of 8 pictures of shape 224x224x3 (the last dimension is the RGB color channels).
  • labels_batch is a tensor of shape (8,); these labels correspond to the 8 pictures.

3. Configure dataset

  • shuffle(): shuffle the data. For a detailed description of this function, please refer to: https://zhuanlan.zhihu.com/p/42417456
  • prefetch(): prefetch data to speed up execution. It is explained in detail in my previous two articles.
  • cache(): cache the dataset in memory to speed up execution
AUTOTUNE = tf.data.AUTOTUNE

def preprocess_image(image,label):
    return (image/255.0,label)

# Normalization processing
train_ds = train_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
val_ds   = val_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds   = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

If the error AttributeError: module 'tensorflow._api.v2.data' has no attribute 'AUTOTUNE' is reported, replace AUTOTUNE = tf.data.AUTOTUNE with AUTOTUNE = tf.data.experimental.AUTOTUNE. This error is caused by the TensorFlow version.
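
If you would rather not edit the line by hand, the fallback can be written once so the same code runs on either version. The sketch below also runs the cache, shuffle and prefetch chain on a toy dataset; toy_ds is a made-up stand-in for train_ds. Note that the dataset here is already batched, as it is in this article, so shuffle() reorders whole batches rather than individual elements.

```python
import tensorflow as tf

# Fall back to the experimental alias on older TensorFlow versions.
try:
    AUTOTUNE = tf.data.AUTOTUNE
except AttributeError:
    AUTOTUNE = tf.data.experimental.AUTOTUNE

# Toy stand-in for train_ds: the numbers 0..9 in batches of 4.
toy_ds = tf.data.Dataset.range(10).batch(4)
toy_ds = toy_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)

for batch in toy_ds:
    print(batch.numpy())  # batch order changes, contents of each batch do not

# Every element is still seen exactly once per pass.
seen = sorted(int(v) for batch in toy_ds for v in batch.numpy())
```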

4. Visualize data

plt.figure(figsize=(15, 10))  # The width of the figure is 15 and the height is 10

for images, labels in train_ds.take(1):
    for i in range(8):
        
        ax = plt.subplot(5, 8, i + 1) 
        plt.imshow(images[i])
        plt.title(class_names[labels[i]])
        
        plt.axis("off")

3, Build VGG-16 network

Analysis of VGG's advantages and disadvantages:

  • VGG advantages

The structure of VGG is very simple. The whole network uses the same convolution kernel size (3x3) and the same max-pooling size (2x2).

  • VGG disadvantages

1) The training time is long and the hyperparameters are difficult to tune. 2) The required storage is large, which makes deployment harder. For example, the VGG-16 weight file is more than 500 MB, which is unfavorable for installation on embedded systems.

Structure description:

  • 13 convolutional layers, named blockX_convX
  • 3 fully connected layers, named fcX and predictions
  • 5 pooling layers, named blockX_pool

VGG-16 contains 16 layers with weights (13 convolutional layers and 3 fully connected layers), which is why it is called VGG-16.
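
Before reading the model code, it may help to trace how the feature map shrinks. The convolutions use padding='same' and keep the spatial size, while each 2x2 pooling layer with stride 2 halves it. The sketch below is plain arithmetic under those assumptions and does not build any Keras layers.

```python
size = 224                            # input height and width
channels = [64, 128, 256, 512, 512]   # filters used in blocks 1 to 5

for block, ch in enumerate(channels, start=1):
    size //= 2  # each 2x2 max pooling with stride 2 halves height and width
    print(f"block{block}_pool -> ({size}, {size}, {ch})")

# Number of units after Flatten, which feeds the first fully connected layer.
flatten_units = size * size * channels[-1]
print("flatten ->", flatten_units)
```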


from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def VGG16(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3,3), activation='relu', padding='same',name='block1_conv1')(input_tensor)
    x = Conv2D(64, (3,3), activation='relu', padding='same',name='block1_conv2')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block1_pool')(x)
    # 2nd block
    x = Conv2D(128, (3,3), activation='relu', padding='same',name='block2_conv1')(x)
    x = Conv2D(128, (3,3), activation='relu', padding='same',name='block2_conv2')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block2_pool')(x)
    # 3rd block
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv1')(x)
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv2')(x)
    x = Conv2D(256, (3,3), activation='relu', padding='same',name='block3_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block3_pool')(x)
    # 4th block
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv1')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv2')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block4_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block4_pool')(x)
    # 5th block
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv1')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv2')(x)
    x = Conv2D(512, (3,3), activation='relu', padding='same',name='block5_conv3')(x)
    x = MaxPooling2D((2,2), strides=(2,2), name = 'block5_pool')(x)
    # full connection
    x = Flatten()(x)
    x = Dense(4096, activation='relu',  name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)

    model = Model(input_tensor, output_tensor)
    return model

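# 1000 output units follow the original ImageNet VGG-16. This dataset has only 2 classes,
# so len(class_names) would also work and give a much smaller final layer.
model=VGG16(1000, (img_height, img_width, 3))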
model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
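
The parameter counts in the summary can be reproduced by hand. A 3x3 convolution has (3*3*in_channels + 1) * out_channels parameters and a dense layer has (in_units + 1) * out_units, where the +1 is the bias. The helper names below are made up for this sketch.

```python
def conv_params(in_ch, out_ch, k=3):
    # k*k*in_ch weights plus one bias per output filter
    return (k * k * in_ch + 1) * out_ch

def dense_params(in_units, out_units):
    # one weight per input plus one bias per output unit
    return (in_units + 1) * out_units

def vgg16_params(nb_classes):
    conv_channels = [64, 64, 128, 128, 256, 256, 256,
                     512, 512, 512, 512, 512, 512]
    total, in_ch = 0, 3  # RGB input
    for out_ch in conv_channels:
        total += conv_params(in_ch, out_ch)
        in_ch = out_ch
    for in_units, out_units in [(7 * 7 * 512, 4096), (4096, 4096),
                                (4096, nb_classes)]:
        total += dense_params(in_units, out_units)
    return total

print(vgg16_params(1000))  # the configuration shown in the summary above
print(vgg16_params(2))     # the same network with a 2-class output layer
```

Most of the parameters sit in fc1, which explains why the weight file is so large.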

4, Compile

Before training the model, a few more settings are needed. They are added in the model's compile step:

  • Loss function (loss): measures how far the model's predictions are from the labels during training.
  • optimizer: determines how the model updates based on the data it sees and its own loss function.
  • metrics: used to monitor the training and testing steps. The following example uses accuracy, that is, the ratio of images correctly classified.
model.compile(optimizer="adam",
              loss     ='sparse_categorical_crossentropy',
              metrics  =['accuracy'])
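
To show what the loss and the metric chosen above actually measure, here is a hand-worked sketch on three made-up predictions. With integer labels, the cross-entropy of one sample is the negative log of the probability the model assigned to the true class. The function name sparse_ce and the numbers are illustrative assumptions, not Keras code.

```python
import math

def sparse_ce(probs, label):
    # Negative log of the probability given to the true class.
    return -math.log(probs[label])

# Hypothetical softmax outputs for 3 pictures, in [cat, dog] order.
batch_probs = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]]
labels = [0, 1, 1]  # integer labels: 0 = cat, 1 = dog

losses = [sparse_ce(p, y) for p, y in zip(batch_probs, labels)]
mean_loss = sum(losses) / len(losses)

# Accuracy: fraction of samples whose highest probability is the true class.
predicted = [p.index(max(p)) for p in batch_probs]
accuracy = sum(int(a == b) for a, b in zip(predicted, labels)) / len(labels)

print(f"loss={mean_loss:.4f} accuracy={accuracy:.4f}")
```

The third picture is a dog predicted as cat, so it contributes the largest loss and lowers the accuracy to two out of three.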

5, Training model

from tqdm import tqdm
import tensorflow.keras.backend as K

epochs = 10
lr     = 1e-4

# Record training data for later analysis
history_train_loss     = []
history_train_accuracy = []
history_val_loss       = []
history_val_accuracy   = []

for epoch in range(epochs):
    train_total = len(train_ds)
    val_total   = len(val_ds)
    
    """
    total: Expected number of iterations
    ncols: Control progress bar width
    mininterval: Minimum progress update interval, in seconds (default: 0.1)
    """
    with tqdm(total=train_total, desc=f'Epoch {epoch + 1}/{epochs}',mininterval=1,ncols=100) as pbar:
        
        lr = lr*0.92
        K.set_value(model.optimizer.lr, lr)
        
        for image,label in train_ds:      
            """
            Train on a single batch. Put simply, train_on_batch is
            a more advanced usage than model.fit().

            If you want to know more about train_on_batch,
            take a look at my article: https://mtyjkh.blog.csdn.net/article/details/119506151
            """
            history = model.train_on_batch(image,label)
            
            train_loss     = history[0]
            train_accuracy = history[1]
            
            pbar.set_postfix({"loss": "%.4f"%train_loss,
                              "accuracy":"%.4f"%train_accuracy,
                              "lr": K.get_value(model.optimizer.lr)})
            pbar.update(1)
        history_train_loss.append(train_loss)
        history_train_accuracy.append(train_accuracy)
            
    print('Start verification!')
    
    with tqdm(total=val_total, desc=f'Epoch {epoch + 1}/{epochs}',mininterval=0.3,ncols=100) as pbar:

        for image,label in val_ds:      
            
            history = model.test_on_batch(image,label)
            
            val_loss     = history[0]
            val_accuracy = history[1]
            
            pbar.set_postfix({"loss": "%.4f"%val_loss,
                              "accuracy":"%.4f"%val_accuracy})
            pbar.update(1)
        history_val_loss.append(val_loss)
        history_val_accuracy.append(val_accuracy)
            
    print('End verification!')
    print("verification loss Is:%.4f"%val_loss)
    print("The verification accuracy is:%.4f"%val_accuracy)
Epoch 1/10: 100%|████████| 340/340 [00:23<00:00, 14.36it/s, loss=1.1077, accuracy=0.6250, lr=9.2e-5]
Start verification!
Epoch 1/10: 100%|█████████████████████| 85/85 [00:02<00:00, 36.55it/s, loss=0.9331, accuracy=0.6250]
End verification!
verification loss Is: 0.9331
 The verification accuracy is: 0.6250
Epoch 2/10: 100%|███████| 340/340 [00:19<00:00, 17.49it/s, loss=0.4633, accuracy=0.6250, lr=8.46e-5]

......

Epoch 9/10: 100%|███████| 340/340 [00:19<00:00, 17.36it/s, loss=0.0112, accuracy=1.0000, lr=4.72e-5]
Start verification!
Epoch 9/10: 100%|█████████████████████| 85/85 [00:01<00:00, 43.46it/s, loss=0.0302, accuracy=1.0000]
End verification!
verification loss Is: 0.0302
 The verification accuracy is: 1.0000
Epoch 10/10: 100%|██████| 340/340 [00:19<00:00, 17.22it/s, loss=0.0000, accuracy=1.0000, lr=4.34e-5]
Start verification!
Epoch 10/10: 100%|████████████████████| 85/85 [00:02<00:00, 42.15it/s, loss=0.0231, accuracy=1.0000]
End verification!
verification loss Is: 0.0231
 The verification accuracy is: 1.0000
# This is our previous training method.
# history = model.fit(
#     train_ds,
#     validation_data=val_ds,
#     epochs=epochs
# )
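
The lr values shown in the progress bars come from the line lr = lr*0.92, which runs once at the start of every epoch. A short sketch of that schedule on its own, without any model:

```python
lr = 1e-4      # starting value used in the training loop
decay = 0.92   # multiplied in at the start of every epoch

schedule = []
for epoch in range(10):
    lr *= decay
    schedule.append(lr)
    print(f"epoch {epoch + 1}: lr={lr:.3g}")
```

The first epoch therefore already trains at 9.2e-5, not 1e-4, and the tenth epoch ends at about 4.34e-5, matching the log above.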

6, Model evaluation

epochs_range = range(epochs)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)

plt.plot(epochs_range, history_train_accuracy, label='Training Accuracy')
plt.plot(epochs_range, history_val_accuracy, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, history_train_loss, label='Training Loss')
plt.plot(epochs_range, history_val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
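
One thing to keep in mind when reading these curves: the training loop appends train_loss and train_accuracy after the inner loop ends, so each point is the value from the last batch of the epoch (8 pictures), not an epoch average. That is why values such as 0.6250 and 1.0000 appear. If you want a smoother curve, one option is to average over the batches. The RunningMean class and the batch values below are assumptions made up for this sketch.

```python
class RunningMean:
    """Accumulates a sample-weighted mean over the batches of one epoch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n=1):
        # value is a per-batch mean, n is the number of samples in the batch
        self.total += value * n
        self.count += n

    def result(self):
        return self.total / self.count if self.count else 0.0


# Hypothetical per-batch accuracies, as train_on_batch might return them.
batch_accuracies = [0.625, 0.75, 1.0, 0.875]

epoch_accuracy = RunningMean()
for acc in batch_accuracies:
    epoch_accuracy.update(acc, n=8)  # 8 pictures per batch

print(f"epoch accuracy: {epoch_accuracy.result():.4f}")
```

In the training loop you would call update() right after train_on_batch and append result() to the history list at the end of the epoch.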

7, Save and load model

This is the simplest method of model saving and loading

# Save model
model.save('model/21_model.h5')
# Loading model
new_model = tf.keras.models.load_model('model/21_model.h5')

8, Prediction

# Use the loaded model (new_model) to see the prediction results

plt.figure(figsize=(18, 3))  # The width of the figure is 18 and the height is 3
plt.suptitle("Display of prediction results")

for images, labels in val_ds.take(1):
    for i in range(8):
        ax = plt.subplot(1,8, i + 1)  
        
        # display picture
        plt.imshow(images[i].numpy())
        
        # You need to add a dimension to the picture
        img_array = tf.expand_dims(images[i], 0) 
        
        # Use the model to predict the class of the picture
        predictions = new_model.predict(img_array)
        plt.title(class_names[np.argmax(predictions)])

        plt.axis("off")
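
A small caveat about the decoding step: the model above was built with 1000 output units while class_names has only 2 entries, so in principle the largest output could fall at an index that class_names does not have. After training this is unlikely, but a defensive sketch could look at the first len(class_names) outputs only. The decode_prediction helper and the fake probability vector are assumptions for illustration.

```python
class_names = ["cat", "dog"]

def decode_prediction(probabilities, class_names):
    # Only the first len(class_names) outputs correspond to real classes.
    relevant = list(probabilities[:len(class_names)])
    best = max(range(len(relevant)), key=relevant.__getitem__)
    return class_names[best], relevant[best]

# Hypothetical 1000-way output: almost all mass on index 1 (dog).
fake_prediction = [0.08, 0.90] + [0.00002] * 998

name, score = decode_prediction(fake_prediction, class_names)
print(name, score)
```

Building the model with len(class_names) outputs in the first place would make this guard unnecessary.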


Tags: TensorFlow Deep Learning

Posted on Tue, 26 Oct 2021 19:49:19 -0400 by keeB