Chapter 5, Part 2: Dogs vs. Cats case study

Principle of the convolution operation

Why the width and height of the output feature map differ from those of the input

  • Border effects: some window positions would extend past the edge of the input (this can be solved by padding the input)
  • Strides: with a stride greater than 1, the window skips positions, so the output shrinks further

Solving the border effect
Pad the input image by setting padding = 'same' (the default is 'valid', which means no padding)
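
As a rough sketch of the arithmetic behind both effects, the helper below computes the output length of a convolution along one spatial dimension. The name conv_output_size is made up for this note and is not part of Keras; the formulas are the usual ones for 'valid' and 'same' padding.

```python
import math

def conv_output_size(input_size, window, stride=1, padding="valid"):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == "valid":
        # No padding: the window must fit entirely inside the input.
        return (input_size - window) // stride + 1
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(conv_output_size(28, 3))                  # 26: the border effect removes 2 pixels
print(conv_output_size(28, 3, padding="same"))  # 28: padding keeps the size
print(conv_output_size(28, 3, stride=2))        # 13: stride 2 roughly halves the size
```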

Max pooling operation

  • After each MaxPooling2D layer, the size of the feature map is halved (for example, from 26 * 26 to 13 * 13), because the layer downsamples its input

  • Max pooling usually uses a 2 * 2 window with stride 2, while convolution usually uses a 3 * 3 window with stride 1

Reasons for using MaxPooling layers

  • Downsampling lets each successive convolution layer look at a larger region of the original input, so the model can learn features from local to global
  • It reduces the number of values to process, which lowers the computing cost and helps limit overfitting

Max pooling is only one of several ways to downsample. You can also use an average pooling layer, which takes the average of each window rather than its maximum, but max pooling usually works better: averaging dilutes strong activations, which weakens the feature information in the region.
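
To make the difference concrete, here is a minimal pure-Python sketch of 2 * 2 pooling with stride 2. The pool2d helper and the 4 * 4 feature map are invented for illustration; Keras implements pooling differently, on tensors.

```python
def pool2d(feature_map, reduce_fn, window=2, stride=2):
    """Downsample a 2D list of numbers with the given reducing function."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - window + 1, stride):
        pooled_row = []
        for c in range(0, cols - window + 1, stride):
            # Collect the values covered by the window at position (r, c).
            patch = [feature_map[r + i][c + j]
                     for i in range(window) for j in range(window)]
            pooled_row.append(reduce_fn(patch))
        pooled.append(pooled_row)
    return pooled

def mean(values):
    return sum(values) / len(values)

# A made-up 4 x 4 feature map with one strong activation per 2 x 2 block.
feature_map = [
    [0, 0, 8, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
    [6, 0, 0, 2],
]

print(pool2d(feature_map, max))   # [[4, 8], [6, 2]]
print(pool2d(feature_map, mean))  # [[1.0, 2.0], [1.5, 0.5]]
```

Both results are 2 * 2, half the input size in each dimension. Max pooling keeps each strong activation intact, while average pooling dilutes it with the surrounding zeros.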

Dogs vs. cats classification case

The following case uses the Dogs vs. Cats dataset from Kaggle

5-4 Data preprocessing

import os, shutil

# Path to the directory holding the original training images
original_dataset_dir = r'E:\code\PythonDeep\DataSet\dogs-vs-cats\train'

# Root directory for the smaller dataset we are about to create
base_dir = r'E:\code\PythonDeep\DataSet\sampledata'
os.mkdir(base_dir)

# Training set data folder
train_dir = os.path.join(base_dir, "train")
os.mkdir(train_dir)

# Validation set root directory
validation_dir = os.path.join(base_dir, "validation")
os.mkdir(validation_dir)

# Test set root directory
test_dir = os.path.join(base_dir, "test")
os.mkdir(test_dir)

# Training set directory
# Cat image training set directory
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)

# Dog image training set directory
train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)

# Validation set directories
# Cat validation image directory
validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)

# Dog validation image directory
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)

# Test set directories
# Cat test image directory
test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)

# Dog test set image directory
test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)

# =========================================================

# Copy the first 1000 cat images to train_cats_dir
# File names look like cat.0.jpg
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)] # build the file names with str.format
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname) # source address
    dst = os.path.join(train_cats_dir, fname) # Destination address
    shutil.copyfile(src, dst)
    
# Copy 500 pictures of cats to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)
    
# Copy 500 pictures of cats to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)
    
# Copy 1000 pictures of dogs to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)
    
# Copy 500 pictures of dogs to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)
    
# Copy 500 pictures of dogs to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
    
# Print to see if the data set is correct
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))
total training cat images: 1000
total training dog images: 1000
total validation cat images: 500
total validation dog images: 500
total test cat images: 500
total test dog images: 500
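
The listing above repeats the same copy loop six times and fails on a second run, because os.mkdir raises an error when the directory already exists. A more compact variant might look like the sketch below. The helper names copy_range and build_small_dataset are invented here; the split sizes and file-name pattern are the ones used above.

```python
import os
import shutil

def copy_range(src_dir, dst_dir, animal, start, stop):
    """Copy '<animal>.<i>.jpg' for i in [start, stop) from src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)  # does not fail if the folder exists
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(animal, i)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(dst_dir, fname))

def build_small_dataset(src_dir, base_dir):
    """Create train/validation/test folders with the split used in this note."""
    splits = {'train': (0, 1000),
              'validation': (1000, 1500),
              'test': (1500, 2000)}
    for split, (start, stop) in splits.items():
        for animal in ('cat', 'dog'):
            dst_dir = os.path.join(base_dir, split, animal + 's')
            copy_range(src_dir, dst_dir, animal, start, stop)
```

Calling build_small_dataset(original_dataset_dir, base_dir) would then produce the same directory layout as the listing above.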

Building a deep learning network

  • In this convolutional network, the depth of the feature maps increases progressively, from 32 to 128, while their size decreases, from the 150 * 150 input down to 7 * 7
  • Since this is a binary classification problem (distinguishing cats from dogs), the network ends with a single unit and a sigmoid activation

5-5 Instantiating a small convolutional neural network for dog-vs-cat classification

# Build the convolutional neural network model
from keras import layers
from keras import models

model = models.Sequential()

# Convolutional base: alternating Conv2D and MaxPooling2D layers
model.add(layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D((2, 2)))

# Fully connected classifier
model.add(layers.Flatten()) # Flatten the 3D feature maps into a one-dimensional vector
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.Dense(1, activation = 'sigmoid'))

# View model overview
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_5 (Conv2D)            (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
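
The output shapes and parameter counts in this summary can be reproduced by hand. The short sketch below assumes 3 * 3 convolutions with stride 1 and 'valid' padding, followed by 2 * 2 max pooling with stride 2, as in the model definition above. The helper names are made up for this note.

```python
def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block per filter, plus one bias each.
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit.
    return inputs * units + units

size, channels, total = 150, 3, 0
for filters in (32, 64, 128, 128):
    total += conv_params(3, channels, filters)
    size = size - 3 + 1   # 3 x 3 convolution, stride 1, 'valid' padding
    size = size // 2      # 2 x 2 max pooling, stride 2
    channels = filters
    print(size, channels)  # 74 32, then 36 64, then 17 128, then 7 128

flat = size * size * channels
total += dense_params(flat, 512) + dense_params(512, 1)
print(flat, total)  # 6272 3453121
```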

5-6 Configuring the model for training

from keras import optimizers

# Note: the argument is lr (lowercase letter L) and its value is 1e-4 (digit one).
# Newer Keras versions name this argument learning_rate.
model.compile(loss = 'binary_crossentropy', 
              optimizer = optimizers.RMSprop(lr = 1e-4),
              metrics = ['acc'])
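
To show what the sigmoid output and the binary cross-entropy loss compute, here is a minimal pure-Python sketch. The labels and raw scores are made up, and Keras computes the same quantity on tensors with its own numerical safeguards.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

labels = [0, 1, 1, 0]               # made-up labels, e.g. 0 = cat, 1 = dog
raw_scores = [-2.0, 3.0, 0.5, 1.0]  # made-up outputs before the sigmoid
probs = [sigmoid(z) for z in raw_scores]

print(binary_crossentropy(labels, probs))
print(binary_crossentropy(labels, [0.5] * 4))  # about 0.693, i.e. ln 2
```

A model that always predicts 0.5 scores a loss of ln 2, about 0.693. A loss near that value therefore means the model is doing no better than guessing.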

5-7 Reading images from directories using ImageDataGenerator

Next, the image files on disk must be turned into preprocessed floating-point tensors:

  1. Read the image files
  2. Decode the JPEG content into RGB pixel grids
  3. Convert the pixel grids into floating-point tensors
  4. Rescale the pixel values from the [0, 255] range to the [0, 1] range

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255) # Divide each element of data by 255
test_datagen = ImageDataGenerator(rescale = 1./255)

# Read the images from the directories, resizing each one to 150 * 150
train_generator = train_datagen.flow_from_directory(
    train_dir, 
    target_size = (150, 150), 
    batch_size = 20, 
    class_mode = 'binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir, 
    target_size = (150, 150), 
    batch_size = 20, 
    class_mode = 'binary')

# Inspect one batch; the generator yields batches indefinitely, so break after the first
for data_batch, labels_batch in train_generator:
    print('data batch shape: ', data_batch.shape)
    print('labels batch shape: ', labels_batch.shape)
    break
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
data batch shape:  (20, 150, 150, 3)
labels batch shape:  (20,)
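
The break in the loop above is needed because the generator never stops on its own. The simplified sketch below, which is not how Keras implements it, shows the idea with plain integers standing in for images. Unlike this sketch, flow_from_directory also shuffles the samples by default.

```python
import itertools

def batch_generator(samples, batch_size):
    """Yield batches forever, cycling back to the start (simplified sketch)."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

samples = list(range(2000))  # stand-ins for the 2000 training images
gen = batch_generator(samples, 20)

first_epoch = list(itertools.islice(gen, 100))  # 100 steps = one pass over the data
print(len(first_epoch), len(first_epoch[0]))    # 100 20
next_batch = next(gen)                          # the generator simply starts over
print(next_batch[:3])                           # [0, 1, 2]
```

Because the generator is endless, the training call has to be told how many batches make up one epoch: 2000 / 20 = 100 for the training set and 1000 / 20 = 50 for the validation set.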

5-8 Fitting the model using a batch generator

# Train the model
# (newer Keras versions deprecate fit_generator in favor of model.fit, which accepts generators)
history = model.fit_generator(train_generator, # training generator that yields batches indefinitely
                              steps_per_epoch = 100, # 100 batches of 20 samples cover the 2000 training images
                              epochs = 30, 
                              validation_data = validation_generator, 
                              validation_steps = 50) # 50 batches of 20 samples cover the 1000 validation images
Epoch 1/30
100/100 [==============================] - 42s 418ms/step - loss: 0.6863 - acc: 0.5410 - val_loss: 0.6788 - val_acc: 0.5510
Epoch 2/30
100/100 [==============================] - 39s 389ms/step - loss: 0.6441 - acc: 0.6320 - val_loss: 0.7571 - val_acc: 0.6170
Epoch 3/30
100/100 [==============================] - 39s 385ms/step - loss: 0.5943 - acc: 0.6820 - val_loss: 0.5072 - val_acc: 0.6700
Epoch 4/30
100/100 [==============================] - 39s 394ms/step - loss: 0.5617 - acc: 0.6995 - val_loss: 0.5658 - val_acc: 0.6700
Epoch 5/30
100/100 [==============================] - 39s 392ms/step - loss: 0.5326 - acc: 0.7185 - val_loss: 0.4614 - val_acc: 0.6870
Epoch 6/30
100/100 [==============================] - 39s 394ms/step - loss: 0.5012 - acc: 0.7535 - val_loss: 0.6301 - val_acc: 0.6940
Epoch 7/30
100/100 [==============================] - 40s 397ms/step - loss: 0.4751 - acc: 0.7590 - val_loss: 0.6279 - val_acc: 0.7090
Epoch 8/30
100/100 [==============================] - 40s 398ms/step - loss: 0.4399 - acc: 0.8010 - val_loss: 0.4719 - val_acc: 0.6990
Epoch 9/30
100/100 [==============================] - 42s 419ms/step - loss: 0.4167 - acc: 0.8125 - val_loss: 0.4850 - val_acc: 0.7340
Epoch 10/30
100/100 [==============================] - 43s 426ms/step - loss: 0.3905 - acc: 0.8150 - val_loss: 0.5103 - val_acc: 0.7260
Epoch 11/30
100/100 [==============================] - 40s 397ms/step - loss: 0.3642 - acc: 0.8365 - val_loss: 0.5101 - val_acc: 0.7410
Epoch 12/30
100/100 [==============================] - 42s 416ms/step - loss: 0.3384 - acc: 0.8555 - val_loss: 0.6325 - val_acc: 0.7300
Epoch 13/30
100/100 [==============================] - 42s 416ms/step - loss: 0.3246 - acc: 0.8635 - val_loss: 0.9336 - val_acc: 0.7340
Epoch 14/30
100/100 [==============================] - 39s 389ms/step - loss: 0.2980 - acc: 0.8725 - val_loss: 1.0578 - val_acc: 0.7220
Epoch 15/30
100/100 [==============================] - 39s 390ms/step - loss: 0.2808 - acc: 0.8725 - val_loss: 1.1070 - val_acc: 0.7260
Epoch 16/30
100/100 [==============================] - 39s 390ms/step - loss: 0.2523 - acc: 0.8990 - val_loss: 1.0064 - val_acc: 0.7340
Epoch 17/30
100/100 [==============================] - 40s 400ms/step - loss: 0.2322 - acc: 0.9000 - val_loss: 0.6108 - val_acc: 0.7520
Epoch 18/30
100/100 [==============================] - 40s 396ms/step - loss: 0.2151 - acc: 0.9190 - val_loss: 0.8014 - val_acc: 0.7320
Epoch 19/30
100/100 [==============================] - 39s 391ms/step - loss: 0.1902 - acc: 0.9305 - val_loss: 0.3588 - val_acc: 0.7320
Epoch 20/30
100/100 [==============================] - 39s 391ms/step - loss: 0.1704 - acc: 0.9370 - val_loss: 0.4965 - val_acc: 0.7300
Epoch 21/30
100/100 [==============================] - 39s 388ms/step - loss: 0.1577 - acc: 0.9415 - val_loss: 0.3101 - val_acc: 0.7230
Epoch 22/30
100/100 [==============================] - 39s 392ms/step - loss: 0.1363 - acc: 0.9510 - val_loss: 0.4775 - val_acc: 0.7390
Epoch 23/30
100/100 [==============================] - 39s 389ms/step - loss: 0.1243 - acc: 0.9570 - val_loss: 0.4934 - val_acc: 0.7370
Epoch 24/30
100/100 [==============================] - 41s 413ms/step - loss: 0.1063 - acc: 0.9710 - val_loss: 1.0973 - val_acc: 0.7130
Epoch 25/30
100/100 [==============================] - 40s 396ms/step - loss: 0.0952 - acc: 0.9710 - val_loss: 1.7752 - val_acc: 0.7110
Epoch 26/30
100/100 [==============================] - 39s 390ms/step - loss: 0.0787 - acc: 0.9780 - val_loss: 0.5990 - val_acc: 0.7390
Epoch 27/30
100/100 [==============================] - 41s 411ms/step - loss: 0.0687 - acc: 0.9830 - val_loss: 0.7672 - val_acc: 0.7330
Epoch 28/30
100/100 [==============================] - 44s 437ms/step - loss: 0.0608 - acc: 0.9825 - val_loss: 0.6554 - val_acc: 0.7400
Epoch 29/30
100/100 [==============================] - 41s 407ms/step - loss: 0.0514 - acc: 0.9875 - val_loss: 0.4879 - val_acc: 0.7340
Epoch 30/30
100/100 [==============================] - 40s 399ms/step - loss: 0.0443 - acc: 0.9870 - val_loss: 0.4517 - val_acc: 0.7260

5-9 Saving the model

model.save('cats_and_dogs_small_1.h5')

5-10 Plotting the loss and accuracy curves during training

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

# Plot 1: training and validation accuracy
plt.plot(epochs, acc, 'bo', label = 'Training acc')
plt.plot(epochs, val_acc, 'b', label = 'Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

# Plot 2: training and validation loss
plt.plot(epochs, loss, 'bo', label = 'Training loss')
plt.plot(epochs, val_loss, 'b', label = 'Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

Model accuracy curve

Model loss curve

Summary

The curves above show clear overfitting. The training accuracy keeps rising and approaches 100% (0.987 in the last epoch), while the validation accuracy stalls at roughly 70-75%. The validation loss stops improving steadily after about the fifth epoch and then fluctuates widely, whereas the training loss keeps falling toward 0. To mitigate the overfitting, the next section uses data augmentation.
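
The gap can also be read directly from the numbers. The sketch below copies the val_acc values and the final training accuracy from the training log earlier in this note. In a live session the same values would come from history.history.

```python
# val_acc for epochs 1..30, copied from the training log above
val_acc = [
    0.5510, 0.6170, 0.6700, 0.6700, 0.6870, 0.6940, 0.7090, 0.6990, 0.7340, 0.7260,
    0.7410, 0.7300, 0.7340, 0.7220, 0.7260, 0.7340, 0.7520, 0.7320, 0.7320, 0.7300,
    0.7230, 0.7390, 0.7370, 0.7130, 0.7110, 0.7390, 0.7330, 0.7400, 0.7340, 0.7260,
]
final_train_acc = 0.9870  # acc reported for epoch 30

# Epoch (1-based) with the highest validation accuracy.
best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1
# Difference between training and validation accuracy at the end of training.
gap = final_train_acc - val_acc[-1]

print(best_epoch, val_acc[best_epoch - 1])  # 17 0.752
print(round(gap, 3))                        # 0.261
```

Validation accuracy peaks at 0.752 in epoch 17 and never gets higher, while the training accuracy ends about 26 percentage points above the validation accuracy.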

Closing notes

Note: the code in this article comes from the book Python Deep Learning and is shared here as study notes, for learning reference only. The author has run all of it successfully. If anything is missing, please contact the author of this article.

This article is only for the purpose of learning and communication, not for any commercial purpose. If copyright issues are involved, please contact the author as soon as possible

Tags: Python Pytorch Computer Vision Deep Learning

Posted on Wed, 06 Oct 2021 12:34:43 -0400 by fonster_mox