Principles of the convolution operation
Why the width and height of the output feature map differ from those of the input
- Border effects: some positions of the sliding window would extend past the border of the input (this can be solved by padding the input)
- Strides
Handling border effects
Pad the original input image by setting padding = 'same' (the default value is 'valid', which means no padding)
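The effect of padding and strides on the output size can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original notes; the helper name conv_output_size is hypothetical, and it assumes the usual rule that 'valid' padding only keeps windows that fit entirely inside the input, while 'same' padding pads the input so that the output size is the input size divided by the stride, rounded up.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis of a 2D convolution (hypothetical helper)."""
    if padding == "valid":
        # No padding: every window must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(conv_output_size(28, 3))                  # 26: border effect removes 2 pixels
print(conv_output_size(28, 3, padding="same"))  # 28: padding keeps the size
print(conv_output_size(28, 3, stride=2))        # 13: a stride of 2 roughly halves the size
```

With a 3 × 3 window and no padding, each convolution removes two pixels from the width and the height, which is why a 150 × 150 input becomes 148 × 148 after the first convolution layer of the network built later in these notes.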
Max pooling operation
- After each MaxPooling2D layer, the size of the feature map is halved (for example, from 26 × 26 to 13 × 13), because the layer downsamples the feature map
- Max pooling usually uses a 2 × 2 window with a stride of 2, whereas convolution usually uses a 3 × 3 window with a stride of 1
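The halving described above can be reproduced with NumPy. This is a minimal sketch under the assumption of a single-channel feature map and a 2 × 2 window with a stride of 2; the function name max_pool_2x2 is hypothetical and this is not how Keras implements the layer internally.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (height, width) array."""
    h, w = feature_map.shape
    # Drop a trailing odd row/column, as pooling without padding does.
    trimmed = feature_map[: h - h % 2, : w - w % 2]
    # Group the pixels into non-overlapping 2x2 blocks ...
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    # ... and keep only the largest value of each block.
    return blocks.max(axis=(1, 3))

feature_map = np.arange(26 * 26, dtype=np.float32).reshape(26, 26)
pooled = max_pool_2x2(feature_map)
print(feature_map.shape, "->", pooled.shape)  # (26, 26) -> (13, 13)
```

An odd-sized input loses its last row and column, which is why a 15 × 15 feature map becomes 7 × 7 rather than 8 × 8.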
Reasons for using a MaxPooling layer
- Downsampling lets successive layers look at increasingly large regions of the original input, so the model can move from local patterns to global ones
- It reduces the computational cost and helps prevent overfitting
Of course, there are many ways to downsample, and max pooling is only one of them. You can also replace max pooling with average pooling, which computes the average value in each window rather than the maximum. In practice, max pooling usually works better: averaging dilutes the strongest responses in a region, that is, it weakens the feature information present in the feature map.
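The difference between the two pooling strategies is easiest to see on a tiny feature map with a few strong activations. The sketch below is only an illustration with made-up values; the helper pool_2x2 is hypothetical and takes the reduction function (np.max or np.mean) as an argument.

```python
import numpy as np

def pool_2x2(feature_map, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 block of a 2D array."""
    h, w = feature_map.shape
    blocks = feature_map[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))

# Illustrative feature map: two isolated strong responses, zeros elsewhere.
fm = np.array([[0, 0, 0, 0],
               [0, 8, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 4]], dtype=np.float32)

max_pooled = pool_2x2(fm, np.max)
avg_pooled = pool_2x2(fm, np.mean)
print(max_pooled)  # the responses 8 and 4 survive unchanged
print(avg_pooled)  # the same responses are diluted to 2 and 1
```

Max pooling keeps the full strength of each response, while average pooling spreads it over the window, which matches the weakening effect described above.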
Cats vs. dogs classification example
The following is a cats vs. dogs classification example based on a Kaggle dataset
5-4 Data preprocessing
import os, shutil

# Path of the original (uncompressed) dataset
original_dataset_dir = r'E:\code\PythonDeep\DataSet\dogs-vs-cats\train'

# Root directory for the smaller dataset
base_dir = r'E:\code\PythonDeep\DataSet\sampledata'
os.mkdir(base_dir)

# Training set directory
train_dir = os.path.join(base_dir, "train")
os.mkdir(train_dir)

# Validation set directory
validation_dir = os.path.join(base_dir, "validation")
os.mkdir(validation_dir)

# Test set directory
test_dir = os.path.join(base_dir, "test")
os.mkdir(test_dir)

# Training images of cats
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)

# Training images of dogs
train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)

# Validation images of cats
validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)

# Validation images of dogs
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)

# Test images of cats
test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)

# Test images of dogs
test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)

# =========================================================
# Copy the first 1000 cat images to train_cats_dir
# File names look like cat.0.jpg; build them with str.format
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)  # source path
    dst = os.path.join(train_cats_dir, fname)        # destination path
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Count the images in each directory as a sanity check
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))
total training cat images: 1000
total training dog images: 1000
total validation cat images: 500
total validation dog images: 500
total test cat images: 500
total test dog images: 500
Building the deep learning network
- In a convolutional neural network, the depth of the feature maps increases progressively (here from 32 to 128), while their spatial size decreases (here from 150 × 150 to 7 × 7)
- Since this is a binary classification problem (distinguishing cats from dogs), the final layer is a single unit with a sigmoid activation
5-5 Instantiating a small convolutional neural network for cats vs. dogs classification
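The shrinking sizes mentioned above can be traced by hand. The sketch below assumes the architecture of listing 5-5: four blocks, each made of a 3 × 3 convolution without padding followed by 2 × 2 max pooling; the function name trace_sizes is hypothetical.

```python
def trace_sizes(input_size=150, n_blocks=4, kernel=3, pool=2):
    """Spatial size after each Conv2D ('valid', stride 1) and each MaxPooling2D."""
    sizes = [input_size]
    size = input_size
    for _ in range(n_blocks):
        size = size - kernel + 1  # convolution without padding
        sizes.append(size)
        size = size // pool       # pooling drops an odd trailing row/column
        sizes.append(size)
    return sizes

sizes = trace_sizes()
print(sizes)                        # [150, 148, 74, 72, 36, 34, 17, 15, 7]
print(sizes[-1] * sizes[-1] * 128)  # 6272 values after flattening 7 x 7 x 128
```

The last number is the length of the vector produced by the Flatten layer, since the final feature map has 128 channels.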
# Build the convolutional neural network
from keras import layers
from keras import models

model = models.Sequential()

# Convolutional base
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

# Fully connected classifier
model.add(layers.Flatten())  # flatten the feature maps into a one-dimensional vector
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# View the model overview
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_5 (Conv2D)            (None, 148, 148, 32)      896
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 72, 72, 64)        18496
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 36, 36, 64)        0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 34, 34, 128)       73856
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 17, 17, 128)       0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 15, 15, 128)       147584
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 7, 7, 128)         0
_________________________________________________________________
flatten_2 (Flatten)          (None, 6272)              0
_________________________________________________________________
dense_3 (Dense)              (None, 512)               3211776
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 513
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
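The parameter counts in the summary can be verified by hand. The sketch below uses the standard formulas (one weight per kernel position and input channel plus one bias per filter for Conv2D, one weight per input plus one bias per unit for Dense); the helper names conv2d_params and dense_params are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

layer_params = [
    conv2d_params(3, 3, 3, 32),      # 896
    conv2d_params(3, 3, 32, 64),     # 18496
    conv2d_params(3, 3, 64, 128),    # 73856
    conv2d_params(3, 3, 128, 128),   # 147584
    dense_params(7 * 7 * 128, 512),  # 3211776
    dense_params(512, 1),            # 513
]
print(sum(layer_params))  # 3453121
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the 6272 flattened values to each of its 512 units.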
5-6 Configuring the model for training
from keras import optimizers

# Note: lr starts with the letter l, and the value 1e-4 starts with the digit 1.
# Recent Keras versions name this argument learning_rate instead of lr.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
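To make the choice of loss more concrete, here is a minimal pure-Python sketch of what a sigmoid output and the binary cross-entropy compute. It is only an illustration of the formulas, not the Keras implementation; the function names and the clipping constant eps are assumptions.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

print(sigmoid(0.0))                                # 0.5
print(binary_crossentropy([1, 0], [0.5, 0.5]))     # about 0.6931, i.e. ln 2
print(binary_crossentropy([1, 0], [0.9, 0.1]))     # much smaller: confident and correct
```

A model that outputs 0.5 for every image has a loss of ln 2, roughly 0.693. The first-epoch training loss of 0.6863 reported in the log of listing 5-8 is close to this value, which is what one would expect from a model that has barely started learning.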
5-7 Using ImageDataGenerator to read images from directories
Next, we convert the image files into preprocessed floating-point tensors:
- Read the image files
- Decode the JPEG content into RGB grids of pixels
- Convert the pixel grids into floating-point tensors
- Rescale the pixel values from the [0, 255] range to the [0, 1] range
from keras.preprocessing.image import ImageDataGenerator

# Multiply every pixel value by 1/255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),  # resize all images to 150 x 150
    batch_size=20,
    class_mode='binary')     # binary labels, to match binary_crossentropy

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

# Print the shapes of one batch; the generator loops endlessly, so break after the first one
for data_batch, labels_batch in train_generator:
    print('data batch shape: ', data_batch.shape)
    print('labels batch shape: ', labels_batch.shape)
    break
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
data batch shape: (20, 150, 150, 3)
labels batch shape: (20,)
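To make the last two preprocessing steps concrete (conversion to floating point and rescaling), here is a minimal NumPy sketch. It uses a randomly generated array as a stand-in for a decoded 150 × 150 RGB image, so the data and the function name to_float_tensor are purely illustrative.

```python
import numpy as np

# Hypothetical stand-in for a decoded JPEG: 150 x 150 RGB pixels in 0..255.
rng = np.random.default_rng(0)
rgb_uint8 = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)

def to_float_tensor(rgb):
    """Convert an 8-bit RGB grid to float32 values in the [0, 1] range."""
    return rgb.astype(np.float32) / 255.0

tensor = to_float_tensor(rgb_uint8)
print(tensor.dtype, tensor.shape)                # float32 (150, 150, 3)
print(tensor.min() >= 0.0, tensor.max() <= 1.0)  # True True
```

This is the same scaling that rescale = 1./255 requests from ImageDataGenerator; the generator additionally handles reading, decoding, resizing, and batching.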
5-8 Fitting the model using a batch generator
# Train the model
# (newer Keras versions accept generators directly in model.fit)
history = model.fit_generator(
    train_generator,        # generator that yields training batches endlessly
    steps_per_epoch=100,    # 2000 training samples / 20 per batch = 100 steps
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)    # 1000 validation samples / 20 per batch = 50 steps
Epoch 1/30
100/100 [==============================] - 42s 418ms/step - loss: 0.6863 - acc: 0.5410 - val_loss: 0.6788 - val_acc: 0.5510
Epoch 2/30
100/100 [==============================] - 39s 389ms/step - loss: 0.6441 - acc: 0.6320 - val_loss: 0.7571 - val_acc: 0.6170
Epoch 3/30
100/100 [==============================] - 39s 385ms/step - loss: 0.5943 - acc: 0.6820 - val_loss: 0.5072 - val_acc: 0.6700
Epoch 4/30
100/100 [==============================] - 39s 394ms/step - loss: 0.5617 - acc: 0.6995 - val_loss: 0.5658 - val_acc: 0.6700
Epoch 5/30
100/100 [==============================] - 39s 392ms/step - loss: 0.5326 - acc: 0.7185 - val_loss: 0.4614 - val_acc: 0.6870
Epoch 6/30
100/100 [==============================] - 39s 394ms/step - loss: 0.5012 - acc: 0.7535 - val_loss: 0.6301 - val_acc: 0.6940
Epoch 7/30
100/100 [==============================] - 40s 397ms/step - loss: 0.4751 - acc: 0.7590 - val_loss: 0.6279 - val_acc: 0.7090
Epoch 8/30
100/100 [==============================] - 40s 398ms/step - loss: 0.4399 - acc: 0.8010 - val_loss: 0.4719 - val_acc: 0.6990
Epoch 9/30
100/100 [==============================] - 42s 419ms/step - loss: 0.4167 - acc: 0.8125 - val_loss: 0.4850 - val_acc: 0.7340
Epoch 10/30
100/100 [==============================] - 43s 426ms/step - loss: 0.3905 - acc: 0.8150 - val_loss: 0.5103 - val_acc: 0.7260
Epoch 11/30
100/100 [==============================] - 40s 397ms/step - loss: 0.3642 - acc: 0.8365 - val_loss: 0.5101 - val_acc: 0.7410
Epoch 12/30
100/100 [==============================] - 42s 416ms/step - loss: 0.3384 - acc: 0.8555 - val_loss: 0.6325 - val_acc: 0.7300
Epoch 13/30
100/100 [==============================] - 42s 416ms/step - loss: 0.3246 - acc: 0.8635 - val_loss: 0.9336 - val_acc: 0.7340
Epoch 14/30
100/100 [==============================] - 39s 389ms/step - loss: 0.2980 - acc: 0.8725 - val_loss: 1.0578 - val_acc: 0.7220
Epoch 15/30
100/100 [==============================] - 39s 390ms/step - loss: 0.2808 - acc: 0.8725 - val_loss: 1.1070 - val_acc: 0.7260
Epoch 16/30
100/100 [==============================] - 39s 390ms/step - loss: 0.2523 - acc: 0.8990 - val_loss: 1.0064 - val_acc: 0.7340
Epoch 17/30
100/100 [==============================] - 40s 400ms/step - loss: 0.2322 - acc: 0.9000 - val_loss: 0.6108 - val_acc: 0.7520
Epoch 18/30
100/100 [==============================] - 40s 396ms/step - loss: 0.2151 - acc: 0.9190 - val_loss: 0.8014 - val_acc: 0.7320
Epoch 19/30
100/100 [==============================] - 39s 391ms/step - loss: 0.1902 - acc: 0.9305 - val_loss: 0.3588 - val_acc: 0.7320
Epoch 20/30
100/100 [==============================] - 39s 391ms/step - loss: 0.1704 - acc: 0.9370 - val_loss: 0.4965 - val_acc: 0.7300
Epoch 21/30
100/100 [==============================] - 39s 388ms/step - loss: 0.1577 - acc: 0.9415 - val_loss: 0.3101 - val_acc: 0.7230
Epoch 22/30
100/100 [==============================] - 39s 392ms/step - loss: 0.1363 - acc: 0.9510 - val_loss: 0.4775 - val_acc: 0.7390
Epoch 23/30
100/100 [==============================] - 39s 389ms/step - loss: 0.1243 - acc: 0.9570 - val_loss: 0.4934 - val_acc: 0.7370
Epoch 24/30
100/100 [==============================] - 41s 413ms/step - loss: 0.1063 - acc: 0.9710 - val_loss: 1.0973 - val_acc: 0.7130
Epoch 25/30
100/100 [==============================] - 40s 396ms/step - loss: 0.0952 - acc: 0.9710 - val_loss: 1.7752 - val_acc: 0.7110
Epoch 26/30
100/100 [==============================] - 39s 390ms/step - loss: 0.0787 - acc: 0.9780 - val_loss: 0.5990 - val_acc: 0.7390
Epoch 27/30
100/100 [==============================] - 41s 411ms/step - loss: 0.0687 - acc: 0.9830 - val_loss: 0.7672 - val_acc: 0.7330
Epoch 28/30
100/100 [==============================] - 44s 437ms/step - loss: 0.0608 - acc: 0.9825 - val_loss: 0.6554 - val_acc: 0.7400
Epoch 29/30
100/100 [==============================] - 41s 407ms/step - loss: 0.0514 - acc: 0.9875 - val_loss: 0.4879 - val_acc: 0.7340
Epoch 30/30
100/100 [==============================] - 40s 399ms/step - loss: 0.0443 - acc: 0.9870 - val_loss: 0.4517 - val_acc: 0.7260
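The values of steps_per_epoch and validation_steps in listing 5-8 follow from the dataset sizes and the batch size. A minimal sketch of that arithmetic, with a hypothetical helper name steps_for:

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

print(steps_for(2000, 20))  # 100 training steps per epoch
print(steps_for(1000, 20))  # 50 validation steps
```

Because the generators yield batches endlessly, Keras needs these numbers to know when one pass over the data is complete. Rounding up covers the case where the dataset size is not a multiple of the batch size.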
5-9 Saving the model
model.save('cats_and_dogs_small_1.h5')
5-10 Plotting the loss and accuracy curves during training
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

# Figure 1: training and validation accuracy
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

# Figure 2: training and validation loss
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
Model accuracy curve
Model loss curve
Summary
The curves above show that the trained model is clearly overfitting. The training accuracy climbs to almost 100% (98.7% in the last epoch), while the validation accuracy stalls at roughly 71-75%. The validation loss stops improving steadily after about the fifth epoch and fluctuates from then on, whereas the training loss keeps decreasing. To reduce overfitting, we will use data augmentation in the next section.
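The gap between training and validation accuracy can also be read directly from the numbers. The sketch below uses the values of epochs 1, 10, 20, and 30 copied from the training log above; the helper name accuracy_gap is hypothetical.

```python
def accuracy_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

# Values taken from the log above (epochs 1, 10, 20, 30).
acc = [0.5410, 0.8150, 0.9370, 0.9870]
val_acc = [0.5510, 0.7260, 0.7300, 0.7260]

gaps = accuracy_gap(acc, val_acc)
print(gaps)  # the gap grows from about -0.01 to about 0.26
```

With a real training run, the same helper could be applied to history.history['acc'] and history.history['val_acc']. A gap that keeps widening while validation accuracy stays flat is the typical signature of overfitting.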
Closing notes
Note: the code in this article comes from the book Deep Learning with Python and is shared here in the form of electronic notes, for learning reference only. The author has run all of it successfully. If anything is missing, please contact the author of this article.
My knowledge is limited. If you find any mistakes, please point them out.
This article is only for the purpose of learning and communication, not for any commercial purpose. If copyright issues are involved, please contact the author as soon as possible