7. Tips for Using BatchNormalization


The core idea of BatchNormalization

The basic idea of BN is quite intuitive. In a deep neural network, the activation input (that is, x = WU + B, where U is the layer input) gradually shifts in distribution during training as the network gets deeper, which makes training convergence slow; typically the overall distribution drifts toward the upper and lower ends of the value range of the nonlinear activation function (for the Sigmoid function, this means the input WU + B becomes a large negative or large positive value). As a result, the gradients of the lower layers vanish during backpropagation, which is the essential reason why training deep networks converges more and more slowly. BN uses a standardization step to force the distribution of the input to every neuron in each layer back toward a standard normal distribution with mean 0 and variance 1, that is, to pull an increasingly skewed distribution back to a more standard one. The activation inputs then fall in the region where the nonlinear function is sensitive to its input, so a small change in the input produces a larger change in the loss function. In other words, the gradient becomes larger, the vanishing-gradient problem is avoided, and a larger gradient means faster learning and faster training convergence.
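To make the standardization step concrete, here is a minimal NumPy sketch (not from the original post) of what a BN layer computes on one mini-batch during training; the names gamma, beta, and eps are illustrative, and the running statistics used at inference time are omitted.

# Minimal sketch of batch normalization on one mini-batch (training mode)
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    # x: (batch_size, n_features) pre-activation values (the WU + B above)
    mu = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                    # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)  # standardize to mean 0, variance 1
    return gamma * x_hat + beta            # learnable scale and shift

x = np.random.randn(10, 4) * 5 + 3         # toy batch with a shifted distribution
y = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.var(axis=0))       # approximately 0 and 1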

Data Generator + Data Display

# Data generators for the training set and the validation set
# Dogs vs. Cats data
from keras.preprocessing.image import ImageDataGenerator

IMSIZE = 224
train_generator = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    '../../data/dogs-vs-cats/smallData/train',
    target_size=(IMSIZE, IMSIZE),
    batch_size=10,
    class_mode='categorical'
)

validation_generator = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    '../../data/dogs-vs-cats/smallData/validation',
    target_size=(IMSIZE, IMSIZE),
    batch_size=10,
    class_mode='categorical'
)

The data come from Kaggle's Dogs vs. Cats dataset.

#Show X (Image) and Y (Dependent Variable)
import numpy as np

X, Y = next(validation_generator)
print(X.shape)
print(Y.shape)
Y[:, 0]  # first column of the one-hot labels

# Display images
from matplotlib import pyplot as plt

fig, ax = plt.subplots(2, 5)
fig.set_figheight(6)
fig.set_figwidth(15)
ax = ax.flatten()
X, Y = next(validation_generator)
for i in range(10):
    ax[i].imshow(X[i])

Logistic Regression with BN

#Logistic Regression Model with BN
from keras.layers import Flatten, Input, BatchNormalization, Dense
from keras import Model

input_layer = Input([IMSIZE, IMSIZE, 3])
x = input_layer
x = BatchNormalization()(x)
x = Flatten()(x)
x = Dense(2, activation='softmax')(x)
output_layer = x
model1 = Model(input_layer, output_layer)
model1.summary()

#Logistic Regression Model with BN and Fitting
from keras.optimizers import Adam

model1.compile(loss='categorical_crossentropy',
               optimizer=Adam(lr=0.01),
               metrics=['accuracy'])
model1.fit_generator(train_generator,
                     epochs=200,
                     validation_data=validation_generator)

Whether Batch Normalization helps depends on the specific model and the specific dataset.
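One way to check this claim empirically is to train the same logistic regression model without the BatchNormalization layer and compare validation accuracy. The following is a sketch not in the original post; the name model1_no_bn is illustrative, and it reuses IMSIZE and the generators defined above.

# Baseline: logistic regression without BN, for comparison with model1
from keras.layers import Flatten, Input, Dense
from keras import Model
from keras.optimizers import Adam

input_layer = Input([IMSIZE, IMSIZE, 3])
x = Flatten()(input_layer)
x = Dense(2, activation='softmax')(x)
model1_no_bn = Model(input_layer, x)
model1_no_bn.compile(loss='categorical_crossentropy',
                     optimizer=Adam(lr=0.01),
                     metrics=['accuracy'])
model1_no_bn.fit_generator(train_generator,
                           epochs=200,
                           validation_data=validation_generator)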

Wide Model with BN

#Extended, Wide Model with BN
from keras.layers import Conv2D, MaxPooling2D

n_channel = 100
input_layer = Input([IMSIZE, IMSIZE, 3])
x = input_layer
x = BatchNormalization()(x)
x = Conv2D(n_channel, [2, 2], activation='relu')(x)
x = MaxPooling2D([16, 16])(x)
x = Flatten()(x)
x = Dense(2, activation='softmax')(x)
output_layer = x
model2 = Model(input_layer, output_layer)
model2.summary()

# Compilation and Fitting of Wide Model with BN
model2.compile(loss='categorical_crossentropy',
               optimizer=Adam(lr=0.001),
               metrics=['accuracy'])
model2.fit_generator(train_generator,
                     epochs=200,
                     validation_data=validation_generator)

The wide model performs much better than logistic regression with BN.

Deep Model with BN

# Deep model with BN
n_channel = 20
input_layer = Input([IMSIZE, IMSIZE, 3])
x = input_layer
x = BatchNormalization()(x)

for _ in range(7):
    x = Conv2D(n_channel, [2, 2], padding='same', activation='relu')(x)
    x = MaxPooling2D([2, 2])(x)
x = Flatten()(x)
x = Dense(2, activation='softmax')(x)
output_layer = x
model3 = Model(input_layer, output_layer)
model3.summary()

# Compilation and Fitting of the Deep Model with BN
from keras.optimizers import Adam

model3.compile(loss='categorical_crossentropy',
               optimizer=Adam(lr=0.01),
               metrics=['accuracy'])
model3.fit_generator(train_generator,
                     epochs=200,
                     validation_data=validation_generator)

The deep model performs a little better still.

BatchNormalization is really helpful in many cases, but not in all cases.
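The models above apply BN only to the input. A common variant, sketched below and not taken from the original post, is to insert a BatchNormalization layer after every convolution; whether this placement helps would have to be checked on this dataset. The name model4 is illustrative, and IMSIZE and the generators are reused from above.

# Variant: BN after each convolution instead of only at the input
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Input, Dense
from keras import Model
from keras.optimizers import Adam

n_channel = 20
input_layer = Input([IMSIZE, IMSIZE, 3])
x = input_layer
for _ in range(7):
    x = Conv2D(n_channel, [2, 2], padding='same', activation='relu')(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D([2, 2])(x)
x = Flatten()(x)
x = Dense(2, activation='softmax')(x)
model4 = Model(input_layer, x)
model4.compile(loss='categorical_crossentropy',
               optimizer=Adam(lr=0.01),
               metrics=['accuracy'])
model4.fit_generator(train_generator,
                     epochs=200,
                     validation_data=validation_generator)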

GitHub download address:

Tensorflow1.15 Deep Learning
