TensorFlow notes -- building a convolutional neural network

Convolution calculation process

Too many hidden layers and too many trainable parameters can cause a model to overfit. Convolution extracts image features first, which reduces the number of parameters to train, helps avoid overfitting, and improves the model's generalization ability.

In a convolution, the depth (number of channels) of the kernel must equal the depth of the input feature map. The weights in the kernel are trainable parameters and are updated during backpropagation.
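The depth-matching rule can be sketched in plain Python. This is only an illustration, not how TensorFlow implements it: the function name `conv2d_single_kernel`, the nested-list layout, and the toy input are my own assumptions, and it covers only stride 1 without padding.

```python
def conv2d_single_kernel(feature_map, kernel, bias=0.0):
    """Stride-1, unpadded convolution of one kernel over an H x W x C input.

    feature_map is a nested list indexed [row][col][channel];
    kernel is a nested list indexed [row][col][channel].
    One kernel always produces one 2-D output channel.
    """
    height, width, depth = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    k_h, k_w, k_depth = len(kernel), len(kernel[0]), len(kernel[0][0])
    if k_depth != depth:
        raise ValueError("kernel depth must equal input depth")

    output = []
    for top in range(height - k_h + 1):
        row = []
        for left in range(width - k_w + 1):
            total = bias
            # Multiply and accumulate over the window and over every channel.
            for i in range(k_h):
                for j in range(k_w):
                    for c in range(depth):
                        total += feature_map[top + i][left + j][c] * kernel[i][j][c]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3 x 3 input with 2 channels; every pixel is [1, 2].
image = [[[1, 2] for _ in range(3)] for _ in range(3)]
# Hypothetical 2 x 2 kernel with depth 2; every weight is 1.
kernel = [[[1, 1] for _ in range(2)] for _ in range(2)]
result = conv2d_single_kernel(image, kernel)
```

Each output pixel sums over all channels of its window, which is why a kernel whose depth differs from the input depth cannot be applied.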

Receptive field

The receptive field is the size of the region of the original input image that one pixel of an output feature map maps back to. Whichever layer the feature map comes from, its receptive field is always measured on the original input image.

Kernel size alone does not determine how good a convolution kernel is. For stacks of convolution layers with the same receptive field, the amount of computation depends on the size of the feature map. For example, two layers of 3 * 3 kernels and one layer of 5 * 5 kernels have the same 5 * 5 receptive field, but on a feature map whose side length is greater than 10, the two 3 * 3 layers need fewer multiplications than the single 5 * 5 layer, so they are the better choice.
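The comparison can be checked with a few lines of Python. This is a rough sketch under simplifying assumptions of my own: a single channel, stride 1, no padding, and only multiplications are counted. The helper names are hypothetical.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field on the original input of stacked stride-1 convolutions."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


def multiplications(side, kernel_sizes):
    """Multiplications for stacked stride-1, unpadded, single-channel convolutions."""
    total = 0
    for k in kernel_sizes:
        side = side - k + 1            # output side length of this layer
        total += side * side * k * k   # k * k multiplications per output pixel
    return total


field_two_3x3 = stacked_receptive_field([3, 3])
field_one_5x5 = stacked_receptive_field([5])
cost_at_10 = (multiplications(10, [3, 3]), multiplications(10, [5]))
cost_at_11 = (multiplications(11, [3, 3]), multiplications(11, [5]))
```

With side length x, the two 3 * 3 layers cost 18x^2 - 108x + 180 multiplications and the 5 * 5 layer costs 25x^2 - 200x + 400. The two are equal at x = 10, and the 3 * 3 stack is cheaper for any larger x.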

All-zero padding

Sometimes we want the feature map to keep the same size before and after convolution. All-zero padding surrounds the original feature map with zeros so that, with a stride of 1, the output size equals the input size.

Formula for the size of the convolution output feature map
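$$
\text{output size} =
\begin{cases}
\left\lceil \dfrac{\text{input size}}{\text{stride}} \right\rceil, & \text{padding = same (all-zero padding)} \\[6pt]
\left\lceil \dfrac{\text{input size} - \text{kernel size} + 1}{\text{stride}} \right\rceil, & \text{padding = valid (no padding)}
\end{cases}
$$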
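The output size rule for the two padding modes is: with "same" padding the output side length is the input side length divided by the stride, rounded up. With "valid" padding it is (input length - kernel length + 1) divided by the stride, rounded up. A small helper makes this easy to check. The name `conv_output_size` is my own, not a TensorFlow function.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Side length of the output feature map along one dimension."""
    if padding == "same":
        # All-zero padding: the kernel size does not affect the output size.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError('padding must be "same" or "valid"')


same_stride_1 = conv_output_size(32, 5, stride=1, padding="same")
valid_stride_1 = conv_output_size(32, 5, stride=1, padding="valid")
same_stride_2 = conv_output_size(28, 2, stride=2, padding="same")
valid_stride_2 = conv_output_size(5, 3, stride=2, padding="valid")
```

The same rule applies independently to the height and the width, and also to pooling layers if the pooling window size is used as the kernel size.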

Describing a convolution layer in TF

  tf.keras.layers.Conv2D(
    filters=number of convolution kernels,
    kernel_size=kernel size,  # an integer side length for a square kernel, or (kernel height h, kernel width w)
    strides=stride,  # an integer if the vertical and horizontal strides are equal, or (vertical stride h, horizontal stride w); defaults to 1
    padding="same" or "valid",  # "same" uses all-zero padding, "valid" does not; defaults to "valid"
    activation="relu" or "sigmoid" or "tanh" or "softmax" etc.,  # leave this out if a BN layer follows
    input_shape=(height, width, number of channels)  # shape of the input feature map; may be omitted
  )

Batch normalization (BN)

Neural networks are more sensitive to data near 0, because activation functions change most noticeably there. As the network gets deeper, however, the feature data may drift away from a mean of 0. Batch normalization pulls the shifted data back toward 0 and is usually placed between the convolution and the activation.

However, after plain standardization the feature data is concentrated in the near-linear region of the activation function around 0, so the activation loses much of its nonlinearity. BN therefore introduces two more trainable parameters for each convolution kernel, a scale factor and an offset factor, which adjust the width and offset of the feature distribution and preserve the network's nonlinear expressive power.

Standardization: make the data follow a distribution with mean 0 and standard deviation 1

Batch normalization: standardize one mini-batch of data at a time
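As a minimal sketch of the idea, the snippet below standardizes one mini-batch of scalar values and then applies a scale factor and an offset factor. The names `gamma`, `beta` and `epsilon` are assumptions of mine (`epsilon` is a small constant that avoids division by zero). A real BN layer does this per channel and also tracks running statistics for inference, which is left out here.

```python
import math


def batch_normalize(batch, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Standardize a mini-batch, then scale by gamma and shift by beta."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + epsilon) + beta for x in batch]


batch = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(batch)                     # mean about 0, std about 1
shifted = batch_normalize(batch, gamma=2.0, beta=3.0)   # mean about 3, std about 2
```

With `gamma=1.0` and `beta=0.0` this is plain standardization. Training the two factors lets the network move the data away from the purely linear region when that helps.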

A BN layer is added between the convolution layer and the activation layer, as in the following code:

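import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D

model = tf.keras.models.Sequential([
  Conv2D(filters=6, kernel_size=(5, 5), padding="same"),  # convolution layer, no activation here
  BatchNormalization(),  # BN layer
  Activation("relu"),  # activation layer, placed after BN
  MaxPool2D(pool_size=(2, 2), strides=2, padding="same"),  # pooling layer
])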


Pooling

Pooling is used to reduce the amount of feature data.

Max pooling extracts image texture, while average pooling preserves background features.
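The difference between the two pooling types is easy to see on a small example. This is a plain-Python sketch with a hypothetical helper `pool2d` and made-up data. It handles a single channel with no padding.

```python
def pool2d(feature_map, pool_size=2, stride=2, mode="max"):
    """Pool a 2-D nested list without padding; mode is "max" or "avg"."""
    height, width = len(feature_map), len(feature_map[0])
    output = []
    for top in range(0, height - pool_size + 1, stride):
        row = []
        for left in range(0, width - pool_size + 1, stride):
            window = [feature_map[top + i][left + j]
                      for i in range(pool_size) for j in range(pool_size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        output.append(row)
    return output


feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 2, 9, 1],
    [4, 6, 3, 3],
]
max_pooled = pool2d(feature_map, mode="max")   # [[7, 4], [6, 9]]
avg_pooled = pool2d(feature_map, mode="avg")   # [[4.0, 2.0], [3.0, 4.0]]
```

A 2 * 2 window with stride 2 cuts the amount of feature data to a quarter. Max pooling keeps the strongest response in each window, while average pooling keeps the overall level.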

Describing pooling in TF

  tf.keras.layers.MaxPool2D(
    pool_size=pooling window size,  # same format as kernel_size above
    strides=pooling stride,  # same format as strides above; defaults to pool_size
    padding="valid" or "same"  # "same" uses all-zero padding, "valid" does not (default)
  )

  tf.keras.layers.AveragePooling2D(
    pool_size=pooling window size,
    strides=pooling stride,
    padding="valid" or "same"
  )


Dropout

To reduce overfitting during training, a fixed proportion of the hidden-layer neurons is temporarily dropped from the network. When the trained network is used for inference, all neurons are restored.
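The behaviour can be sketched in plain Python. The function below is a hypothetical illustration, not the TensorFlow implementation. It uses the "inverted dropout" convention, in which the kept values are scaled up by 1 / (1 - rate) during training so that nothing needs to change at inference time. The Keras Dropout layer documents the same scaling.

```python
import random


def dropout(values, rate, training, rng=None):
    """Zero each value with probability `rate` while training; identity otherwise."""
    if not training or rate == 0.0:
        return list(values)          # inference: every neuron is restored
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)  # keeps the expected sum unchanged
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


activations = [1.0, 2.0, 3.0, 4.0, 5.0]
train_out = dropout(activations, rate=0.2, training=True, rng=random.Random(0))
infer_out = dropout(activations, rate=0.2, training=False)
```

Which neurons are dropped changes on every training step, so the network cannot rely on any single neuron.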

TF provides the Dropout layer:

  tf.keras.layers.Dropout(drop probability)

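import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout

model = tf.keras.models.Sequential([
  Conv2D(filters=6, kernel_size=(5, 5), padding="same"),  # convolution layer
  MaxPool2D(pool_size=(2, 2), strides=2, padding="same"),  # pooling layer
  Dropout(0.2),  # Dropout layer, drops 20% of the units during training
])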

Convolutional neural network

The convolution kernels extract features, which are then sent to a fully connected network.

The convolutional part is the feature extractor, built from five steps, CBAPD:

  • C: Conv2D()
  • B: BatchNormalization()
  • A: Activation()
  • P: MaxPool2D()
  • D: Dropout()
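To see what one CBAPD block does to the data, the sketch below traces the feature-map shape and the trainable parameter count using the hyperparameters from the snippets above (6 kernels of 5 * 5, "same" padding, 2 * 2 pooling with stride 2). The 32 * 32 * 3 input is an assumption of mine, and the arithmetic is done by hand rather than by calling TensorFlow.

```python
import math


def same_output_size(input_size, stride):
    """Output side length with "same" (all-zero) padding."""
    return math.ceil(input_size / stride)


height, width, channels = 32, 32, 3   # assumed input shape, not from the notes
filters, kernel = 6, 5

# C: Conv2D(filters=6, kernel_size=(5, 5), padding="same"), stride 1
conv_shape = (same_output_size(height, 1), same_output_size(width, 1), filters)
conv_params = kernel * kernel * channels * filters + filters   # weights + biases

# B: BatchNormalization() keeps the shape and trains one scale factor
#    and one offset factor per kernel (output channel)
bn_trainable_params = 2 * filters

# A: Activation() and D: Dropout() keep the shape and add no parameters

# P: MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
pool_shape = (same_output_size(conv_shape[0], 2),
              same_output_size(conv_shape[1], 2),
              filters)
```

Under these assumptions the block turns a 32 * 32 * 3 input into a 16 * 16 * 6 feature map with 456 convolution parameters and 12 trainable BN parameters. The result is then flattened and passed to the fully connected network.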

Tags: TensorFlow Deep Learning CNN

Posted on Wed, 10 Nov 2021 13:55:52 -0500 by 23style