Import related libraries and packages
Use PyCharm to create a new juan_ji.py file.
- numpy is the fundamental package for scientific computing in Python.
- matplotlib is a commonly used plotting library in Python.
```python
import numpy as np
import h5py
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (5.0, 4.0)
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
```
Convolution function
- Zero padding
- Convolution of a single window
- Convolution forward pass
- Convolution backward pass (optional)
Zero padding
Zero padding adds zeros around the border of an image. As shown in the figure below, an RGB picture of a cat has three channels and is padded with padding = 2.
The main benefits of padding are:
- It allows you to use CONV layers without necessarily shrinking the height and width of the volumes. This is important for building deeper networks, since otherwise the height/width would shrink as you go to deeper layers. An important special case is the "same" convolution, in which the height/width is exactly preserved after one layer (with stride 1 this requires pad = (f - 1) / 2 for an odd filter size f).
- It helps us keep more of the information at the border of an image. Without padding, very few of the values at the edges of the image would be touched by the filter during convolution, so information at the edges would be lost.
Implement the zero-padding function, which pads all the images of a batch X with zeros. Note that if you want to pad an array "a" of shape (5, 5, 5, 5, 5) with pad = 1 on the second dimension, pad = 3 on the fourth dimension, and pad = 0 on the rest, you can write: a = np.pad(a, ((0, 0), (1, 1), (0, 0), (3, 3), (0, 0)), 'constant', constant_values=(.., ..))
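As a quick sanity check of this padding pattern (a minimal sketch, using 0 as the constant value):

```python
import numpy as np

# Pad a (5, 5, 5, 5, 5) array: pad = 1 on the 2nd dimension, pad = 3 on the 4th, 0 elsewhere
a = np.random.randn(5, 5, 5, 5, 5)
a_padded = np.pad(a, ((0, 0), (1, 1), (0, 0), (3, 3), (0, 0)), 'constant', constant_values=0)
print(a_padded.shape)  # (5, 7, 5, 11, 5)
```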
```python
# Define the zero-padding function
def zero_pad(X, pad):
    """
    Pad all images in the dataset X with zeros around their borders.

    Parameters:
        X - image dataset of shape (number of samples, image height, image width, number of channels)
        pad - integer, amount of padding on the vertical and horizontal dimensions of each image

    Returns:
        X_paded - padded images of shape (number of samples, image height + 2*pad, image width + 2*pad, number of channels)
    """
    X_paded = np.pad(X, (
        (0, 0),        # number of samples: not padded
        (pad, pad),    # image height: pad above and below
        (pad, pad),    # image width: pad left and right
        (0, 0)),       # number of channels: not padded
        'constant', constant_values=0)  # pad with the constant value 0

    return X_paded
```
Test it
```python
# Test it
np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_paded = zero_pad(x, 2)

# Inspect the shapes and values
print("x.shape =", x.shape)
print("x_paded.shape =", x_paded.shape)
print("x[1, 1] =", x[1, 1])
print("x_paded[1, 1] =", x_paded[1, 1])

# Plot the first channel of the first image before and after padding
fig, axarr = plt.subplots(1, 2)  # one row, two columns
axarr[0].set_title('x')
axarr[0].imshow(x[0, :, :, 0])
axarr[1].set_title('x_paded')
axarr[1].imshow(x_paded[0, :, :, 0])
plt.show()
```
Single step of convolution
Here we implement a single step of convolution, in which one filter is applied to one position of the input.
In computer vision applications, each value in the matrix on the left corresponds to one pixel value. To convolve a 3x3 filter with the image, we multiply it element-wise with a 3x3 slice of the image and then sum the results. We now implement a function that convolves a filter with a single slice of the input and outputs a real number.
```python
# Define the single-step convolution operation
def conv_single_step(a_slice_prev, W, b):
    """
    Apply a filter defined by parameters W to a single slice of the previous layer's
    activation output. The slice has the same size as the filter.

    Parameters:
        a_slice_prev - slice of the input data, shape (filter size, filter size, number of previous channels)
        W - weight parameters contained in a matrix of shape (filter size, filter size, number of previous channels)
        b - bias parameter contained in a matrix of shape (1, 1, 1)

    Returns:
        Z - a real number, the result of applying the sliding window (W, b) to this slice of the input.
    """
    s = np.multiply(a_slice_prev, W)   # element-wise product
    Z = np.sum(s)                      # sum over all entries
    Z = Z + float(b)                   # add the bias once (b is a 1x1x1 matrix)

    return Z


# Test it
np.random.seed(1)
# Here the slice size is the same as the filter size
a_slice_prev = np.random.randn(4, 4, 3)
W = np.random.randn(4, 4, 3)
b = np.random.randn(1, 1, 1)

Z = conv_single_step(a_slice_prev, W, b)

print("Z = " + str(Z))
```
The final test output is
Forward pass of the convolutional neural network
In the forward pass, you convolve the input with multiple filters. Each "convolution" produces a 2D matrix, and these outputs are stacked to obtain a higher-dimensional volume.
To slice the input, we first define the positions of the slice: vert_start, vert_end, horiz_start and horiz_end. Their positions are shown in the figure below. The height and width of the output volume are computed with the formula below.
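For a filter of size f, padding pad and stride stride, the output dimensions computed by the code below are:

n_H = floor((n_H_prev - f + 2 * pad) / stride) + 1
n_W = floor((n_W_prev - f + 2 * pad) / stride) + 1
n_C = number of filters used in this layer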
The forward propagation function is defined and implemented using a for loop.
```python
# Define the forward propagation function for convolution
def conv_forward(A_prev, W, b, hparameters):
    """
    Forward propagation for a convolution layer.

    Parameters:
        A_prev - activation output of the previous layer, shape (m, n_H_prev, n_W_prev, n_C_prev)
        W - weights, shape (f, f, n_C_prev, n_C)
        b - biases, shape (1, 1, 1, n_C)
        hparameters - dictionary containing "stride" (step of the filter) and "pad" (amount of zero padding)

    Returns:
        Z - convolution output, shape (m, n_H, n_W, n_C)
        cache - values needed by the backward pass conv_backward()
    """
    # Retrieve dimensions from the previous layer's output
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve dimensions from the weight matrix
    (f, f, n_C_prev, n_C) = W.shape

    # Retrieve the hyperparameters
    stride = hparameters["stride"]
    pad = hparameters["pad"]

    # Compute the output height and width with the formula above; int() floors the division
    n_H = int((n_H_prev - f + 2 * pad) / stride) + 1
    n_W = int((n_W_prev - f + 2 * pad) / stride) + 1

    # Initialize the convolution output Z with zeros
    Z = np.zeros((m, n_H, n_W, n_C))

    # Create the padded version of A_prev
    A_prev_pad = zero_pad(A_prev, pad)

    for i in range(m):                          # loop over the samples
        a_prev_pad = A_prev_pad[i]              # padded activation of the i-th sample
        for h in range(n_H):                    # loop over the vertical axis of the output
            for w in range(n_W):                # loop over the horizontal axis of the output
                for c in range(n_C):            # loop over the output channels
                    # Locate the current slice
                    vert_start = h * stride     # vertical start position
                    vert_end = vert_start + f   # vertical end position
                    horiz_start = w * stride    # horizontal start position
                    horiz_end = horiz_start + f # horizontal end position

                    # Take the slice through all input channels,
                    # like pushing a straw through a block of plasticine
                    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]

                    # Perform one step of convolution
                    Z[i, h, w, c] = conv_single_step(a_slice_prev, W[:, :, :, c], b[0, 0, 0, c])

    # Verify the output shape
    assert (Z.shape == (m, n_H, n_W, n_C))

    # Cache values for back propagation
    cache = (A_prev, W, b, hparameters)

    return (Z, cache)


# Test it
np.random.seed(1)

A_prev = np.random.randn(10, 4, 4, 3)
W = np.random.randn(2, 2, 3, 8)
b = np.random.randn(1, 1, 1, 8)

hparameters = {"pad": 2, "stride": 1}

Z, cache_conv = conv_forward(A_prev, W, b, hparameters)

print("np.mean(Z) = ", np.mean(Z))
print("cache_conv[0][1][2][3] =", cache_conv[0][1][2][3])
```
The test output is as follows
Pooling layer
The POOL layer reduces the height and width of the input. It helps reduce the amount of computation and makes feature detection more robust to the position of features in the input. There are two types of pooling layer:
- Max pooling: slides an (f, f) window over the input and stores the maximum value of the window in the output.
- Average pooling: slides an (f, f) window over the input and stores the average value of the window in the output.
The pooling layer has a window size f, a hyperparameter that specifies the height and width of the window over which the maximum or average is computed. Define the forward propagation of the pooling layer (the output dimensions are given right below) and test the output.
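Since the pooling layer here uses no padding, its output dimensions are:

n_H = floor((n_H_prev - f) / stride) + 1
n_W = floor((n_W_prev - f) / stride) + 1
n_C = n_C_prev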
```python
# Define the pooling layer forward propagation
def pool_forward(A_prev, hparameters, mode="max"):
    """
    Forward propagation for a pooling layer.

    Parameters:
        A_prev - input data, shape (m, n_H_prev, n_W_prev, n_C_prev)
        hparameters - dictionary containing "f" and "stride"
        mode - pooling mode, "max" or "average"

    Returns:
        A - output of the pooling layer, shape (m, n_H, n_W, n_C)
        cache - values needed for back propagation (the input and the hyperparameters)
    """
    # Retrieve the dimensions of the input data
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve the hyperparameters
    f = hparameters["f"]
    stride = hparameters["stride"]

    # Compute the output dimensions
    n_H = int((n_H_prev - f) / stride) + 1
    n_W = int((n_W_prev - f) / stride) + 1
    n_C = n_C_prev

    # Initialize the output matrix
    A = np.zeros((m, n_H, n_W, n_C))

    for i in range(m):                          # loop over the samples
        for h in range(n_H):                    # loop over the vertical axis of the output
            for w in range(n_W):                # loop over the horizontal axis of the output
                for c in range(n_C):            # loop over the channels
                    # Locate the current slice
                    vert_start = h * stride     # vertical start position
                    vert_end = vert_start + f   # vertical end position
                    horiz_start = w * stride    # horizontal start position
                    horiz_end = horiz_start + f # horizontal end position

                    # Cut out the slice
                    a_slice_prev = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]

                    # Pool the slice
                    if mode == "max":
                        A[i, h, w, c] = np.max(a_slice_prev)
                    elif mode == "average":
                        A[i, h, w, c] = np.mean(a_slice_prev)

    # Verify the output shape
    assert (A.shape == (m, n_H, n_W, n_C))

    # Cache values for back propagation
    cache = (A_prev, hparameters)

    return A, cache


# Test it
np.random.seed(1)
A_prev = np.random.randn(2, 4, 4, 3)
hparameters = {"f": 4, "stride": 1}

A, cache = pool_forward(A_prev, hparameters, mode="max")
print("mode = max")
print("A =", A)
print("----------------------------")
A, cache = pool_forward(A_prev, hparameters, mode="average")
print("mode = average")
print("A =", A)
```
The test output is as follows:
Back propagation of convolutional neural networks
In a deep learning framework you only need to implement forward propagation; the framework handles back propagation, so most deep learning engineers never need to look at its details. The back propagation of convolutional networks is quite involved.
Back propagation of the pooling layer
Even though the pooling layer has no parameters to update during back propagation, you still need to back-propagate the gradient through it in order to compute gradients for the layers that come before it. Before implementing the backward pass of the pooling layer, we first build an auxiliary function named create_mask_from_window(), which does the following:
It creates a "mask" matrix that tracks where the maximum of the input matrix is: the entry is True (1) at the position of the maximum value of x and False (0) everywhere else.
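For example, a small illustration of this behavior:

```python
import numpy as np

x = np.array([[1.0, 3.0],
              [4.0, 2.0]])
# The mask is True only where x attains its maximum value
print(x == np.max(x))
# [[False False]
#  [ True False]]
```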
This function is defined as follows
```python
# Define the helper function for max-pooling back propagation
def create_mask_from_window(x):
    """
    Create a mask from the input matrix that marks the position of its maximum value.

    Parameters:
        x - matrix of shape (f, f)

    Returns:
        mask - matrix of the same shape as x, True at the position of the maximum value of x
    """
    mask = (x == np.max(x))

    return mask


# Test it
np.random.seed(1)

x = np.random.randn(2, 3)
mask = create_mask_from_window(x)

print("x = " + str(x))
print("mask = " + str(mask))
```
The test output is as follows
Define the back propagation function of the pooling layer
```python
def pool_backward(dA, cache, mode="max"):
    """
    Back propagation for a pooling layer.

    Parameters:
        dA - gradient of the cost with respect to the output of the pooling layer, same shape as that output
        cache - values stored during the forward pass of the pooling layer
        mode - pooling mode, "max" or "average"

    Returns:
        dA_prev - gradient of the cost with respect to the input of the pooling layer, same shape as A_prev
    """
    # Retrieve the values from the cache
    (A_prev, hparameters) = cache

    # Retrieve the hyperparameters
    f = hparameters["f"]
    stride = hparameters["stride"]

    # Retrieve the dimensions of A_prev and dA
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    (m, n_H, n_W, n_C) = dA.shape

    # Initialize the output with zeros
    dA_prev = np.zeros_like(A_prev)

    # Start processing the data
    for i in range(m):                          # loop over the samples
        a_prev = A_prev[i]
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    # Locate the slice (use the stride so the positions match the forward pass)
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f

                    # Choose the backward computation according to the pooling mode
                    if mode == "max":
                        # Cut out the slice
                        a_prev_slice = a_prev[vert_start:vert_end, horiz_start:horiz_end, c]
                        # Create the mask marking the position of the maximum
                        mask = create_mask_from_window(a_prev_slice)
                        # Route the gradient to the position of the maximum
                        dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c] += np.multiply(mask, dA[i, h, w, c])

                    elif mode == "average":
                        # Get the gradient value for this output entry
                        da = dA[i, h, w, c]
                        # Filter size
                        shape = (f, f)
                        # Spread the gradient evenly over the window (distribute_value is defined below)
                        dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c] += distribute_value(da, shape)

    # Verify the output shape
    assert (dA_prev.shape == A_prev.shape)

    return dA_prev
```
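The average-pooling branch above calls a helper named distribute_value, which is not defined in this section. A minimal sketch of such a helper, which simply spreads a gradient value evenly over an (f, f) window, might look like this:

```python
import numpy as np

# Helper for average-pooling back propagation (a sketch, not shown in the section above):
# spread the gradient value dz evenly over a window of the given shape.
def distribute_value(dz, shape):
    (n_H, n_W) = shape
    average = dz / (n_H * n_W)       # each position gets an equal share of the gradient
    a = np.ones(shape) * average     # matrix filled with that share
    return a
```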
Test it
```python
# Test it
np.random.seed(1)
A_prev = np.random.randn(5, 5, 3, 2)
hparameters = {"stride": 1, "f": 2}
A, cache = pool_forward(A_prev, hparameters)
dA = np.random.randn(5, 4, 2, 2)

dA_prev = pool_backward(dA, cache, mode="max")
print("mode = max")
print('mean of dA = ', np.mean(dA))
print('dA_prev[1,1] = ', dA_prev[1, 1])
print()
```
The test output results are as follows
Implementing a convolutional neural network with TensorFlow
Create a new tf_juan_ji.py file.
Import related libraries and download related data
Download address: after clicking the link, go to the download page and download the dataset and the necessary helper files (including cnn_utils.py).
The code to import the libraries and load the dataset is as follows:
```python
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import scipy
from PIL import Image
from scipy import ndimage
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()
from tensorflow.python.framework import ops
from cnn_utils import *
import cnn_utils

# Load the dataset
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = cnn_utils.load_dataset()

# View one picture from the dataset
index = 6
plt.imshow(X_train_orig[index])
plt.show()
print("y = " + str(np.squeeze(Y_train_orig[:, index])))
```
The results are as follows
The SIGNS dataset is a collection of pictures of six hand signs representing the digits 0 to 5.
Check dataset dimensions
```python
# Normalize the image data and convert the labels to one-hot vectors
X_train = X_train_orig / 255.
X_test = X_test_orig / 255.
Y_train = convert_to_one_hot(Y_train_orig, 6).T
Y_test = convert_to_one_hot(Y_test_orig, 6).T
```
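To actually check the dimensions, we can print the shapes of the preprocessed arrays (a small sketch that assumes the data loaded above; the exact numbers depend on the downloaded SIGNS dataset):

```python
# Print the shapes of the preprocessed data
print("number of training examples = " + str(X_train.shape[0]))
print("number of test examples = " + str(X_test.shape[0]))
print("X_train shape: " + str(X_train.shape))
print("Y_train shape: " + str(Y_train.shape))
print("X_test shape: " + str(X_test.shape))
print("Y_test shape: " + str(Y_test.shape))
```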
Create placeholders
TensorFlow requires you to create placeholders for the input data that will be fed into the model when the session is run. We now implement the function that creates them. Because we use mini-batches, the number of input samples is not fixed, so we use None for the batch dimension. The dimension of X is therefore [None, n_H0, n_W0, n_C0] and the dimension of Y is [None, n_y].
Define create placeholder function
```python
# Define the placeholder creation function
def create_placeholders(n_H0, n_W0, n_C0, n_y):
    """
    Create placeholders for the TensorFlow session.

    Parameters:
        n_H0 - integer, height of an input image
        n_W0 - integer, width of an input image
        n_C0 - integer, number of input channels
        n_y - integer, number of classes

    Returns:
        X - placeholder for the input data, shape [None, n_H0, n_W0, n_C0], type "float"
        Y - placeholder for the labels of the input data, shape [None, n_y], type "float"
    """
    X = tf.compat.v1.placeholder(tf.float32, [None, n_H0, n_W0, n_C0])
    Y = tf.compat.v1.placeholder(tf.float32, [None, n_y])

    return X, Y


# Test it
X, Y = create_placeholders(64, 64, 3, 6)
print("X = " + str(X))
print("Y = " + str(Y))
```
The test output is as follows
Initialize the parameters
In the original exercise, tf.contrib.layers.xavier_initializer(seed=0) is used to initialize the weights/filters; because tf.contrib was removed in TensorFlow 2, the code below uses a tf.keras Glorot initializer instead. You don't have to worry about the bias variables, since the TensorFlow functions take care of the bias. Also note that you only initialize the weights/filters for the conv2d functions; TensorFlow initializes the layers of the fully connected part automatically.
Define the parameter initialization function
```python
# Initialize the parameters
def initialize_parameters():
    """
    Initialize the weight matrices. The shapes are hard-coded here:
        W1 : [4, 4, 3, 8]
        W2 : [2, 2, 8, 16]

    Returns:
        A dictionary containing the tensors W1 and W2.
    """
    tf.compat.v1.set_random_seed(1)

    W1 = tf.compat.v1.get_variable("W1", [4, 4, 3, 8],
                                   initializer=tf.keras.initializers.glorot_normal(seed=0))
    W2 = tf.compat.v1.get_variable("W2", [2, 2, 8, 16],
                                   initializer=tf.keras.initializers.glorot_normal(seed=0))

    parameters = {"W1": W1,
                  "W2": W2}

    return parameters


# Test it
tf.compat.v1.reset_default_graph()
with tf.compat.v1.Session() as sess_test:
    parameters = initialize_parameters()
    init = tf.compat.v1.global_variables_initializer()
    sess_test.run(init)
    print("W1 = " + str(parameters["W1"].eval()[1, 1, 1]))
    print("W2 = " + str(parameters["W2"].eval()[1, 1, 1]))

    sess_test.close()
```
The test output is as follows
Forward propagation
In TensorFlow, there are built-in functions that can be called directly to perform convolution steps.
- tf.nn.conv2d(X, W1, strides=[1, s, s, 1], padding='SAME'): given an input X and a group of filters W1, this function convolves X with the filters. The strides argument ([1, s, s, 1]) gives the stride for each dimension (m, n_H_prev, n_W_prev, n_C_prev) of the input.
- tf.nn.max_pool(A, ksize=[1, f, f, 1], strides=[1, s, s, 1], padding='SAME'): given an input A, this function applies max pooling using a window of size (f, f) and a stride of (s, s).
- tf.nn.relu(Z1): computes the element-wise ReLU activation of Z1 (which can have any shape).
- tf.contrib.layers.flatten(P): given an input P, this function flattens each example into a 1-D vector while keeping the batch size; it returns a flattened tensor of shape [batch_size, k]. (The code below uses the TF2-compatible tf.compat.v1.layers.flatten instead.)
- tf.contrib.layers.fully_connected(F, num_outputs): given a flattened input F, it returns the output computed by a fully connected layer. (The code below uses tf.compat.v1.layers.dense instead.)
When we implement forward propagation, we first define the overall structure of our model.

Specifically, we will use the following parameters in all of the steps:
- Conv2D: stride 1, padding "SAME"
- ReLU
- Max pool: 8x8 filter, 8x8 stride, padding "SAME"
- Conv2D: stride 1, padding "SAME"
- ReLU
- Max pool: 4x4 filter, 4x4 stride, padding "SAME"
- Flatten the previous output.
- FULLYCONNECTED (FC) layer: apply a fully connected layer without a nonlinear activation function. Do not call softmax here; this results in 6 neurons in the output layer, which are later passed to softmax. In TensorFlow, softmax and the cost function are combined into a single function, which is called separately when computing the loss.
Define forward propagation function
```python
# Define the forward propagation function
def forward_propagation(X, parameters):
    """
    Implement forward propagation for the model:
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED

    Parameters:
        X - placeholder for the input data, shape (batch size, n_H0, n_W0, n_C0)
        parameters - python dictionary containing "W1" and "W2"

    Returns:
        Z3 - output of the last LINEAR unit
    """
    W1 = parameters['W1']
    W2 = parameters['W2']

    # Conv2D: stride 1, padding "SAME"
    Z1 = tf.nn.conv2d(X, W1, strides=[1, 1, 1, 1], padding="SAME")
    # ReLU
    A1 = tf.nn.relu(Z1)
    # Max pool: window 8x8, stride 8x8, padding "SAME"
    P1 = tf.nn.max_pool(A1, ksize=[1, 8, 8, 1], strides=[1, 8, 8, 1], padding="SAME")

    # Conv2D: stride 1, padding "SAME"
    Z2 = tf.nn.conv2d(P1, W2, strides=[1, 1, 1, 1], padding="SAME")
    # ReLU
    A2 = tf.nn.relu(Z2)
    # Max pool: window 4x4, stride 4x4, padding "SAME"
    P2 = tf.nn.max_pool(A2, ksize=[1, 4, 4, 1], strides=[1, 4, 4, 1], padding="SAME")

    # Flatten the output of the previous layer
    P = tf.compat.v1.layers.flatten(P2)

    # Fully connected (FC) layer without a nonlinear activation function
    Z3 = tf.compat.v1.layers.dense(P, 6)

    return Z3


# Test it
tf.compat.v1.reset_default_graph()
np.random.seed(1)

with tf.compat.v1.Session() as sess_test:
    X, Y = create_placeholders(64, 64, 3, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)

    init = tf.compat.v1.global_variables_initializer()
    sess_test.run(init)

    a = sess_test.run(Z3, {X: np.random.randn(2, 64, 64, 3), Y: np.random.randn(2, 6)})
    print("Z3 = " + str(a))

    sess_test.close()
```
Test output
Calculate loss
- tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y): computes the softmax cross-entropy loss; it applies the softmax activation and computes the resulting loss.
- tf.reduce_mean: computes the mean of the elements of a tensor; here it is used to average the losses over all training examples to obtain the overall cost.
```python
# Compute the cost
def compute_cost(Z3, Y):
    """
    Compute the cost.

    Parameters:
        Z3 - output of the last LINEAR unit of forward propagation, shape (number of samples, 6)
        Y - placeholder for the labels, same shape as Z3

    Returns:
        cost - the computed cost
    """
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y))

    return cost


# Test it
tf.compat.v1.reset_default_graph()

with tf.compat.v1.Session() as sess_test:
    np.random.seed(1)
    X, Y = create_placeholders(64, 64, 3, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)

    init = tf.compat.v1.global_variables_initializer()
    sess_test.run(init)
    a = sess_test.run(cost, {X: np.random.randn(4, 64, 64, 3), Y: np.random.randn(4, 6)})
    print("cost = " + str(a))

    sess_test.close()
```
Test output
Build the model
You will combine the helper functions implemented above to build the model and train it on the SIGNS dataset. The model should perform the following operations:
- Create placeholders
- Initialize parameters
- Forward propagation
- Compute the cost
- Create an optimizer
Finally, you create a session and run a for loop over num_epochs epochs, fetching the mini-batches and running the optimizer on each mini-batch.
The model is defined as follows:
```python
# Build the model
def model(X_train, Y_train, X_test, Y_test, learning_rate=0.009,
          num_epochs=100, minibatch_size=64, print_cost=True, isPlot=True):
    """
    Implement a three-layer convolutional network with TensorFlow:
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED

    Parameters:
        X_train - training data, shape (None, 64, 64, 3)
        Y_train - labels of the training data, shape (None, n_y = 6)
        X_test - test data, shape (None, 64, 64, 3)
        Y_test - labels of the test data, shape (None, n_y = 6)
        learning_rate - learning rate
        num_epochs - number of passes over the whole dataset
        minibatch_size - size of each mini-batch
        print_cost - whether to print the cost (printed every 5 epochs)
        isPlot - whether to plot the cost curve

    Returns:
        train_accuracy - real number, accuracy on the training set
        test_accuracy - real number, accuracy on the test set
        parameters - the learned parameters
    """
    ops.reset_default_graph()          # allow rerunning the model without overwriting tf variables
    tf.compat.v1.set_random_seed(1)    # keep the results reproducible
    seed = 3                           # numpy random seed
    (m, n_H0, n_W0, n_C0) = X_train.shape
    n_y = Y_train.shape[1]
    costs = []

    # Create placeholders with the current dimensions
    X, Y = create_placeholders(n_H0, n_W0, n_C0, n_y)

    # Initialize the parameters
    parameters = initialize_parameters()

    # Forward propagation
    Z3 = forward_propagation(X, parameters)

    # Compute the cost
    cost = compute_cost(Z3, Y)

    # Back propagation: the framework handles it, we only need to choose an optimizer
    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

    # Initialize all global variables
    init = tf.compat.v1.global_variables_initializer()

    # Start the session
    with tf.compat.v1.Session() as sess:
        # Initialize the parameters
        sess.run(init)

        # Loop over the dataset
        for epoch in range(num_epochs):
            minibatch_cost = 0
            num_minibatches = int(m / minibatch_size)   # number of mini-batches
            seed = seed + 1
            minibatches = cnn_utils.random_mini_batches(X_train, Y_train, minibatch_size, seed)

            # Process each mini-batch
            for minibatch in minibatches:
                # Select a mini-batch
                (minibatch_X, minibatch_Y) = minibatch
                # Minimize the cost on this mini-batch
                _, temp_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                # Accumulate the cost of the mini-batch
                minibatch_cost += temp_cost / num_minibatches

            # Print the cost every 5 epochs
            if print_cost and epoch % 5 == 0:
                print("Epoch " + str(epoch) + ", cost: " + str(minibatch_cost))

            # Record the cost
            if epoch % 1 == 0:
                costs.append(minibatch_cost)

        # After training, plot the cost curve
        if isPlot:
            plt.plot(np.squeeze(costs))
            plt.ylabel('cost')
            plt.xlabel('epochs')
            plt.title("Learning rate = " + str(learning_rate))
            plt.show()

        # Evaluate the model
        ## Compute the current predictions
        predict_op = tf.argmax(Z3, 1)
        correct_prediction = tf.equal(predict_op, tf.argmax(Y, 1))

        ## Compute the accuracy
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print("correct_prediction accuracy = " + str(accuracy))

        train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
        test_accuracy = accuracy.eval({X: X_test, Y: Y_test})

        print("Training set accuracy: " + str(train_accuracy))
        print("Test set accuracy: " + str(test_accuracy))

        return (train_accuracy, test_accuracy, parameters)
```
After starting the model, the output is as follows
```python
# Start the model
_, _, parameters = model(X_train, Y_train, X_test, Y_test, num_epochs=150)
```