Author home page (Silicon-Based Workshop of Slow Fire Rock Sugar): blog of Slow Fire Rock Sugar (Wang Wenbing) on CSDN
Website of this article: https://blog.csdn.net/HiWangWenBing/article/details/120607797
Contents
Introduction: deep learning model framework
Chapter 1 Business domain analysis
1.1 Step 1-1: business domain analysis
1.2 Step 1-2: business modeling
1.6 Prerequisites for the code examples
Chapter 2 Definition of the forward computation model
2.1 Step 2-1: dataset selection
2.2 Step 2-2: data preprocessing
2.3 Step 2-3: neural network modeling
2.4 Step 2-4: neural network output
Chapter 3 Definition of the backward computation model
3.1 Step 3-1: define the loss function
3.2 Step 3-2: define the optimizer
3.4 Step 3-4: model visualization
3.5 Step 3-5: model validation
4.2 Step 4-2: model loading
Introduction: deep learning model framework
Chapter 1 Business domain analysis
1.1 Step 1-1: business domain analysis
(1) Business requirements
Objective requirements:
Given an image of a handwritten digit, determine which digit it shows.
(2) Business analysis
In essence this is a multi-class classification task with 10 classes: given the feature data of an image (here, the single-channel pixel values of one picture), decide which digit class it belongs to.
For this case a shallow fully connected network is sufficient; a deep neural network is not required.
1.2 Step 1-2: business modeling
(1) Single layer neural network
Input: 784
Output: 10
(2) Two layer neural network
First layer (hidden layer): 784 * m + m parameters, where 784 = 28 * 28
Second layer (output layer): m * 10 + 10 parameters
When m = 256, the parameter counts are:
First layer (hidden layer): 784 * 256 + 256 = 200,960
Second layer (output layer): 256 * 10 + 10 = 2,570
Total: 203,530 parameters (about 200,000)
(3) Three layer neural network
A three-layer neural network with two hidden layers is defined:
L0 (input layer): X: 784 = 28 * 28
L1 (hidden layer 1): L1 = X * W1 + B1: 784 * 256 + 256 => 200,960
L2 (hidden layer 2): L2 = L1 * W2 + B2: 256 * 64 + 64 => 16,448
L3 (output layer): L3 = L2 * W3 + B3: 64 * 10 + 10 => 650
About 218,000 parameters in total (200,960 + 16,448 + 650 = 218,058)
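The parameter counts in this step all follow one rule: a fully connected layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases. Below is a minimal sketch of that arithmetic. The helper name count_fc_params is hypothetical and is not part of the article's code.

```python
def count_fc_params(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first,
    e.g. [784, 256, 10] for one hidden layer of 256 units.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


single_layer = count_fc_params([784, 10])
two_layer = count_fc_params([784, 256, 10])
three_layer = count_fc_params([784, 256, 64, 10])
print(single_layer, two_layer, three_layer)
```

For the three layouts discussed here the sketch gives 7,850, 203,530 and 218,058 parameters.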
(4) Selection of activation function
1.3 training model
1.4 validation model
1.5 overall structure
1.6 Prerequisites for the code examples
    # Environment preparation
    import numpy as np                           # NumPy array library
    import math                                  # math library
    import matplotlib.pyplot as plt              # plotting library
    import torch                                 # PyTorch base library
    import torch.nn as nn                        # neural network layers
    import torch.nn.functional as F              # functional neural network API
    # The four aliases below are used by the later steps; the modules are
    # assumed from how the aliases are called (MNIST, ToTensor, make_grid, DataLoader).
    import torch.utils.data as data_utils
    import torchvision.datasets as dataset
    import torchvision.transforms as transforms
    import torchvision.utils as utils

    print("Hello World")
    print(torch.__version__)
    print(torch.cuda.is_available())
Output:
    Hello World
    1.8.0
    False
Chapter 2 Definition of the forward computation model
2.1 Step 2-1: dataset selection
(1) MNIST dataset: http://yann.lecun.com/exdb/
Note: you can download the sample data locally to improve the efficiency of program debugging. The final product can download the data remotely.
(2) Sample data and sample label format
(3) Source code example -- download and read data
    # 2-1 Prepare the training set
    train_data = dataset.MNIST(root="mnist",
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)

    # 2-1 Prepare the test set
    test_data = dataset.MNIST(root="mnist",
                              train=False,
                              transform=transforms.ToTensor(),
                              download=True)

    print(train_data)
    print("size=", len(train_data))
    print("")
    print(test_data)
    print("size=", len(test_data))
Output:
    Dataset MNIST
        Number of datapoints: 60000
        Root location: mnist
        Split: Train
        StandardTransform
    Transform: ToTensor()
    size= 60000

    Dataset MNIST
        Number of datapoints: 10000
        Root location: mnist
        Split: Test
        StandardTransform
    Transform: ToTensor()
    size= 10000
2.2 Step 2-2: data preprocessing
(1) Display the original image without superimposed noise
    # Display the original image without superimposed noise
    # Get the data of one picture
    print("Original picture")
    image, label = train_data[0]
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nSingle channel original image: numpy")
    image = image.numpy().transpose(1, 2, 0)   # (C, H, W) -> (H, W, C) for imshow
    print("numpy image shape:", image.shape)
    print("numpy image label:", label)

    print("\nNo noise is superimposed, and the original image is displayed")
    plt.imshow(image)
    plt.show()
Output:
    Original picture
    torch image shape: torch.Size([1, 28, 28])
    torch image label: 5

    Single channel original image: numpy
    numpy image shape: (28, 28, 1)
    numpy image label: 5

    No noise is superimposed, and the original image is displayed
(2) Display the original image with superimposed noise
    # Display the original image with superimposed noise
    # Get the data of one picture
    print("Original picture")
    image, label = train_data[0]
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nSingle channel original image: numpy")
    image = image.numpy().transpose(1, 2, 0)   # (C, H, W) -> (H, W, C) for imshow
    print("numpy image shape:", image.shape)
    print("numpy image label:", label)

    print("\nSuperimposed noise, smooth display")
    # What the article calls "superimposed noise" is a scale and shift
    # of every pixel value: image * std + mean
    std = [0.5]
    mean = [0.5]
    image = image * std + mean
    plt.imshow(image)
    plt.show()
Output:
    Original picture
    torch image shape: torch.Size([1, 28, 28])
    torch image label: 5

    Single channel original image: numpy
    numpy image shape: (28, 28, 1)
    numpy image label: 5

    Superimposed noise, smooth display
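The expression image * std + mean used above has the same form as the inverse of the normalization (x - mean) / std. Below is a minimal pure-Python sketch of that relationship, assuming mean = 0.5 and std = 0.5 as in the code. The function names are hypothetical. Note that the datasets in this article are built with ToTensor() only, so no normalization has been applied beforehand, and the operation simply maps pixel values from [0, 1] to [0.5, 1].

```python
def normalize(pixels, mean, std):
    """Map each pixel x to (x - mean) / std."""
    return [(p - mean) / std for p in pixels]


def denormalize(pixels, mean, std):
    """Undo normalize(): map each value y back to y * std + mean."""
    return [p * std + mean for p in pixels]


raw = [0.0, 0.25, 1.0]                       # pixel values as produced by ToTensor()
restored = denormalize(normalize(raw, 0.5, 0.5), 0.5, 0.5)
shifted = denormalize(raw, 0.5, 0.5)         # what the code above does to raw pixels
print(restored)
print(shifted)
```

The round trip returns the raw values, while applying the scale and shift directly to raw pixels only brightens the picture.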
(3) Superimposed noise, grayscale display
    # Superimposed noise, grayscale display
    print("Original picture")
    image, label = train_data[0]
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nThree channel gray image: torch")
    image = utils.make_grid(image)
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nThree channel gray image: numpy")
    image = image.numpy().transpose(1, 2, 0)
    print("numpy image shape:", image.shape)
    print("numpy image label:", label)

    print("\nSuperimposed noise, smooth display")
    std = [0.5]
    mean = [0.5]
    image = image * std + mean
    plt.imshow(image)
    plt.show()
Output:
    Original picture
    torch image shape: torch.Size([1, 28, 28])
    torch image label: 5

    Three channel gray image: torch
    torch image shape: torch.Size([3, 28, 28])
    torch image label: 5

    Three channel gray image: numpy
    numpy image shape: (28, 28, 3)
    numpy image label: 5

    Superimposed noise, smooth display
(4) No noise superimposed, black and white display
    # No noise superimposed, black and white display
    print("Original picture")
    image, label = train_data[0]
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nThree channel gray image: torch")
    image = utils.make_grid(image)
    print("torch image shape:", image.shape)
    print("torch image label:", label)

    print("\nThree channel gray image: numpy")
    image = image.numpy().transpose(1, 2, 0)
    print("numpy image shape:", image.shape)
    print("numpy image label:", label)

    print("\nNo noise superimposed, black and white display")
    plt.imshow(image)
    plt.show()
    print("numpy image shape:", image.shape)
Output:
    Original picture
    torch image shape: torch.Size([1, 28, 28])
    torch image label: 5

    Three channel gray image: torch
    torch image shape: torch.Size([3, 28, 28])
    torch image label: 5

    Three channel gray image: numpy
    numpy image shape: (28, 28, 3)
    numpy image label: 5

    No noise superimposed, black and white display
(5) Batch data reading
    # Batch data reading
    train_loader = data_utils.DataLoader(dataset=train_data,
                                         batch_size=64,
                                         shuffle=True)

    test_loader = data_utils.DataLoader(dataset=test_data,
                                        batch_size=64,
                                        shuffle=True)

    print(train_loader)
    print(test_loader)
    print(len(train_loader), len(train_data)/64)
    print(len(test_loader), len(test_data)/64)
Output:
    <torch.utils.data.dataloader.DataLoader object at 0x000002461EF4A1C0>
    <torch.utils.data.dataloader.DataLoader object at 0x000002461ED66610>
    938 937.5
    157 156.25
(6) Display one batch of pictures
    # Show one batch of pictures
    print("Get a batch group picture")
    imgs, labels = next(iter(train_loader))
    print(imgs.shape)
    print(labels.shape)
    print(labels.size()[0])

    print("\nMerge into a three channel gray image")
    images = utils.make_grid(imgs)
    print(images.shape)
    print(labels.shape)

    print("\nConvert to imshow format")
    images = images.numpy().transpose(1, 2, 0)
    print(images.shape)
    print(labels.shape)

    print("\nShow sample label")
    # Print the picture labels, 8 per row
    for i in range(labels.size()[0]):
        print(labels[i], end=" ")
        # Line feed after every 8 labels
        if (i + 1) % 8 == 0:
            print()

    print("\ndisplay picture")
    plt.imshow(images)
    plt.show()
Output:
    Get a batch group picture
    torch.Size([64, 1, 28, 28])
    torch.Size([64])
    64

    Merge into a three channel gray image
    torch.Size([3, 242, 242])
    torch.Size([64])

    Convert to imshow format
    (242, 242, 3)
    torch.Size([64])

    Show sample label
    tensor(0) tensor(8) tensor(3) tensor(7) tensor(5) tensor(7) tensor(9) tensor(7)
    tensor(1) tensor(1) tensor(1) tensor(8) tensor(8) tensor(6) tensor(0) tensor(1)
    tensor(4) tensor(8) tensor(1) tensor(3) tensor(3) tensor(6) tensor(4) tensor(4)
    tensor(0) tensor(5) tensor(8) tensor(5) tensor(9) tensor(3) tensor(7) tensor(5)
    tensor(2) tensor(1) tensor(0) tensor(6) tensor(8) tensor(8) tensor(9) tensor(6)
    tensor(1) tensor(3) tensor(5) tensor(3) tensor(4) tensor(4) tensor(3) tensor(1)
    tensor(4) tensor(1) tensor(4) tensor(4) tensor(9) tensor(8) tensor(7) tensor(2)
    tensor(3) tensor(1) tensor(2) tensor(0) tensor(8) tensor(1) tensor(1) tensor(4)

    display picture
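The 242 x 242 size of the merged image can be worked out by hand. Below is a minimal sketch, assuming the make_grid defaults of 8 images per row (nrow=8) and 2 pixels of padding (padding=2). The helper name grid_shape is hypothetical.

```python
import math


def grid_shape(num_images, height, width, nrow=8, padding=2):
    """Return (grid_height, grid_width) of the merged image.

    Every tile is enlarged by `padding` pixels, and one extra
    padding strip closes the grid at the bottom and right edges.
    """
    cols = min(nrow, num_images)
    rows = math.ceil(num_images / cols)
    grid_height = rows * (height + padding) + padding
    grid_width = cols * (width + padding) + padding
    return grid_height, grid_width


print(grid_shape(64, 28, 28))
```

For a batch of 64 pictures of 28 x 28 pixels this gives 8 * 30 + 2 = 242 in each direction, matching the shape printed above. The three channels come from make_grid replicating the single gray channel.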
2.3 Step 2-3: neural network modeling
(1) Single layer neural network + softmax
    # 2-3 Define the network model: single-layer neural network
    class NetA(torch.nn.Module):
        # Define the network
        def __init__(self, n_feature, n_output):
            super(NetA, self).__init__()
            self.fc1 = nn.Linear(n_feature, n_output)
            self.softmax = nn.Softmax(dim=1)

        # Define the forward computation
        def forward(self, x):
            # The input has shape torch.Size([64, 1, 28, 28]) and must be flattened to (64, 784)
            x = x.view(x.size()[0], -1)   # -1 lets view() infer the remaining dimension
            fc1 = self.fc1(x)
            out = self.softmax(fc1)
            return out

    model_a = NetA(28*28, 10)
    print(model_a)
    print(model_a.parameters)
    print(model_a.parameters())
Output:
    NetA(
      (fc1): Linear(in_features=784, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )
    <bound method Module.parameters of NetA(
      (fc1): Linear(in_features=784, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )>
    <generator object Module.parameters at 0x000002461EDD8900>
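The Softmax(dim=1) layer turns the 10 raw scores of each sample into values between 0 and 1 that sum to 1, so they can be read as class probabilities. Below is a minimal pure-Python sketch of that computation for one sample. The scores are made-up numbers for illustration only.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)                                # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]                                 # illustrative raw outputs for 3 classes
probs = softmax(scores)
predicted_class = probs.index(max(probs))                # the class with the highest probability
print(probs, predicted_class)
```

Softmax keeps the ordering of the scores, so the predicted class is the same whether the argmax is taken before or after it.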
(2) Two layer fully connected neural network
    # 2-3 Define the network model: two-layer fully connected neural network
    class NetB(torch.nn.Module):
        # Define the network
        def __init__(self, n_feature, n_hidden, n_output):
            super(NetB, self).__init__()
            self.fc1 = nn.Linear(n_feature, n_hidden)
            self.fc2 = nn.Linear(n_hidden, n_output)
            self.softmax = nn.Softmax(dim=1)

        # Define the forward computation
        def forward(self, x):
            # The input has shape torch.Size([64, 1, 28, 28]) and must be flattened to (64, 784)
            x = x.view(x.size()[0], -1)   # -1 lets view() infer the remaining dimension
            fc1 = self.fc1(x)
            fc2 = self.fc2(fc1)
            out = self.softmax(fc2)
            return out

    model_b = NetB(28*28, 32, 10)
    print(model_b)
    print(model_b.parameters)
    print(model_b.parameters())
Output:
    NetB(
      (fc1): Linear(in_features=784, out_features=32, bias=True)
      (fc2): Linear(in_features=32, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )
    <bound method Module.parameters of NetB(
      (fc1): Linear(in_features=784, out_features=32, bias=True)
      (fc2): Linear(in_features=32, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )>
    <generator object Module.parameters at 0x000002461EDD8190>
(3) Two layer fully connected neural network with relu
    # 2-3 Define the network model: two-layer fully connected neural network with relu
    class NetC(torch.nn.Module):
        # Define the network
        def __init__(self, n_feature, n_hidden, n_output):
            super(NetC, self).__init__()
            self.fc1 = nn.Linear(n_feature, n_hidden)
            self.relu1 = torch.relu
            self.fc2 = nn.Linear(n_hidden, n_output)
            self.softmax = nn.Softmax(dim=1)

        # Define the forward computation
        def forward(self, x):
            # The input has shape torch.Size([64, 1, 28, 28]) and must be flattened to (64, 784)
            x = x.view(x.size()[0], -1)   # -1 lets view() infer the remaining dimension
            fc1 = self.fc1(x)
            a1 = self.relu1(fc1)
            fc2 = self.fc2(a1)
            out = self.softmax(fc2)
            return out

    model_c = NetC(28*28, 32, 10)
    print(model_c)
    print(model_c.parameters)
    print(model_c.parameters())
Output:
    NetC(
      (fc1): Linear(in_features=784, out_features=32, bias=True)
      (fc2): Linear(in_features=32, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )
    <bound method Module.parameters of NetC(
      (fc1): Linear(in_features=784, out_features=32, bias=True)
      (fc2): Linear(in_features=32, out_features=10, bias=True)
      (softmax): Softmax(dim=1)
    )>
    <generator object Module.parameters at 0x000002461F0570B0>
(4) Convolutional neural network
    # 2-3 Define the network model: convolutional neural network
    class CNN(nn.Module):
        def __init__(self):
            super(CNN, self).__init__()
            self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
            self.fc1 = nn.Linear(64*7*7, 1024)   # Two poolings, so 7 * 7 rather than 14 * 14
            self.fc2 = nn.Linear(1024, 512)
            self.fc3 = nn.Linear(512, 10)
            # self.dp = nn.Dropout(p=0.5)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 64 * 7 * 7)   # Flatten each sample into a one-dimensional vector
            x = F.relu(self.fc1(x))
            # x = self.fc3(x)
            # self.dp(x)
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            # x = F.log_softmax(x, dim=1)   # pairs with NLLLoss(), not cross entropy
            return x

    model_d = CNN()
    print(model_d)
    print(model_d.parameters)
    print(model_d.parameters())
Output:
    CNN(
      (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (fc1): Linear(in_features=3136, out_features=1024, bias=True)
      (fc2): Linear(in_features=1024, out_features=512, bias=True)
      (fc3): Linear(in_features=512, out_features=10, bias=True)
    )
    <bound method Module.parameters of CNN(
      (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (fc1): Linear(in_features=3136, out_features=1024, bias=True)
      (fc2): Linear(in_features=1024, out_features=512, bias=True)
      (fc3): Linear(in_features=512, out_features=10, bias=True)
    )>
    <generator object Module.parameters at 0x000002461F0570B0>
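The in_features=3136 of fc1 can be checked by tracing the spatial size of a 28 x 28 picture through the convolution and pooling layers. Below is a minimal sketch using the usual output-size formula for convolution and pooling windows. The helper name out_size is hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output width (or height) of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                               # MNIST pictures are 28 x 28
size = out_size(size, kernel=3, stride=1, padding=1)    # conv1 keeps the size
after_conv1 = size
size = out_size(size, kernel=2, stride=2)               # first pooling halves it
after_pool1 = size
size = out_size(size, kernel=3, stride=1, padding=1)    # conv2 keeps the size
size = out_size(size, kernel=2, stride=2)               # second pooling halves it again
after_pool2 = size

flat_features = 64 * after_pool2 * after_pool2          # 64 channels from conv2
print(after_conv1, after_pool1, after_pool2, flat_features)
```

The size goes 28, 14, 7, and 64 * 7 * 7 = 3136, which is why fc1 is declared as nn.Linear(64*7*7, 1024).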
2.4 Step 2-4: neural network output
    # 2-4 Define the network prediction output
    # y_pred = model.forward(x_train)
    # print(y_pred.shape)
Because the input is read in batches, no prediction output is shown at this point.
Chapter 3 Definition of the backward computation model
3.1 Step 3-1: define the loss function
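The cross-entropy loss function is used here. nn.MSELoss() cannot be applied directly to predictions of shape [64, 10] and integer class labels of shape [64], and the training log in step 3-3 starts at a loss of about 2.3 (close to ln 10), which is the value cross-entropy gives for an untrained 10-class model.
    # 3-1 Define the loss function: cross-entropy loss
    loss_fn = nn.CrossEntropyLoss()
    print(loss_fn)
Output:
    CrossEntropyLoss()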
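The first loss value in the training log of step 3-3 is about 2.3. That is the value cross-entropy gives for a 10-class model that has not learned anything yet, which suggests cross-entropy is the loss actually in play. Below is a minimal pure-Python sketch of the per-sample cross-entropy, the negative log of the probability assigned to the true class. The function name and the probability vectors are illustrative assumptions.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy of one sample: -log of the probability of the true class."""
    return -math.log(probs[target])


uniform = [0.1] * 10                      # an untrained model: every digit equally likely
loss_untrained = cross_entropy(uniform, target=5)

confident = [0.01] * 10                   # a trained model that is fairly sure about class 5
confident[5] = 0.91
loss_confident = cross_entropy(confident, target=5)

print(loss_untrained, loss_confident)
```

The untrained case evaluates to ln 10, and the loss shrinks toward 0 as the probability of the true class approaches 1.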
3.2 Step 3-2: define the optimizer
    # 3-2 Define the optimizer
    model = model_a          # the model to train
    Learning_rate = 0.01     # learning rate

    # optimizer = SGD: basic gradient descent method
    # parameters: the parameters to be optimized
    # lr: the learning rate
    # optimizer = torch.optim.Adam(model.parameters(), lr=Learning_rate)
    optimizer = torch.optim.SGD(model.parameters(), lr=Learning_rate, momentum=0.9)
    print(optimizer)
Output:
    SGD (
    Parameter Group 0
        dampening: 0
        lr: 0.01
        momentum: 0.9
        nesterov: False
        weight_decay: 0
    )
(1) Select the model to train. The simplest model is selected here: a single-layer fully connected neural network with multiple inputs and multiple outputs.
(2) Set momentum: momentum=0.9
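To see what momentum=0.9 does, here is a minimal pure-Python sketch of one SGD-with-momentum update on a single scalar weight. It assumes the update form documented for torch.optim.SGD with dampening 0 and nesterov False (velocity = momentum * velocity + grad, then weight = weight - lr * velocity). The function name and the toy objective f(w) = w**2 are illustrative and not part of the article.

```python
def sgd_momentum_step(w, velocity, grad, lr=0.01, momentum=0.9):
    """One update: accumulate the gradient into the velocity, then move the weight."""
    velocity = momentum * velocity + grad
    w = w - lr * velocity
    return w, velocity


w, v = 1.0, 0.0
trajectory = [w]
for _ in range(200):
    grad = 2.0 * w                 # derivative of the toy objective f(w) = w**2
    w, v = sgd_momentum_step(w, v, grad)
    trajectory.append(w)

print(trajectory[1], trajectory[2], trajectory[-1])
```

The velocity carries over part of the previous step, so consecutive gradients that point the same way build up larger steps than plain gradient descent would take with the same learning rate.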
3.3 Step 3-3: model training
Because the training set is large, the data is read in batches. Each iteration reads one batch of samples; the batch size, and therefore the number of batches, is fixed when the batch reader (loader) is created.
During training, the batch reader supplies the batches one after another. Every batch updates the same weights, so training on later batches can partly change what was learned from earlier ones, and several passes over the data are needed. In the code below this appears as two nested loops: the outer loop over epochs and the inner loop over batches.
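The batch counts printed by the loaders in step 2-2 (938 and 157) follow from the dataset sizes and the batch size of 64. Below is a minimal sketch, assuming the DataLoader default of keeping the last, smaller batch (drop_last=False). The helper name batch_layout is hypothetical.

```python
import math


def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the last batch)."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch


train_batches, train_last = batch_layout(60000, 64)
test_batches, test_last = batch_layout(10000, 64)
print(train_batches, train_last)
print(test_batches, test_last)
```

The training set yields 938 batches, the last one holding 32 pictures, and the test set yields 157 batches, the last one holding 16. This is why the code computes accuracy from the actual size of each batch rather than from a fixed 64.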
    # 3-3 Model training
    epochs = 3                 # number of passes over the training set
    loss_history = []          # loss of every batch during training
    accuracy_history = []      # accuracy of every batch during training
    accuracy_batch = 0.0

    for i in range(0, epochs):
        for j, (x_train, y_train) in enumerate(train_loader):
            # (0) Reset the gradients held by the optimizer
            optimizer.zero_grad()

            # (1) Forward computation
            y_pred = model(x_train)

            # (2) Compute the loss
            loss = loss_fn(y_pred, y_train)

            # (3) Backward pass: compute the gradients
            loss.backward()

            # (4) Update the parameters
            optimizer.step()

            # Record the loss of this batch
            loss_history.append(loss.item())

            # Record the accuracy of this batch
            number_batch = y_train.size()[0]                      # number of pictures in the batch
            _, predicted = torch.max(y_pred.data, dim=1)
            correct_batch = (predicted == y_train).sum().item()   # number of correct predictions
            accuracy_batch = 100 * correct_batch / number_batch
            accuracy_history.append(accuracy_batch)

            if j % 100 == 0:
                # With str.format, '%%' is printed literally as two percent signs
                print('epoch {} batch {} In {} loss = {:.4f} accuracy = {:.4f}%%'.format(
                    i, j, len(train_data)/64, loss.item(), accuracy_batch))

    print("\nIteration completion")
    print("final loss =", loss.item())
    print("final accu =", accuracy_batch)
Output:
    epoch 0 batch 0 In 937.5 loss = 2.3058 accuracy = 6.2500%%
    epoch 0 batch 100 In 937.5 loss = 2.0725 accuracy = 57.8125%%
    epoch 0 batch 200 In 937.5 loss = 1.9046 accuracy = 68.7500%%
    epoch 0 batch 300 In 937.5 loss = 1.7814 accuracy = 75.0000%%
    epoch 0 batch 400 In 937.5 loss = 1.7647 accuracy = 78.1250%%
    epoch 0 batch 500 In 937.5 loss = 1.7280 accuracy = 84.3750%%
    epoch 0 batch 600 In 937.5 loss = 1.7284 accuracy = 79.6875%%
    epoch 0 batch 700 In 937.5 loss = 1.7081 accuracy = 82.8125%%
    epoch 0 batch 800 In 937.5 loss = 1.6773 accuracy = 85.9375%%
    epoch 0 batch 900 In 937.5 loss = 1.6886 accuracy = 85.9375%%
    epoch 1 batch 0 In 937.5 loss = 1.6671 accuracy = 82.8125%%
    epoch 1 batch 100 In 937.5 loss = 1.6914 accuracy = 81.2500%%
    epoch 1 batch 200 In 937.5 loss = 1.7119 accuracy = 78.1250%%
    epoch 1 batch 300 In 937.5 loss = 1.6585 accuracy = 87.5000%%
    epoch 1 batch 400 In 937.5 loss = 1.6913 accuracy = 81.2500%%
    epoch 1 batch 500 In 937.5 loss = 1.6074 accuracy = 90.6250%%
    epoch 1 batch 600 In 937.5 loss = 1.6062 accuracy = 90.6250%%
    epoch 1 batch 700 In 937.5 loss = 1.6187 accuracy = 90.6250%%
    epoch 1 batch 800 In 937.5 loss = 1.6249 accuracy = 90.6250%%
    epoch 1 batch 900 In 937.5 loss = 1.6138 accuracy = 89.0625%%
    epoch 2 batch 0 In 937.5 loss = 1.6205 accuracy = 90.6250%%
    epoch 2 batch 100 In 937.5 loss = 1.5862 accuracy = 95.3125%%
    epoch 2 batch 200 In 937.5 loss = 1.6430 accuracy = 84.3750%%
    epoch 2 batch 300 In 937.5 loss = 1.5834 accuracy = 90.6250%%
    epoch 2 batch 400 In 937.5 loss = 1.5672 accuracy = 95.3125%%
    epoch 2 batch 500 In 937.5 loss = 1.5965 accuracy = 92.1875%%
    epoch 2 batch 600 In 937.5 loss = 1.6430 accuracy = 87.5000%%
    epoch 2 batch 700 In 937.5 loss = 1.5538 accuracy = 98.4375%%
    epoch 2 batch 800 In 937.5 loss = 1.5700 accuracy = 92.1875%%
    epoch 2 batch 900 In 937.5 loss = 1.6196 accuracy = 89.0625%%

    Iteration completion
    final loss = 1.6274147033691406
    final accu = 87.5
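The logged loss levels off near 1.6 instead of approaching 0, even while accuracy keeps rising. One possible explanation, stated here as an assumption because the text does not confirm which loss produced the log: if nn.CrossEntropyLoss is applied to the output of NetA, which already ends in Softmax, the loss applies a second softmax to values confined to [0, 1], and it then cannot fall below ln(1 + 9/e), about 1.46. A minimal pure-Python sketch of that lower bound, with a hypothetical function name:

```python
import math


def cross_entropy_from_scores(scores, target):
    """Log-softmax followed by negative log-likelihood, for one sample."""
    log_sum_exp = math.log(sum(math.exp(s) for s in scores))
    return log_sum_exp - scores[target]


# Best case for a model whose outputs are already probabilities:
# probability 1 for the true class and 0 for the other nine.
perfect_probs = [0.0] * 10
perfect_probs[3] = 1.0
loss_floor = cross_entropy_from_scores(perfect_probs, target=3)
print(loss_floor)
```

Under this assumption, returning the raw outputs of the last linear layer from forward(), as the CNN model in step 2-3 does, would remove the floor.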
3.4 Step 3-4: model visualization
(1) Forward operation data
(2) Backward loss iterative process
    # Plot the loss history
    plt.grid()
    plt.xlabel("iters")
    plt.ylabel("")
    plt.title("loss", fontsize=12)
    plt.plot(loss_history, "r")
    plt.show()
(3) Accuracy during training
    # Plot the accuracy history
    plt.grid()
    plt.xlabel("iters")
    plt.ylabel("%")
    plt.title("accuracy", fontsize=12)
    plt.plot(accuracy_history, "b+")
    plt.show()
3.5 Step 3-5: model validation
(1) Manual single batch verification
    # Manual inspection
    index = 0
    print("Get a batch sample")
    images, labels = next(iter(test_loader))
    print(images.shape)
    print(labels.shape)
    print(labels)

    print("\nPredict all samples in batch")
    outputs = model(images)
    print(outputs.data.shape)

    print("\nSelect the most likely classification for the prediction results of each sample in batch")
    _, predicted = torch.max(outputs, 1)
    print(predicted.data.shape)
    print(predicted)

    print("\nCompare all results in batch")
    bool_results = (predicted == labels)
    print(bool_results.shape)
    print(bool_results)

    print("\nCount and predict the number and accuracy of correct samples")
    corrects = bool_results.sum().item()
    accuracy = corrects / len(bool_results)
    print("corrects=", corrects)
    print("accuracy=", accuracy)

    print("\nSample index =", index)
    print("Tag value:", labels[index].item())
    print("Classification possibility:", outputs.data[index].numpy())
    print("Maximum likelihood:", predicted.data[index].item())
    print("Correctness:", bool_results.data[index].item())
Output:
    Get a batch sample
    torch.Size([64, 1, 28, 28])
    torch.Size([64])
    tensor([1, 2, 4, 3, 7, 7, 4, 0, 9, 1, 2, 0, 4, 3, 5, 2, 9, 3, 6, 3, 0, 1, 5, 5, 1, 5, 6, 8, 1, 9, 5, 0, 2, 3, 2, 4, 7, 4, 7, 9, 7, 5, 0, 2, 8, 8, 5, 9, 3, 6, 4, 9, 9, 3, 5, 1, 1, 2, 4, 0, 7, 5, 3, 7])

    Predict all samples in batch
    torch.Size([64, 10])

    Select the most likely classification for the prediction results of each sample in batch
    torch.Size([64])
    tensor([1, 2, 4, 3, 7, 4, 4, 0, 4, 1, 2, 0, 4, 3, 4, 2, 9, 3, 3, 3, 0, 1, 5, 5, 1, 5, 6, 8, 1, 9, 5, 0, 2, 3, 2, 4, 7, 4, 7, 9, 7, 5, 0, 2, 8, 8, 9, 5, 3, 6, 4, 9, 9, 3, 5, 1, 1, 0, 4, 0, 7, 5, 3, 7])

    Compare all results in batch
    torch.Size([64])
    tensor([ True, True, True, True, True, False, True, True, False, True, True, True, True, True, False, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True])

    Count and predict the number and accuracy of correct samples
    corrects= 57
    accuracy= 0.890625

    Sample index = 0
    Tag value: 1
    Classification possibility: [7.4082727e-06 9.9425703e-01 2.3936583e-03 5.4639770e-04 1.2618493e-05 5.9957332e-05 3.4420833e-04 1.0612858e-04 2.1460787e-03 1.2669782e-04]
    Maximum likelihood: 1
    Correctness: True
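The manual check above boils down to three operations: take the index of the largest output per sample, compare it with the label, and count the matches (57 of 64 gives 0.890625 in the log). Below is a minimal pure-Python sketch of the same logic on a tiny made-up batch. The function names and the numbers are illustrative only.

```python
def argmax(values):
    """Index of the largest value, the counterpart of torch.max(..., 1)[1] for one row."""
    return max(range(len(values)), key=values.__getitem__)


def batch_accuracy(outputs, labels):
    """Return (predicted classes, number correct, accuracy) for one batch."""
    predicted = [argmax(row) for row in outputs]
    corrects = sum(p == y for p, y in zip(predicted, labels))
    return predicted, corrects, corrects / len(labels)


# Four samples, three classes (illustrative values)
outputs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
    [0.4, 0.5, 0.1],
]
labels = [1, 0, 2, 0]

predicted, corrects, accuracy = batch_accuracy(outputs, labels)
print(predicted, corrects, accuracy)
```

Only the last sample is misclassified in this toy batch, so three of four predictions are correct.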
(2) Automatic verification on training set
    # Evaluate the model: measure its accuracy on the training set
    correct_dataset = 0
    total_dataset = 0
    accuracy_dataset = 0.0

    # No gradients are computed during evaluation
    with torch.no_grad():
        for i, data in enumerate(train_loader):
            # Get one batch of samples
            images, labels = data

            # Predict all samples in the batch
            outputs = model(images)

            # Select the most likely class for each sample in the batch
            _, predicted = torch.max(outputs.data, 1)

            # Accumulate the number of samples seen so far
            total_dataset += labels.size()[0]

            # Compare the predictions with the labels
            bool_results = (predicted == labels)

            # Accumulate the number of correctly predicted samples
            correct_dataset += bool_results.sum().item()

            # Running accuracy over all samples seen so far
            accuracy_dataset = 100 * correct_dataset / total_dataset

            if i % 100 == 0:
                print('batch {} In {} accuracy = {:.4f}'.format(i, len(train_data)/64, accuracy_dataset))

    print('Final result with the model on the dataset, accuracy =', accuracy_dataset)
Output:
    batch 0 In 937.5 accuracy = 90.6250
    batch 100 In 937.5 accuracy = 90.0371
    batch 200 In 937.5 accuracy = 89.8554
    batch 300 In 937.5 accuracy = 89.4985
    batch 400 In 937.5 accuracy = 89.2846
    batch 500 In 937.5 accuracy = 89.2340
    batch 600 In 937.5 accuracy = 89.1691
    batch 700 In 937.5 accuracy = 89.1049
    batch 800 In 937.5 accuracy = 89.2069
    batch 900 In 937.5 accuracy = 89.2602
    Final result with the model on the dataset, accuracy = 89.22
(3) Validation on test set
    # Evaluate the model: measure its accuracy on the test set
    correct_dataset = 0
    total_dataset = 0
    accuracy_dataset = 0.0

    # No gradients are computed during evaluation
    with torch.no_grad():
        for i, data in enumerate(test_loader):
            # Get one batch of samples
            images, labels = data

            # Predict all samples in the batch
            outputs = model(images)

            # Select the most likely class for each sample in the batch
            _, predicted = torch.max(outputs.data, 1)

            # Accumulate the number of samples seen so far
            total_dataset += labels.size()[0]

            # Compare the predictions with the labels
            bool_results = (predicted == labels)

            # Accumulate the number of correctly predicted samples
            correct_dataset += bool_results.sum().item()

            # Running accuracy over all samples seen so far
            accuracy_dataset = 100 * correct_dataset / total_dataset

            if i % 100 == 0:
                print('batch {} In {} accuracy = {:.4f}'.format(i, len(test_data)/64, accuracy_dataset))

    print('Final result with the model on the dataset, accuracy =', accuracy_dataset)
Output:
    batch 0 In 156.25 accuracy = 89.0625
    batch 100 In 156.25 accuracy = 90.5631
    Final result with the model on the dataset, accuracy = 89.93
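The evaluation loops accumulate the counts of correct and total samples across batches and divide at the end, rather than averaging the per-batch accuracies. The two differ when the last batch is smaller than the others. Below is a minimal pure-Python sketch with made-up counts. The function name and the numbers are illustrative only.

```python
def dataset_accuracy(batch_results):
    """batch_results holds one (correct, total) pair per batch; returns accuracy in percent."""
    correct = sum(c for c, _ in batch_results)
    total = sum(n for _, n in batch_results)
    return 100 * correct / total


# Two full batches of 64 and a final short batch of 16 (illustrative counts)
batches = [(60, 64), (58, 64), (8, 16)]

overall = dataset_accuracy(batches)
mean_of_batches = sum(100 * c / n for c, n in batches) / len(batches)
print(overall, mean_of_batches)
```

Here the accumulated accuracy is 126 / 144 = 87.5%, while the plain mean of the three batch accuracies is 78.125%, because the short batch is weighted as heavily as the full ones.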
Remarks:
As the results above show, a simple fully connected network reaches an accuracy of about 90%.
For higher accuracy, a convolutional neural network is needed.
Chapter 4 Model deployment
4.1 Step 4-1: model storage
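    import os

    # torch.save does not create missing directories
    os.makedirs("models", exist_ok=True)

    # Save the whole model
    torch.save(model, "models/boston_net.pkl")

    # Save only the parameters
    torch.save(model.state_dict(), "models/boston_params.pkl")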
4.2 Step 4-2: model loading
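Loading mirrors the storage step. Below is a minimal sketch under stated assumptions: only the state_dict route is shown, the TinyNet class and the temporary file are hypothetical stand-ins (in this article the model would be NetA and the file models/boston_params.pkl), and the model class has to be defined before its parameters can be loaded.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for NetA: one linear layer followed by softmax."""

    def __init__(self, n_feature, n_output):
        super().__init__()
        self.fc1 = nn.Linear(n_feature, n_output)

    def forward(self, x):
        x = x.view(x.size(0), -1)              # flatten (N, 1, 28, 28) to (N, 784)
        return torch.softmax(self.fc1(x), dim=1)


trained = TinyNet(28 * 28, 10)                 # plays the role of the trained model

with tempfile.TemporaryDirectory() as tmp_dir:
    params_path = os.path.join(tmp_dir, "params.pkl")

    # Save only the parameters, as in step 4-1
    torch.save(trained.state_dict(), params_path)

    # Rebuild the same architecture, then load the saved parameters into it
    restored = TinyNet(28 * 28, 10)
    restored.load_state_dict(torch.load(params_path))
    restored.eval()                            # switch to inference mode

x = torch.rand(4, 1, 28, 28)                   # a dummy batch of four pictures
with torch.no_grad():
    same_output = torch.equal(trained(x), restored(x))
print(same_output)
```

Loading a whole model saved with torch.save(model, ...) is also possible with torch.load, but it unpickles the class itself, so the class definition must be importable at load time; the parameter-only route shown here is usually the more portable choice.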