[Pytorch series - 29]: neural network foundation - a fully connected shallow neural network for 10-class handwritten digit recognition

Author's home page (Silicon-based Workshop of Slow Fire Rock Sugar): Slow Fire Rock Sugar (Wang Wenbing)'s CSDN blog

Link to this article: https://blog.csdn.net/HiWangWenBing/article/details/120607797

Contents

Introduction: deep learning model framework

Chapter 1 business area analysis

1.1 Step 1-1: business domain analysis

1.2 Step 1-2: Business Modeling

1.3 training model

1.4 validation model

1.5 overall structure

1.6 Code example prerequisites

Chapter 2 definition of forward operation model

2.1 Step 2-1: dataset selection

2.2 Step 2-2: Data Preprocessing

2.3 Step 2-3: neural network modeling

2.4 Step 2-4: neural network output

Chapter 3 definition of backward operation model

3.1 Step 3-1: define the loss function

3.2 Step 3-2: define the optimizer

3.3 Step 3-3: model training

3.4 Step 3-4: Model Visualization

3.5 Step 3-5: model validation

Chapter 4 model deployment

4.1 Step 4-1: model storage

4.2 Step 4-2: model loading

Introduction: deep learning model framework

For background, see the earlier articles in this series:

[Artificial intelligence - deep learning - 8]: neural network foundation - machine learning, deep learning models, model training

[Artificial intelligence - Overview - 4]: plain-language deep learning - core concepts of machine learning for readers with no background

[Artificial intelligence - deep learning - 7]: neural network foundation - artificial neural networks (ANN)

https://blog.csdn.net/HiWangWenBing/article/details/120462734

Chapter 1 business area analysis

1.1 Step 1-1: business domain analysis

(1) Business requirements

Objective:

Given an arbitrary handwritten digit image, determine which digit it represents.

(2) Business analysis

The essence of this task is multi-class classification, specifically a 10-class problem: given the feature data of an image (here, the single-channel pixel values of a single image), determine which digit class it belongs to.

For this case, a shallow fully connected network is sufficient; a deep neural network is not required.

1.2 Step 1-2: Business Modeling

(1) Single layer neural network

Input: 784 (= 28 * 28 pixels)

Output: 10 (digit classes)

Parameters: 784 * 10 + 10 = 7,850

(2) Two layer neural network

First layer (hidden layer): 784 * m + m parameters; 784 = 28 * 28

Second layer (output layer): m * 10 + 10 parameters

When m = 256, the parameter counts are:

First layer (hidden layer): 784 * 256 + 256 = 200,960

Second layer (output layer): 256 * 10 + 10 = 2,570

Total: 203,530, roughly 200,000 parameters

(3) Three layer neural network

  A three-layer neural network with two hidden layers is defined

L0 (input layer): X: 784 = 28 * 28

L1 (hidden layer 1): L1 = X * W1 + B1: 784 * 256 + 256 => 200,960

L2 (hidden layer 2): L2 = L1 * W2 + B2: 256 * 64 + 64 => 16,448

L3 (output layer): L3 = L2 * W3 + B3: 64 * 10 + 10 => 650

Total: 218,058, roughly 220,000 parameters
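These counts can be double-checked in code. Below is a minimal sketch using nn.Sequential stand-ins for the two-layer and three-layer layouts above (these stand-ins are illustrative only, not the NetA/NetB/NetC classes defined later):

import torch.nn as nn

def count_params(model):
    # Sum the number of elements in every weight and bias tensor
    return sum(p.numel() for p in model.parameters())

two_layer   = nn.Sequential(nn.Linear(784, 256), nn.Linear(256, 10))
three_layer = nn.Sequential(nn.Linear(784, 256), nn.Linear(256, 64), nn.Linear(64, 10))

print(count_params(two_layer))    # 203530 = 200960 + 2570
print(count_params(three_layer))  # 218058 = 200960 + 16448 + 650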

(4) Selection of activation function
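The article does not expand on this point here. As the models in Chapter 2 show, ReLU is used in the hidden layers and Softmax on the output layer; a minimal illustration of the two activations (with an arbitrary batch of scores) is:

import torch
import torch.nn as nn

z = torch.randn(4, 10)          # a batch of 4 raw score vectors over 10 classes
hidden = torch.relu(z)          # ReLU: keeps positive values, zeroes out negatives
probs = nn.Softmax(dim=1)(z)    # Softmax: turns each row of scores into probabilities
print(probs.sum(dim=1))         # every row sums to 1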

1.3 training model

1.4 validation model

1.5 overall structure

1.6 Code example prerequisites

#Environmental preparation
import numpy as np              # numpy array library
import math                     # Mathematical operation library
import matplotlib.pyplot as plt # Drawing library

import torch                        # torch base library
import torch.nn as nn               # torch neural network library
import torch.nn.functional as F     # torch neural network functional library
import torch.utils.data as data_utils       # DataLoader for batch reading
from torchvision import datasets as dataset # MNIST dataset
from torchvision import transforms          # ToTensor and other transforms
from torchvision import utils               # make_grid for image display

from sklearn.datasets import load_boston              # (not used in this article)
from sklearn.preprocessing import MinMaxScaler        # (not used in this article)
from sklearn.model_selection import train_test_split  # (not used in this article)

print("Hello World")
print(torch.__version__)
print(torch.cuda.is_available())
Hello World
1.8.0
False

Chapter 2 definition of forward operation model

2.1 Step 2-1: dataset selection

(1) MNIST dataset:   http://yann.lecun.com/exdb/

Note: you can download the sample data locally ahead of time to speed up program debugging; the final product can download the data remotely.

(2) Sample data and sample label format

(3) Source code example -- download and read data

#2-1 preparing data sets
train_data = dataset.MNIST(root = "mnist",
                           train = True,
                           transform = transforms.ToTensor(),
                           download = True)

#2-1 preparing data sets
test_data = dataset.MNIST(root = "mnist",
                           train = False,
                           transform = transforms.ToTensor(),
                           download = True)

print(train_data)
print("size=", len(train_data))
print("")
print(test_data)
print("size=", len(test_data))
Dataset MNIST
    Number of datapoints: 60000
    Root location: mnist
    Split: Train
    StandardTransform
Transform: ToTensor()
size= 60000

Dataset MNIST
    Number of datapoints: 10000
    Root location: mnist
    Split: Test
    StandardTransform
Transform: ToTensor()
size= 10000

2.2 Step 2-2: Data Preprocessing

(1) Display the original image without superimposed noise

#The original image does not superimpose noise
#Get a picture data
print("Original picture")
image, label = train_data[0]
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Single channel original image: numpy")
image = image.numpy().transpose(1,2,0) 
print("numpy image shape:", image.shape)
print("numpy image label:", label)

print("\n No superimposed noise, Original display")

plt.imshow(image)
plt.show()
Original picture
torch image shape: torch.Size([1, 28, 28])
torch image label: 5

Single channel original image: numpy
numpy image shape: (28, 28, 1)
numpy image label: 5

No noise is superimposed, and the original image is displayed

(2) Original image with superimposed noise

#Original image superimposed noise
#Get a picture data
print("Original picture")
image, label = train_data[0]
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Single channel original image: numpy")
image = image.numpy().transpose(1,2,0) 
print("numpy image shape:", image.shape)
print("numpy image label:", label)

print("\n Superimposed noise, Smooth display")
std = [0.5]
mean = [0.5]
image = image * std + mean

plt.imshow(image)
plt.show()
Original picture
torch image shape: torch.Size([1, 28, 28])
torch image label: 5

Single channel original image: numpy
numpy image shape: (28, 28, 1)
numpy image label: 5

Superimposed noise, smooth display

 

(3) Superimposed noise, grayscale display of the picture

#Superimposed noise, grayscale display picture
print("Original picture")
image, label = train_data[0]
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Three channel gray image: torch")
image = utils.make_grid(image)
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Three channel gray image: numpy")
image = image.numpy().transpose(1,2,0) 
print("numpy image shape:", image.shape)
print("numpy image label:", label)

print("\n Superimposed noise, Smooth display")
std = [0.5]
mean = [0.5]
image = image * std + mean

plt.imshow(image)
plt.show()
Original picture
torch image shape: torch.Size([1, 28, 28])
torch image label: 5

Three channel gray image: torch
torch image shape: torch.Size([3, 28, 28])
torch image label: 5

Three channel gray image: numpy
numpy image shape: (28, 28, 3)
numpy image label: 5

Superimposed noise, smooth display

(4) No noise superimposed, black and white display of the picture

#No noise superimposed, black and white display picture
print("Original picture")
image, label = train_data[0]
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Three channel gray image: torch")
image = utils.make_grid(image)
print("torch image shape:", image.shape)
print("torch image label:", label)

print("\n Three channel gray image: numpy")
image = image.numpy().transpose(1,2,0) 
print("numpy image shape:", image.shape)
print("numpy image label:", label)

print("\n No noise superimposed, black and white display")
plt.imshow(image)
plt.show()
print("numpy image shape:", image.shape)
Original picture
torch image shape: torch.Size([1, 28, 28])
torch image label: 5

Three channel gray image: torch
torch image shape: torch.Size([3, 28, 28])
torch image label: 5

Three channel gray image: numpy
numpy image shape: (28, 28, 3)
numpy image label: 5

No noise superimposed, black and white display

(5) Batch data reading

# Batch data reading
train_loader = data_utils.DataLoader(dataset = train_data,
                                  batch_size = 64,
                                  shuffle = True)

test_loader = data_utils.DataLoader(dataset = test_data,
                                  batch_size = 64,
                                  shuffle = True)

print(train_loader)
print(test_loader)
print(len(train_loader), len(train_data)/64)
print(len(test_loader),  len(test_data)/64)
<torch.utils.data.dataloader.DataLoader object at 0x000002461EF4A1C0>
<torch.utils.data.dataloader.DataLoader object at 0x000002461ED66610>
938 937.5
157 156.25
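Note that 60000 / 64 = 937.5, so the training loader holds 938 batches: DataLoader keeps the final, smaller batch because drop_last defaults to False. A quick check of the arithmetic:

import math
print(math.ceil(60000 / 64), 60000 % 64)   # 938 batches, the last one holds 32 samples
print(math.ceil(10000 / 64), 10000 % 64)   # 157 batches, the last one holds 16 samples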

(6) Display a batch of pictures

# Show a batch of pictures
print("Get a batch of pictures")
imgs, labels = next(iter(train_loader))
print(imgs.shape)
print(labels.shape)
print(labels.size()[0])

print("\n Merge into a three channel gray image")
images = utils.make_grid(imgs)
print(images.shape)
print(labels.shape)

print("\n convert to imshow format")
images = images.numpy().transpose(1,2,0) 
print(images.shape)
print(labels.shape)

print("\n Show sample label")
#Print picture labels
for i in range(64):
    print(labels[i], end=" ")
    # Line feed after every 8 labels
    if (i + 1) % 8 == 0:
        print()

print("\n display picture")
plt.imshow(images)
plt.show()
Get a batch of pictures
torch.Size([64, 1, 28, 28])
torch.Size([64])
64

Merge into a three channel gray image
torch.Size([3, 242, 242])
torch.Size([64])

Convert to imshow format
(242, 242, 3)
torch.Size([64])

Show sample label
tensor(0) tensor(8) tensor(3) tensor(7) tensor(5) tensor(7) tensor(9) tensor(7) 
tensor(1) tensor(1) tensor(1) tensor(8) tensor(8) tensor(6) tensor(0) tensor(1) 
tensor(4) tensor(8) tensor(1) tensor(3) tensor(3) tensor(6) tensor(4) tensor(4) 
tensor(0) tensor(5) tensor(8) tensor(5) tensor(9) tensor(3) tensor(7) tensor(5) 
tensor(2) tensor(1) tensor(0) tensor(6) tensor(8) tensor(8) tensor(9) tensor(6) 
tensor(1) tensor(3) tensor(5) tensor(3) tensor(4) tensor(4) tensor(3) tensor(1) 
tensor(4) tensor(1) tensor(4) tensor(4) tensor(9) tensor(8) tensor(7) tensor(2) 
tensor(3) tensor(1) tensor(2) tensor(0) tensor(8) tensor(1) tensor(1) tensor(4) 

display picture

2.3 Step 2-3: neural network modeling

(1) Single layer neural network + softmax

# 2-3 definition of network model: single layer neural network
class NetA(torch.nn.Module):
    # Defining neural networks
    def __init__(self, n_feature,n_output):
        super(NetA, self).__init__()
        self.fc1 = nn.Linear(n_feature, n_output)
        self.softmax = nn.Softmax(dim=1)
        
    #Define forward operation
    def forward(self, x):
        # The resulting data format torch.Size([64, 1, 28, 28]) needs to be converted to (64, 784)
        x = x.view(x.size()[0],-1) # -1 indicates automatic matching
        fc1 = self.fc1(x)
        out = self.softmax(fc1)
        return out

model_a = NetA(28*28, 10)
print(model_a)
print(model_a.parameters)
print(model_a.parameters())
NetA(
  (fc1): Linear(in_features=784, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)
<bound method Module.parameters of NetA(
  (fc1): Linear(in_features=784, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)>
<generator object Module.parameters at 0x000002461EDD8900>

(2) Two layer fully connected neural network

# 2-3 definition of network model: two-layer fully connected neural network
class NetB(torch.nn.Module):
    # Defining neural networks
    def __init__(self, n_feature, n_hidden, n_output):
        super(NetB, self).__init__()
        self.fc1 = nn.Linear(n_feature, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_output)
        self.softmax = nn.Softmax(dim=1)
        
    #Define forward operation
    def forward(self, x):
        # The resulting data format torch.Size([64, 1, 28, 28]) needs to be converted to (64, 784)
        x = x.view(x.size()[0],-1) # -1 indicates automatic matching
        fc1 = self.fc1(x)
        fc2 = self.fc2(fc1)
        out = self.softmax(fc2)
        return out

model_b = NetB(28*28, 32, 10)
print(model_b)
print(model_b.parameters)
print(model_b.parameters())
NetB(
  (fc1): Linear(in_features=784, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)
<bound method Module.parameters of NetB(
  (fc1): Linear(in_features=784, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)>
<generator object Module.parameters at 0x000002461EDD8190>

(3) Two layer fully connected neural network with relu

# 2-3 definition of network model: two-layer fully connected neural network with relu
class NetC(torch.nn.Module):
    # Defining neural networks
    def __init__(self, n_feature, n_hidden, n_output):
        super(NetC, self).__init__()
        self.fc1 = nn.Linear(n_feature, n_hidden)
        self.relu1 = torch.relu
        self.fc2 = nn.Linear(n_hidden, n_output)
        self.softmax = nn.Softmax(dim=1)
        
    #Define forward operation
    def forward(self, x):
        # The resulting data format torch.Size([64, 1, 28, 28]) needs to be converted to (64, 784)
        x = x.view(x.size()[0],-1) # -1 indicates automatic matching
        fc1 = self.fc1(x)
        a1 =  self.relu1(fc1)
        fc2 = self.fc2(a1)
        out = self.softmax(fc2)
        return out

model_c = NetC(28*28, 32, 10)
print(model_c)
print(model_c.parameters)
print(model_c.parameters())
NetC(
  (fc1): Linear(in_features=784, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)
<bound method Module.parameters of NetC(
  (fc1): Linear(in_features=784, out_features=32, bias=True)
  (fc2): Linear(in_features=32, out_features=10, bias=True)
  (softmax): Softmax(dim=1)
)>
<generator object Module.parameters at 0x000002461F0570B0>

(4) Convolutional neural network

# 2-3 definition of network model: convolutional neural network
class CNN(nn.Module):
    def __init__(self):
        super(CNN,self).__init__()
        self.conv1 = nn.Conv2d(1,32,kernel_size=3,stride=1,padding=1)
        self.pool = nn.MaxPool2d(2,2)
        self.conv2 = nn.Conv2d(32,64,kernel_size=3,stride=1,padding=1)
        self.fc1 = nn.Linear(64*7*7,1024)#Two pools, so it's 7 * 7 instead of 14 * 14
        self.fc2 = nn.Linear(1024,512)
        self.fc3 = nn.Linear(512,10)
        # self.dp = nn.Dropout(p=0.5)
    def forward(self,x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))

        x = x.view(-1, 64 * 7* 7)#Flatten the data into one-dimensional 
        x = F.relu(self.fc1(x))
        # x = self.fc3(x)
        # self.dp(x)
        x = F.relu(self.fc2(x))   
        x = self.fc3(x)  
        # x = F.log_softmax(x, dim=1)  # only needed when pairing with NLLLoss() instead of CrossEntropyLoss()
        return x

model_d = CNN()
print(model_d)
print(model_d.parameters)
print(model_d.parameters())
CNN(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fc1): Linear(in_features=3136, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=512, bias=True)
  (fc3): Linear(in_features=512, out_features=10, bias=True)
)
<bound method Module.parameters of CNN(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fc1): Linear(in_features=3136, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=512, bias=True)
  (fc3): Linear(in_features=512, out_features=10, bias=True)
)>
<generator object Module.parameters at 0x000002461F0570B0>

2.4 Step 2-4: neural network output

# 2-4 define network prediction output
#y_pred = model.forward(x_train)
#print(y_pred.shape)
Since the data arrives in batches from the DataLoader, no single prediction output is shown here.
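If you want to sanity-check the forward pass before training, a minimal sketch (reusing the train_loader and the model_a defined above) is:

# Pull one batch from the loader and run it through a model to check shapes
imgs, labels = next(iter(train_loader))   # imgs: torch.Size([64, 1, 28, 28])
y_pred = model_a(imgs)                    # NetA flattens internally to [64, 784]
print(y_pred.shape)                       # torch.Size([64, 10])
print(y_pred[0].sum())                    # ~1.0, since NetA ends with Softmax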

Chapter 3 definition of backward operation model

3.1 Step 3-1: define the loss function

Cross-entropy loss is used here: it is the standard loss for multi-class classification, and it takes a [batch, 10] prediction together with a [batch] vector of integer class labels, which is exactly what the MNIST loader provides.

# 3-1 define the loss function:
# loss_fn = cross-entropy loss
loss_fn = nn.CrossEntropyLoss()

print(loss_fn)
CrossEntropyLoss()
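A minimal sketch of the shapes this loss expects, using hypothetical random tensors rather than data from the loader: the prediction is [batch, 10] and the target is a [batch] vector of class indices. Note that nn.CrossEntropyLoss applies log-softmax internally, so the Softmax layer inside the models above is redundant; training still works, but the loss plateaus well above zero, which is consistent with the ~1.6 values in the training log below.

import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
y_pred = torch.randn(64, 10)           # hypothetical model output for a batch of 64
y_true = torch.randint(0, 10, (64,))   # hypothetical integer labels, one per sample
print(loss_fn(y_pred, y_true))         # a scalar around 2.3 (= ln 10) for random scores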

3.2 Step 3-2: define the optimizer

# 3-2 defining the optimizer

model = model_a

Learning_rate = 0.01     #Learning rate

# optimizer = SGD: basic gradient descent method
# Parameters: indicates the list of parameters to be optimized
# lr: indicates the learning rate
#optimizer = torch.optim.Adam(model.parameters(), lr = Learning_rate)
optimizer = torch.optim.SGD(model.parameters(), lr = Learning_rate,momentum=0.9)
print(optimizer)
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0.9
    nesterov: False
    weight_decay: 0
)

(1) Select the model to train; here the simplest one is chosen: the single-layer, multi-input, multi-output fully connected network (model_a).

(2) Set momentum: momentum=0.9
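For reference, a simplified sketch of what a single SGD-with-momentum step does (PyTorch's optimizer also supports dampening, weight decay and Nesterov momentum, which are omitted here):

def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    # The velocity is an exponentially decaying sum of past gradients,
    # and the parameter moves along the accumulated direction.
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity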

3.3 Step 3-3: model training

Because the training set is large, the data is read in batches: one batch of samples is read at a time, and the batch size is set when the DataLoader is created.

During training the loader yields batches in turn, and later batches keep adjusting the parameters learned from earlier ones, so several passes over the whole dataset are needed. In the code below this appears as the epochs variable and the two nested loops (over epochs and over batches).

# 3-3 model training
# Define the number of iterations
epochs = 3

loss_history = [] #loss data during training
accuracy_history =[] #Intermediate forecast results

accuracy_batch = 0.0

for i in range(0, epochs):
    for j, (x_train, y_train) in enumerate(train_loader):
            
        #(0) reset the gradient of the optimizer
        optimizer.zero_grad()    
        
        #(1) Forward calculation
        y_pred = model(x_train)
    
        #(2) Calculate loss
        loss = loss_fn(y_pred, y_train)
    
        #(3) Reverse derivation
        loss.backward()
    
        #(4) Reverse iteration
        optimizer.step()
    
        # Record the loss value during training
        loss_history.append(loss.item())  #loss for a batch
        
        # Record the accuracy during training
        number_batch = y_train.size()[0] # Number of pictures
        _, predicted = torch.max(y_pred.data, dim = 1)
        correct_batch = (predicted == y_train).sum().item() # Predict the correct number
        accuracy_batch = 100 * correct_batch/number_batch
        accuracy_history.append(accuracy_batch)
    
        if(j % 100 == 0):
            print('epoch {} batch {} In {} loss = {:.4f} accuracy = {:.4f}%%'.format(i, j , len(train_data)/64, loss.item(), accuracy_batch)) 

print("\n Iteration completion")
print("final loss =", loss.item())
print("final accu =", accuracy_batch)
epoch 0 batch 0 In 937.5 loss = 2.3058 accuracy = 6.2500%%
epoch 0 batch 100 In 937.5 loss = 2.0725 accuracy = 57.8125%%
epoch 0 batch 200 In 937.5 loss = 1.9046 accuracy = 68.7500%%
epoch 0 batch 300 In 937.5 loss = 1.7814 accuracy = 75.0000%%
epoch 0 batch 400 In 937.5 loss = 1.7647 accuracy = 78.1250%%
epoch 0 batch 500 In 937.5 loss = 1.7280 accuracy = 84.3750%%
epoch 0 batch 600 In 937.5 loss = 1.7284 accuracy = 79.6875%%
epoch 0 batch 700 In 937.5 loss = 1.7081 accuracy = 82.8125%%
epoch 0 batch 800 In 937.5 loss = 1.6773 accuracy = 85.9375%%
epoch 0 batch 900 In 937.5 loss = 1.6886 accuracy = 85.9375%%
epoch 1 batch 0 In 937.5 loss = 1.6671 accuracy = 82.8125%%
epoch 1 batch 100 In 937.5 loss = 1.6914 accuracy = 81.2500%%
epoch 1 batch 200 In 937.5 loss = 1.7119 accuracy = 78.1250%%
epoch 1 batch 300 In 937.5 loss = 1.6585 accuracy = 87.5000%%
epoch 1 batch 400 In 937.5 loss = 1.6913 accuracy = 81.2500%%
epoch 1 batch 500 In 937.5 loss = 1.6074 accuracy = 90.6250%%
epoch 1 batch 600 In 937.5 loss = 1.6062 accuracy = 90.6250%%
epoch 1 batch 700 In 937.5 loss = 1.6187 accuracy = 90.6250%%
epoch 1 batch 800 In 937.5 loss = 1.6249 accuracy = 90.6250%%
epoch 1 batch 900 In 937.5 loss = 1.6138 accuracy = 89.0625%%
epoch 2 batch 0 In 937.5 loss = 1.6205 accuracy = 90.6250%%
epoch 2 batch 100 In 937.5 loss = 1.5862 accuracy = 95.3125%%
epoch 2 batch 200 In 937.5 loss = 1.6430 accuracy = 84.3750%%
epoch 2 batch 300 In 937.5 loss = 1.5834 accuracy = 90.6250%%
epoch 2 batch 400 In 937.5 loss = 1.5672 accuracy = 95.3125%%
epoch 2 batch 500 In 937.5 loss = 1.5965 accuracy = 92.1875%%
epoch 2 batch 600 In 937.5 loss = 1.6430 accuracy = 87.5000%%
epoch 2 batch 700 In 937.5 loss = 1.5538 accuracy = 98.4375%%
epoch 2 batch 800 In 937.5 loss = 1.5700 accuracy = 92.1875%%
epoch 2 batch 900 In 937.5 loss = 1.6196 accuracy = 89.0625%%

Iteration completion
final loss = 1.6274147033691406
final accu = 87.5

3.4 Step 3-4: Model Visualization

(1) Forward operation data

(2) Backward loss iterative process

#Display historical data of loss
plt.grid()
plt.xlabel("iters")
plt.ylabel("")
plt.title("loss", fontsize = 12)
plt.plot(loss_history, "r")
plt.show()

(3) Accuracy of online training

#Display historical data of accuracy
plt.grid()
plt.xlabel("iters")
plt.ylabel("%")
plt.title("accuracy", fontsize = 12)
plt.plot(accuracy_history, "b+")
plt.show()

3.5 Step 3-5: model validation

(1) Manual single batch verification

# Manual inspection
index = 0
print("Get a batch sample")
images, labels = next(iter(test_loader))
print(images.shape)
print(labels.shape)
print(labels)


print("\n yes batch Forecast all samples in")
outputs = model(images)
print(outputs.data.shape)

print("\n yes batch Select the most likely classification according to the prediction results of each sample")
_, predicted = torch.max(outputs, 1)
print(predicted.data.shape)
print(predicted)


print("\n yes batch All results in are compared")
bool_results = (predicted == labels)
print(bool_results.shape)
print(bool_results)

print("\n Count and predict the number and accuracy of correct samples")
corrects = bool_results.sum().item()
accuracy = corrects/(len(bool_results))
print("corrects=", corrects)
print("accuracy=", accuracy)

print("\n sample index =", index)
print("Tag value    : ", labels[index]. item())
print("Classification possibility:", outputs.data[index].numpy())
print("Maximum likelihood:",predicted.data[index].item())
print("Correctness    : ",bool_results.data[index].item())
Get a batch sample
torch.Size([64, 1, 28, 28])
torch.Size([64])
tensor([1, 2, 4, 3, 7, 7, 4, 0, 9, 1, 2, 0, 4, 3, 5, 2, 9, 3, 6, 3, 0, 1, 5, 5,
        1, 5, 6, 8, 1, 9, 5, 0, 2, 3, 2, 4, 7, 4, 7, 9, 7, 5, 0, 2, 8, 8, 5, 9,
        3, 6, 4, 9, 9, 3, 5, 1, 1, 2, 4, 0, 7, 5, 3, 7])

Predict all samples in batch
torch.Size([64, 10])

Select the most likely classification for the prediction results of each sample in batch
torch.Size([64])
tensor([1, 2, 4, 3, 7, 4, 4, 0, 4, 1, 2, 0, 4, 3, 4, 2, 9, 3, 3, 3, 0, 1, 5, 5,
        1, 5, 6, 8, 1, 9, 5, 0, 2, 3, 2, 4, 7, 4, 7, 9, 7, 5, 0, 2, 8, 8, 9, 5,
        3, 6, 4, 9, 9, 3, 5, 1, 1, 0, 4, 0, 7, 5, 3, 7])

Compare all results in batch
torch.Size([64])
tensor([ True,  True,  True,  True,  True, False,  True,  True, False,  True,
         True,  True,  True,  True, False,  True,  True,  True, False,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True,  True,  True,  True, False, False,  True,  True,
         True,  True,  True,  True,  True,  True,  True, False,  True,  True,
         True,  True,  True,  True])

Count the correctly predicted samples and the accuracy
corrects= 57
accuracy= 0.890625

Sample index = 0
 Tag value: 1
 Classification possibility: [7.4082727e-06 9.9425703e-01 2.3936583e-03 5.4639770e-04 1.2618493e-05
 5.9957332e-05 3.4420833e-04 1.0612858e-04 2.1460787e-03 1.2669782e-04]
Maximum likelihood: 1
 Correctness: True

(2) Automatic verification on training set

# Evaluate the model and test its accuracy on the training set
correct_dataset  = 0
total_dataset    = 0
accuracy_dataset = 0.0

# The network does not update the gradient during evaluation
with torch.no_grad():
    for i, data in enumerate(train_loader):
        #Get a batch of samples
        images, labels = data
        
        #Predict all samples in batch
        outputs = model(images)
        
        #Select the most likely classification for the prediction results of each sample in batch
        _, predicted = torch.max(outputs.data, 1)
        
        #Accumulate the number of samples in batch
        total_dataset += labels.size()[0] 
        
        #Compare all results in the batch
        bool_results = (predicted == labels)
        
        #Accumulate the number of correctly predicted samples
        correct_dataset += bool_results.sum().item()
        
        #Compute the running accuracy over the samples seen so far
        accuracy_dataset = 100 * correct_dataset/total_dataset
        
        if(i % 100 == 0):
            print('batch {} In {} accuracy = {:.4f}'.format(i, len(train_data)/64, accuracy_dataset))
            
print('Final result with the model on the dataset, accuracy =', accuracy_dataset)
batch 0 In 937.5 accuracy = 90.6250
batch 100 In 937.5 accuracy = 90.0371
batch 200 In 937.5 accuracy = 89.8554
batch 300 In 937.5 accuracy = 89.4985
batch 400 In 937.5 accuracy = 89.2846
batch 500 In 937.5 accuracy = 89.2340
batch 600 In 937.5 accuracy = 89.1691
batch 700 In 937.5 accuracy = 89.1049
batch 800 In 937.5 accuracy = 89.2069
batch 900 In 937.5 accuracy = 89.2602
Final result with the model on the dataset, accuracy = 89.22

(3) Validation on test set

# Evaluate the model and test its accuracy on the test set
correct_dataset  = 0
total_dataset    = 0
accuracy_dataset = 0.0

# The network does not update the gradient during evaluation
with torch.no_grad():
    for i, data in enumerate(test_loader):
        #Get a batch of samples
        images, labels = data
        
        #Predict all samples in batch
        outputs = model(images)
        
        #Select the most likely classification for the prediction results of each sample in batch
        _, predicted = torch.max(outputs.data, 1)
        
        #Accumulate the number of samples in batch
        total_dataset += labels.size()[0] 
        
        #Compare all results in the batch
        bool_results = (predicted == labels)
        
        #Accumulate the number of correctly predicted samples
        correct_dataset += bool_results.sum().item()
        
        #Compute the running accuracy over the samples seen so far
        accuracy_dataset = 100 * correct_dataset/total_dataset
        
        if(i % 100 == 0):
            print('batch {} In {} accuracy = {:.4f}'.format(i, len(test_data)/64, accuracy_dataset))
            
print('Final result with the model on the dataset, accuracy =', accuracy_dataset)
batch 0 In 156.25 accuracy = 89.0625
batch 100 In 156.25 accuracy = 90.5631
Final result with the model on the dataset, accuracy = 89.93

Remarks:

As the results above show, a simple fully connected network reaches an accuracy of about 90%.

If higher accuracy is needed, a convolutional neural network should be used; a sketch of the required changes follows.
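Training the convolutional model defined in section 2.3 (model_d) reuses the same training loop from step 3-3; a minimal sketch of the only changes (model_d outputs raw logits with no Softmax layer, which is exactly what nn.CrossEntropyLoss expects):

# Swap in the CNN and a fresh optimizer, then rerun the epoch/batch loop of step 3-3
model = model_d                                   # the CNN from section 2.3 (4)
loss_fn = nn.CrossEntropyLoss()                   # works directly on the CNN's raw logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)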

Chapter 4 model deployment

4.1 Step 4-1: model storage

#Storage model
torch.save(model, "models/boston_net.pkl")

#Storage parameters
torch.save(model.state_dict() , "models/boston_params.pkl")

4.2 Step 4-2: model loading
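The original article leaves this step empty; a minimal loading sketch that mirrors the two save calls in step 4-1 (assuming the same file paths) would be:

# Load the complete model object saved with torch.save(model, ...)
# (this requires the NetA class definition to be importable)
model_full = torch.load("models/boston_net.pkl")
model_full.eval()

# Or rebuild the network and load only its parameters (state_dict)
model_params = NetA(28*28, 10)
model_params.load_state_dict(torch.load("models/boston_params.pkl"))
model_params.eval()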
