Convolutional Neural Network - MNIST in Practice (PyTorch-based)


Preface

With the continuous development of deep learning, neural networks are becoming more and more popular. As we all know, CNNs are very effective for image classification. This article builds a simple convolutional neural network model and tests it on the MNIST dataset. The second half of the article discusses the difficulties encountered in this exercise and how they were solved.

1. Preparations

First we need to import the packages we need and set the necessary parameters.

The code is as follows:

import time
import numpy as np
import torch
from torchvision import transforms
from torchvision.datasets import mnist
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.optim as optim

# Define the batch size, i.e. the number of samples processed in one training step
train_batch_size = 128
test_batch_size = 128
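
Optionally, you can also fix the random seeds here so that runs are reproducible. This is not part of the original article, just a small sketch:

# Fixing the seeds makes weight initialization and data shuffling repeatable across runs
torch.manual_seed(0)
np.random.seed(0)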

2. Importing datasets

1. Download datasets

The code is as follows:

# Define image data transform operations
# ToTensor(): converts a [0, 255] H x W image to a [C, H, W] float tensor in [0, 1]
# Normalize(mean, std): standardizes each channel, here mapping [0, 1] to [-1, 1]
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize([0.5], [0.5])])

# Download the MNIST dataset; if it is already in ./data, the download step is skipped
data_train = mnist.MNIST('./data', train=True, transform=transform,
                         target_transform=None, download=True)
data_test = mnist.MNIST('./data', train=False, transform=transform,
                        target_transform=None, download=True)
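
To make the effect of the two transforms concrete, here is a small check (not in the original article): ToTensor() turns a 28 x 28 image with values in [0, 255] into a [1, 28, 28] float tensor in [0, 1], and Normalize([0.5], [0.5]) then maps that range to [-1, 1] via (x - 0.5) / 0.5.

# Inspect one transformed sample; the dataset applies the transform automatically
sample_img, sample_label = data_train[0]
print(sample_img.shape)                    # torch.Size([1, 28, 28])
print(sample_img.min(), sample_img.max())  # roughly tensor(-1.) and tensor(1.)
print(sample_label)                        # the integer class label of this image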

2. Loading datasets

The code is as follows:

# Load the data; batch_size sets the number of samples per batch and shuffle=True randomizes the order each epoch
train_loader = DataLoader(data_train, batch_size=train_batch_size, shuffle=True)
test_loader = DataLoader(data_test, batch_size=test_batch_size, shuffle=True)
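
As a quick sanity check (not in the original article), the length of a DataLoader is the number of batches it yields per epoch, which follows directly from the batch size:

# 60,000 training images / 128 rounds up to 469 batches; 10,000 test images / 128 rounds up to 79
print(len(train_loader))   # 469
print(len(test_loader))    # 79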

3. Data Visualization

Note that this step is not strictly necessary, but it lets you check that the dataset was imported successfully and gives a visual sense of its contents.

The code is as follows:

# Visualizing data
examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
plt.figure(figsize=(9, 9))
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.title("Ground Truth:{}".format(example_targets[i]))
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.xticks([])
    plt.yticks([])
plt.show()

Running results show:

As you can see, nine grayscale images of handwritten digits are displayed. Because the images are grayscale, this model takes single-channel input.

4. Modeling

1. CNN Framework Construction

You can see from the code below that this is a 2 + 2 CNN model, with two convolutional layers and two fully connected layers.

The code is as follows:

# Set up CNN network
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()

        # Convolution layer
        self.conv1 = nn.Sequential(
            # [b, 1, 28, 28] -> [b, 16, 28, 28] -> [b, 16, 14, 14]
            nn.Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        self.conv2 = nn.Sequential(
            # [b, 16, 14, 14] -> [b, 32, 14, 14] -> [b, 32, 7, 7]
            nn.Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        # Fully connected layers
        self.dense = nn.Sequential(
            # Linear classifier: [b, 7 * 7 * 32] -> [b, 128] -> [b, 10]
            nn.Linear(7 * 7 * 32, 128),
            nn.ReLU(),
            nn.Dropout(p=0.5),  # Alleviate overfitting and regularize to some extent
            nn.Linear(128, 10),
        )

    # Forward calculation
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # Flatten the feature maps so the fully connected layers can accept them
        return self.dense(x)
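
To see where the 7 * 7 * 32 in the first Linear layer comes from, the following sanity check (not part of the original article) traces a dummy input through the two convolution blocks: each 3 x 3 convolution with padding=1 keeps the spatial size, and each 2 x 2 max pool halves it, so 28 -> 14 -> 7 with 32 output channels.

# Shape check with a fake single-channel 28 x 28 image
_check = CNN()
dummy = torch.randn(1, 1, 28, 28)
features = _check.conv2(_check.conv1(dummy))
print(features.shape)    # torch.Size([1, 32, 7, 7]), i.e. 7 * 7 * 32 features after flattening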

2. Model instantiation

The code is as follows:

model = CNN()   # Instantiate Model
print(model)    # Print model

Results display:

5. Training Model

The code is as follows:

# Set the number of training epochs
num_epochs = 10
# Define the loss function
criterion = nn.CrossEntropyLoss()
# Define the learning rate and optimizer
LR = 0.01
optimizer = optim.Adam(model.parameters(), lr=LR)

# Lists for recording the per-epoch loss and accuracy
# Training
train_losses = []
train_acces = []
# Test
eval_losses = []
eval_acces = []

print("start training...")
# Record training start time
start_time = time.time()

# Training model
for epoch in range(num_epochs):

    # Training Set:
    train_loss = 0
    train_acc = 0

    # Set model to training mode
    model.train()

    for img, label in train_loader:
        out = model(img)    # Forward pass: one score (logit) per class
        loss = criterion(out, label)   # Loss against the ground-truth labels

        optimizer.zero_grad()   # Zero the parameter gradients
        loss.backward()    # Backpropagate the error
        optimizer.step()    # Update the parameters

        train_loss += loss.item()    # Accumulate the loss as a Python float

        _, pred = out.max(1)    # Index of the largest score is the predicted digit
        num_correct = (pred == label).sum().item()  # Number of correct predictions in this batch
        acc = num_correct / img.shape[0]
        train_acc += acc

    # Store the per-epoch averages
    train_losses.append(train_loss / len(train_loader))
    train_acces.append(train_acc / len(train_loader))

    # Test Set:
    eval_loss = 0
    eval_acc = 0

    # Set the model to evaluation mode
    model.eval()

    # Evaluation only: no gradient computation or parameter updates on the test set
    with torch.no_grad():
        for img, label in test_loader:
            out = model(img)
            loss = criterion(out, label)

            eval_loss += loss.item()

            _, pred = out.max(1)
            num_correct = (pred == label).sum().item()  # Number of correct predictions in this batch
            acc = num_correct / img.shape[0]
            eval_acc += acc

    eval_losses.append(eval_loss / len(test_loader))
    eval_acces.append(eval_acc / len(test_loader))

    # Print this epoch's results
    print('epoch:{},Train Loss:{:.4f},Train Acc:{:.4f},'
          'Test Loss:{:.4f},Test Acc:{:.4f}'
          .format(epoch, train_loss / len(train_loader),
                  train_acc / len(train_loader),
                  eval_loss / len(test_loader),
                  eval_acc / len(test_loader)))
    # Print the elapsed time so far
    stop_time = time.time()
    print("time is:{:.4f}s".format(stop_time-start_time))
print("end training.")

Results display:

After 10 epochs, the model's results are impressive: the test accuracy reaches about 99% at its highest.
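
If you want to keep the trained parameters for later use, a minimal sketch is shown below (not from the original article; the file name ./cnn_mnist.pth is just an example):

# Save only the learned parameters (state_dict) and restore them into a fresh CNN instance
torch.save(model.state_dict(), './cnn_mnist.pth')
restored = CNN()
restored.load_state_dict(torch.load('./cnn_mnist.pth'))
restored.eval()   # switch to evaluation mode before running inference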

6. Visualization of Losses

Note that this step can also be omitted, but visualizing the losses makes the training behavior easier to see.

The code is as follows:

# Loss Visualization
plt.title("train_loss")
plt.plot(np.arange(len(train_losses)), train_losses)
plt.show()

plt.title("eval_loss")
plt.plot(np.arange(len(eval_losses)), eval_losses)
plt.show()

Effect display:

As the iterations proceed, the training loss decreases steadily, while the test loss shows a bump partway through before dropping again to a lower value.
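
The accuracy lists collected during training (train_acces and eval_acces) can be plotted in the same way; a minimal sketch, not part of the original article:

# Accuracy curves, plotted with the same pattern as the losses above
plt.title("train_acc")
plt.plot(np.arange(len(train_acces)), train_acces)
plt.show()

plt.title("eval_acc")
plt.plot(np.arange(len(eval_acces)), eval_acces)
plt.show()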

 

Summary


End

That's all for today's demonstration. There are still many shortcomings in this article; corrections are welcome!
