PyTorch tutorial -- building neural networks

Abstract

Neural networks consist of layers/modules that perform operations on data. The torch.nn namespace provides all the building blocks you need to build your own neural network. Every module in PyTorch is a subclass of nn.Module. A neural network is itself a module that is composed of other modules (layers). This nested structure makes it easy to build and manage complex architectures.
In the following section, we will build a neural network to classify the images in the FashionMNIST dataset.

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

Get a device for training
We want to be able to train our model on a hardware accelerator such as a GPU, if one is available. Let's check whether torch.cuda is available; otherwise we continue to use the CPU.

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Using {} device'.format(device))

Output:

Using cuda device

Define the class

We define our neural network by subclassing nn.Module and initialize the network's layers in __init__. Every nn.Module subclass implements the operations on input data in the forward method.

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

We create an instance of NeuralNetwork, move it to the device, and print its structure.

model = NeuralNetwork().to(device)
print(model)
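
As a quick check that the .to(device) call did what we expect (an extra step, not in the original post), you can inspect the device of any of the model's parameters:

# All parameters were moved together with the model.
print(next(model.parameters()).device)  # e.g. cuda:0 if a GPU was found, otherwise cpu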


To use the model, we pass it the input data. This executes the model's forward, along with some background operations. Do not call model.forward() directly!
Calling the model on the input returns a 10-dimensional tensor with the raw predicted values for each class. We get the prediction probabilities by passing it through an instance of the nn.Softmax module.

X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")

Output:

Model layers

Let's break down the layers in the FashionMNIST model. To illustrate this, we will take a minibatch of three images of size 28x28 and see what happens when we pass it through the network.

input_image = torch.rand(3,28,28)
print(input_image.size())

nn.Flatten

We initialize the nn.Flatten layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (the minibatch dimension at dim=0 is maintained).

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())

nn.Linear

The linear layer is a module that applies a linear transformation to the input using its stored weights and biases.

layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
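
To make the "stored weights and biases" concrete, here is a small addition (not in the original post) that inspects the parameters this layer holds; their shapes follow from the in_features and out_features arguments above:

# nn.Linear stores its weight as (out_features, in_features) and its bias as (out_features,).
print(layer1.weight.size())  # torch.Size([20, 784])
print(layer1.bias.size())    # torch.Size([20])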

nn.ReLU

Non-linear activations are what create the complex mappings between the model's inputs and outputs. They are applied after linear transformations to introduce nonlinearity, helping neural networks learn a wide variety of phenomena.
In this model we use nn.ReLU between the linear layers, but there are other activation functions that introduce nonlinearity into a model; a few are shown after the next code block.

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")

nn.Sequential

nn.Sequential is an ordered container of modules. Data is passed through all the modules in the same order as defined. You can use a sequential container to put together a quick network like seq_modules below.

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)
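
As a small sanity check (added here, not in the original post), the output of seq_modules has one row of 10 raw scores per image in the minibatch:

# (3, 28, 28) -> flatten -> (3, 784) -> Linear/ReLU -> (3, 20) -> Linear -> (3, 10)
print(logits.size())  # torch.Size([3, 10])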

nn.Softmax

The last linear layer of the network returns logits - raw values in [-infty, infty] - which are passed to the nn.Softmax module. The logits are scaled to values in [0, 1] representing the model's predicted probabilities for each class. The dim parameter indicates the dimension along which the values must sum to 1.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
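
A quick check (an addition, not in the original post) that dim=1 is indeed the dimension softmax normalizes over: each row of pred_probab should sum to 1, up to floating-point error.

# Each of the 3 rows (one per image) sums to ~1.
print(pred_probab.sum(dim=1))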

Model parameters

Many layers inside a neural network are parameterized, i.e. they have associated weights and biases that are optimized during training. Subclassing nn.Module automatically tracks all the fields defined inside your model object, and makes all parameters accessible via the model's parameters() or named_parameters() methods.
In this example, we iterate over each parameter and print its size and a preview of its values.

print("Model structure: ", model, "\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")
Model structure:  NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
) 


Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values : tensor([[-0.0302, -0.0320,  0.0341,  ..., -0.0228, -0.0337, -0.0105],
        [-0.0206,  0.0327,  0.0078,  ...,  0.0270,  0.0267,  0.0206]],
       device='cuda:0', grad_fn=<SliceBackward>) 

Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values : tensor([-0.0042,  0.0142], device='cuda:0', grad_fn=<SliceBackward>) 

Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values : tensor([[ 3.0393e-05,  1.5742e-02,  1.5932e-02,  ...,  2.2430e-02,
          5.0596e-04,  2.0169e-02],
        [ 3.1222e-02, -5.3052e-03, -8.3699e-03,  ..., -5.5455e-03,
         -2.8178e-03,  7.5235e-03]], device='cuda:0', grad_fn=<SliceBackward>) 

Layer: linear_relu_stack.2.bias | Size: torch.Size([512]) | Values : tensor([0.0293, 0.0213], device='cuda:0', grad_fn=<SliceBackward>) 

Layer: linear_relu_stack.4.weight | Size: torch.Size([10, 512]) | Values : tensor([[ 0.0160,  0.0135, -0.0226,  ...,  0.0341, -0.0118, -0.0081],
        [-0.0136, -0.0039, -0.0421,  ..., -0.0386,  0.0155, -0.0322]],
       device='cuda:0', grad_fn=<SliceBackward>) 

Layer: linear_relu_stack.4.bias | Size: torch.Size([10]) | Values : tensor([-0.0071,  0.0143], device='cuda:0', grad_fn=<SliceBackward>) 
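
As a final sanity check (an addition, not part of the original post), the parameters() iterator also makes it easy to count the total number of trainable values in the model:

# 784*512+512 + 512*512+512 + 512*10+10 = 669706 for this architecture.
total_params = sum(p.numel() for p in model.parameters())
print(total_params)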
