Building a Neural Network with PyTorch

About torch.nn

To build neural networks with PyTorch, the main tools are in the torch.nn package

nn relies on autograd to define models and compute their gradients automatically
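
As a minimal sketch of what autograd does (not part of the original example, just a single tracked tensor):

import torch

# A tensor with requires_grad=True is tracked by autograd
x = torch.ones(2, 2, requires_grad=True)
y = (x * 3).sum()   # build a scalar value from x
y.backward()        # autograd computes dy/dx
print(x.grad)       # tensor filled with 3s, shape (2, 2)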

The typical workflow for building a neural network:

Define a neural network with learnable parameters

Iterate over the training dataset

Pass the input data through the network

Compute the loss

Backpropagate gradients to the network parameters

Update the network weights according to an update rule

First, define a neural network implemented in PyTorch

# Import the required packages
import torch
import torch.nn as nn
import torch.nn.functional as F


# Define a simple network class
class Net(nn.Module):
    # The initialization function
    def __init__(self):
        super(Net, self).__init__()
        # First convolutional layer: input channels = 1, output channels = 6, kernel size 3x3
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolutional layer: input channels = 6, output channels = 16, kernel size 3x3
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Define three fully connected layers
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Apply max pooling with a (2, 2) window after the ReLU activation
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    # Compute the number of features for flattening
    def num_flat_features(self, x):
        # All dimensions except the batch dimension (dimension 0)
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)

Output results

Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
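
Where does in_features=576 come from? A 32x32 input becomes 30x30 after the first 3x3 convolution, 15x15 after pooling, 13x13 after the second convolution, and 6x6 after the second pooling, so the flattened size is 16 * 6 * 6 = 576. A small sketch that traces these shapes with the net defined above:

with torch.no_grad():
    t = torch.randn(1, 1, 32, 32)
    t = F.max_pool2d(F.relu(net.conv1(t)), 2)
    print(t.shape)   # torch.Size([1, 6, 15, 15])
    t = F.max_pool2d(F.relu(net.conv2(t)), 2)
    print(t.shape)   # torch.Size([1, 16, 6, 6]) -> 16 * 6 * 6 = 576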

Attention

        · All trainable parameters in the model can be obtained by net.parameters()

params = list(net.parameters())  # Wrap the parameters in a list
print(len(params))
print(params[0].size())

Output results

10
torch.Size([6, 1, 3, 3])
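
The 10 parameters are one weight tensor and one bias tensor for each of the five layers; params[0] is the weight of conv1. To see which layer each parameter belongs to, net.named_parameters() can be used, for example:

for name, param in net.named_parameters():
    print(name, param.size())
# conv1.weight torch.Size([6, 1, 3, 3])
# conv1.bias torch.Size([6])
# ... and so on for conv2, fc1, fc2, fc3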

Assume the input image size is 32x32:

input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

Output results

tensor([[ 0.1065,  0.0852,  0.0484,  0.0806, -0.0398, -0.0307, -0.1036, -0.0510,
         -0.1005,  0.0150]], grad_fn=<AddmmBackward>)

With the output tensor, you can zero the gradients and perform backpropagation. Because out is not a scalar, backward() must be given a gradient tensor of the same shape as out (here a random one).

net.zero_grad()
out.backward(torch.randn(1, 10))

Attention

A neural network built with torch.nn only accepts mini-batches as input; it does not accept a single sample.

For example, nn.Conv2d requires a 4D tensor with shape (nSamples, nChannels, Height, Width). If your input is a single sample, call input.unsqueeze(0) to extend the 3D tensor to a 4D tensor.
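
For instance, a single grayscale image stored as a 3D tensor can be promoted to a mini-batch of one sample like this:

single = torch.randn(1, 32, 32)   # (nChannels, Height, Width): a single sample
batch = single.unsqueeze(0)       # (1, 1, 32, 32): a mini-batch containing one sample
out = net(batch)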

loss function

A loss function takes a pair (output, target) and computes a value that measures how far the output is from the target.

There are several different loss functions available in torch.nn, such as nn.MSELoss, which measures the difference between the input and the target by computing the mean squared error.

An example of using nn.MSELoss to compute the loss:

output = net(input)
target = torch.randn(10)
# Change the shape of the target to a two-dimensional tensor to match output
target = target.view(1, -1)
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)

Output results

tensor(0.8401, grad_fn=<MseLossBackward>)

About the backpropagation chain: if we trace loss backwards through the graph and print the grad_fn attribute at each step, we see the complete chain of computation as follows:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

When loss.backward() is called, the whole computation graph is differentiated with respect to loss; all tensors with requires_grad=True take part in the gradient computation, and their gradients are accumulated into their .grad attribute.

print(loss.grad_fn)# MSELoss
print(loss.grad_fn.next_functions[0][0])# Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])# ReLU

Output results

<MseLossBackward object at 0x000001BDFF5C7E80>
<AddmmBackward object at 0x000001BDFFB42710>
<AccumulateGrad object at 0x000001BDFFB42710>

backpropagation

Backpropagation in PyTorch is very easy; the whole operation is loss.backward().

Before performing backpropagation, the gradients must be cleared, otherwise gradients from different batches will accumulate.

A small backpropagation example:

# Zero the gradients in PyTorch
net.zero_grad()
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
# Perform backpropagation in PyTorch
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

Output results

conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([-0.0007,  0.0024,  0.0136,  0.0216,  0.0032,  0.0132])

Update network parameters

The simplest algorithm for updating parameters is SGD (stochastic gradient descent)

The update rule is: weight = weight - learning_rate * gradient

First, SGD implemented with plain Python code:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
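
An equivalent way to write the same update in a more current style is to wrap the loop in torch.no_grad(), so that the manual update itself is not tracked by autograd (a small variant, not from the original text):

with torch.no_grad():
    for f in net.parameters():
        f -= f.grad * learning_rate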

Then, the standard approach recommended by PyTorch, using torch.optim:

# First import the optimizer package, optim contains several commonly used optimization algorithms, such as SGD, Adam, etc.
import torch.optim as optim

# Create the optimizer object through optim
optimizer = optim.SGD(net.parameters(), lr=0.01)
# Zero the gradients via the optimizer
optimizer.zero_grad()
output = net(input)
loss = criterion(output, target)
# Backpropagate the loss
loss.backward()
# Update the parameters with a single standard call
optimizer.step()
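
Putting these steps together, one pass of a typical training loop looks like the sketch below; training_data is an assumed iterable of (input, target) pairs and is not defined in this tutorial:

for epoch in range(10):
    for input, target in training_data:
        optimizer.zero_grad()              # clear the gradients from the previous batch
        output = net(input)                # forward pass
        loss = criterion(output, target)   # compute the loss
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the parameters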

summary

A typical process for building a neural network

        · Define a neural network with learnable parameters

        · Iterate over the dataset

        · Pass the input data through the network

        · Compute the loss

        · Backpropagate gradients to the network parameters

        · Update the network weights according to an update rule

Definition of loss function

        · Compute the mean squared error using torch.nn.MSELoss()

        · When backpropagation is performed with loss.backward(), the whole computation graph is differentiated with respect to loss; all tensors with requires_grad=True take part in the gradient computation, and their gradients are accumulated into their .grad attribute.

How backpropagation is performed

        · Backpropagation in PyTorch is very easy; the whole operation is loss.backward().

        · Before performing backpropagation, the gradients must be cleared, otherwise gradients from different batches will accumulate

                ·net.zero_grad()

                ·loss.backward()

Update methods for parameters

        · Define an optimizer to perform parameter optimization and updates

                ·optimizer = optim.SGD(net.parameters(), lr=0.01)

        · Perform specific parameter updates through the optimizer.

                ·optimizer.step()
