Summary of problems encountered while learning the PyTorch framework


PyTorch framework learning

1: Linear Regression, Logistic Regression, Softmax Classifier

1. Model inheritance and construction

import torch
from torch.autograd import Variable  # Variable is kept from the original lesson; plain tensors work the same way in modern PyTorch

# data definition (3 x 1)
x_data = Variable(torch.Tensor([[1.0], [2.0], [3.0]]))
y_data = Variable(torch.Tensor([[2.0], [4.0], [6.0]]))


# model class
class Model(torch.nn.Module):
    def __init__(self):
        """
        In the constructor we instantiate one nn.Linear module.
        """
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(1, 1)  # one in and one out

    def forward(self, x):
        """
        In the forward function we accept a Variable of input data and
        return a Variable of output data. We can use modules defined in
        the constructor as well as arbitrary operators on Variables.
        """
        y_pred = self.linear(x)
        return y_pred


# our model
model = Model()

# construct our loss function and an optimizer.
# The call to model.parameters() in the SGD constructor will contain the learnable
# parameters of the nn.Linear module which is a member of the model.
criterion = torch.nn.MSELoss(reduction='sum')  # sum the squared errors instead of averaging
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# training loop
for epoch in range(500):
    # forward pass: compute predicted y by passing x to the model
    y_pred = model(x_data)

    # compute and print loss
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())

    # zero gradients, perform a backward pass, and update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# after training -- test
hour_val = Variable(torch.Tensor([[4.0]]))
print("predict (after training)", 4, model(hour_val).data[0][0])
2: Model building with nn.Module

1. Model construction: nn.Module


1.1 nn.Module

• parameters: stores and manages nn.Parameter objects
• modules: stores and manages nn.Module objects
• buffers: stores and manages buffer attributes, such as running_mean in a BN layer
• ***_hooks: store and manage hook functions

1.2 nn.Module summary

• a module can contain multiple sub-modules
• a module is equivalent to an operation and must implement the forward() function
• each module maintains 8 ordered dictionaries to manage its attributes (a small inspection sketch follows below)
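A minimal sketch, assuming a toy nn.Sequential model (not from the original post), that peeks at a few of these dictionaries:

import torch.nn as nn

# a toy model: the container itself holds no parameters, its sub-modules do
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

print(net._modules.keys())        # sub-modules registered on the container: '0', '1', '2'
print(net._parameters.keys())     # empty: the weights live in the sub-modules
print(dict(net.named_parameters()).keys())   # '0.weight', '0.bias', '2.weight', '2.bias'
print(net._buffers, net._forward_hooks)      # buffer and hook dictionaries (empty here)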

2. Model Containers

2.1 nn.Sequential

nn.Sequential is a container of nn.Module that wraps a set of network layers in order (see the sketch below).
• ordering: the network layers are constructed and executed in strict order
• built-in forward(): the built-in forward() runs forward propagation through the layers successively with a for loop
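A minimal sketch of a sequentially wrapped block; the layer sizes are illustrative, not from the original post:

import torch
import torch.nn as nn

# layers are executed strictly in the order they are registered
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 10),
)

x = torch.randn(2, 3, 32, 32)
print(block(x).shape)  # torch.Size([2, 10]) -- forward() loops over the layers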

2.2 nn.ModuleList

nn.ModuleList is a container of nn.Module that wraps a set of network layers so they can be called iteratively (see the sketch below).
Main methods:
• append(): append a network layer to the end of the ModuleList
• extend(): concatenate two ModuleLists
• insert(): insert a network layer at a specified position in the ModuleList
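A minimal sketch of repeated layers built with a for loop and called iteratively; the depth and width are illustrative:

import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, num_layers=5, width=16):
        super().__init__()
        # repeated construction of identical layers via a comprehension
        self.layers = nn.ModuleList([nn.Linear(width, width) for _ in range(num_layers)])

    def forward(self, x):
        # the layers are called iteratively
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

net = MLP()
print(net(torch.randn(2, 16)).shape)  # torch.Size([2, 16])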

2.3 nn.ModuleDict

nn.ModuleDict is a container of nn.Module that wraps a set of network layers so they can be called by key (see the sketch below).
Main methods:
• clear(): clear the ModuleDict
• items(): return iterable key-value pairs
• keys(): return the keys of the dictionary
• values(): return the values of the dictionary
• pop(): return a key-value pair and delete it from the dictionary
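A minimal sketch where the layer to run is selected by key at call time; the keys and layers are illustrative:

import torch
import torch.nn as nn

class SelectableNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.convs = nn.ModuleDict({'conv': nn.Conv2d(3, 8, 3), 'pool': nn.MaxPool2d(2)})
        self.activations = nn.ModuleDict({'relu': nn.ReLU(), 'prelu': nn.PReLU()})

    def forward(self, x, conv_key, act_key):
        # layers are looked up by key at call time
        x = self.convs[conv_key](x)
        return self.activations[act_key](x)

net = SelectableNet()
out = net(torch.randn(1, 3, 32, 32), 'conv', 'relu')
print(out.shape)  # torch.Size([1, 8, 30, 30])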

2.4 Container summary

Summary of containers
• nn.Sequential: ordered. Each network layer is executed in strict order; commonly used to build blocks
• nn.ModuleList: iterative. Commonly used to build large numbers of repeated layers, with the repetition implemented through a for loop
• nn.ModuleDict: indexable. Commonly used for selectable (optional) network layers

3: Weight initialization, loss function, optimizer

1. Weight initialization

1.1 Vanishing and exploding gradients

1.2 Xavier initialization

Variance consistency: keep the data scale in an appropriate range, usually with variance 1. Activation functions: saturating functions such as Sigmoid and Tanh.

# inside a weight-initialization loop over the linear modules m of the network; the activation function is Tanh
a = np.sqrt(6 / (self.neural_num + self.neural_num))

tanh_gain = nn.init.calculate_gain('tanh')
a *= tanh_gain

nn.init.uniform_(m.weight.data, -a, a)

##########################################

nn.init.xavier_uniform_(m.weight.data, gain=tanh_gain)
# the Xavier initialization provided by PyTorch gives the same result as the manual calculation above
1.3 Kaiming initialization

Variance consistency: keep the data scale in an appropriate range, usually with variance 1. Activation functions: ReLU and its variants.

# inside the same weight-initialization loop, for ReLU activations
# nn.init.normal_(m.weight.data, std=np.sqrt(2 / self.neural_num))
nn.init.kaiming_normal_(m.weight.data)
# the Kaiming initialization provided by PyTorch is equivalent to the commented-out manual statement above
1.4 nn.init.calculate_gain

nn.init.calculate_gain(nonlinearity, param=None)
Main function: compute the gain (variance change scale) of an activation function
Main parameters:
• nonlinearity: name of the activation function
• param: parameter of the activation function, such as the negative_slope of Leaky ReLU

x = torch.randn(10000)
out = torch.tanh(x)

gain = x.std() / out.std()
print('gain:{}'.format(gain))

tanh_gain = nn.init.calculate_gain('tanh')
print('tanh_gain in PyTorch:', tanh_gain)
2. Loss function

2.1 nn.CrossEntropyLoss


Function: combines nn.LogSoftmax() and nn.NLLLoss() to compute the cross entropy
Main parameters:
• weight: set the loss weight of each category
• ignore_index: ignore a category
• reduction: calculation mode, which can be none/sum/mean
  - none: compute the loss element by element
  - sum: sum all elements and return a scalar
  - mean: weighted average of all elements, returns a scalar

import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

# fake data
inputs = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

# ----------------------------------- CrossEntropy loss: reduction -----------------------------------
flag = 0
# flag = 1
if flag:
    # def loss function
    loss_f_none = nn.CrossEntropyLoss(weight=None, reduction='none')
    loss_f_sum = nn.CrossEntropyLoss(weight=None, reduction='sum')
    loss_f_mean = nn.CrossEntropyLoss(weight=None, reduction='mean')  # the default

    # forward
    loss_none = loss_f_none(inputs, target)
    loss_sum = loss_f_sum(inputs, target)
    loss_mean = loss_f_mean(inputs, target)

    # view
    print("Cross Entropy Loss:\n ", loss_none, loss_sum, loss_mean)

# --------------------------------- compute by hand
flag = 0
# flag = 1
if flag:
    idx = 0

    input_1 = inputs.detach().numpy()[idx]  # [1, 2]
    target_1 = target.numpy()[idx]          # 0

    # First item
    x_class = input_1[target_1]

    # Item 2
    sigma_exp_x = np.sum(list(map(np.exp, input_1)))
    log_sigma_exp_x = np.log(sigma_exp_x)

    # Output loss
    loss_1 = -x_class + log_sigma_exp_x
    print("First sample loss by: ", loss_1)

# ----------------------------------- weight -----------------------------------
flag = 0
# flag = 1
if flag:
    # def loss function
    weights = torch.tensor([1, 2], dtype=torch.float)
    # weights = torch.tensor([0.7, 0.3], dtype=torch.float)

    loss_f_none_w = nn.CrossEntropyLoss(weight=weights, reduction='none')
    loss_f_sum = nn.CrossEntropyLoss(weight=weights, reduction='sum')
    loss_f_mean = nn.CrossEntropyLoss(weight=weights, reduction='mean')

    # forward
    loss_none_w = loss_f_none_w(inputs, target)
    loss_sum = loss_f_sum(inputs, target)
    loss_mean = loss_f_mean(inputs, target)

    # view
    print("\nweights: ", weights)
    print(loss_none_w, loss_sum, loss_mean)
D:\Anaconda3\envs\pytorch\python.exe D:/PythonProject/Eye of depth pytorch/04-02-code-loss function (one)/lesson-15/loss_function_1.py
Cross Entropy Loss:
  tensor([1.3133, 0.1269, 0.1269]) tensor(1.5671) tensor(0.5224)

weights:  tensor([1., 2.])
tensor([1.3133, 0.2539, 0.2539]) tensor(1.8210) tensor(0.3642)

Process finished with exit code 0
3. Optimizer

3.1 Concept

PyTorch optimizer: manages and updates the values of the learnable parameters in the model so that the model output gets closer to the true labels.

class Optimizer(object):
    def __init__(self, params, defaults):
        self.defaults = defaults
        self.state = defaultdict(dict)
        self.param_groups = []
        ......
        param_groups = [{'params': param_groups}]

Basic properties
• defaults: the optimizer's hyperparameters (learning rate, etc.)
• state: cache of parameter state, such as momentum buffers
• param_groups: the managed parameter groups
• _step_count: records the number of update steps; used in learning rate scheduling

Basic methods
• zero_grad(): clear the gradients of the managed parameters (note: in PyTorch, tensor gradients are not cleared automatically)
• step(): perform a one-step update
• add_param_group(): add a parameter group
• state_dict(): get the optimizer's current state information dictionary
• load_state_dict(): load a state information dictionary

# -*- coding: utf-8 -*-
import os
import torch
import torch.optim as optim
from tools.common_tools import set_seed  # local helper from the course code that sets the random seed

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

set_seed(1)  # set random seed

weight = torch.randn((2, 2), requires_grad=True)
weight.grad = torch.ones((2, 2))

optimizer = optim.SGD([weight], lr=0.1)

# ----------------------------------- step -----------------------------------
flag = 0
# flag = 1
if flag:
    print("weight before step:{}".format(weight.data))
    optimizer.step()  # change lr between 1 and 0.1 and observe the update
    print("weight after step:{}".format(weight.data))

# ----------------------------------- zero_grad -----------------------------------
flag = 0
# flag = 1
if flag:
    print("weight before step:{}".format(weight.data))
    optimizer.step()  # change lr between 1 and 0.1 and observe the update
    print("weight after step:{}".format(weight.data))

    print("weight in optimizer:{}\nweight in weight:{}\n".format(
        id(optimizer.param_groups[0]['params'][0]), id(weight)))

    print("weight.grad is {}\n".format(weight.grad))
    optimizer.zero_grad()
    print("after optimizer.zero_grad(), weight.grad is\n{}".format(weight.grad))

# ----------------------------------- add_param_group -----------------------------------
# flag = 0
flag = 1
if flag:
    print("optimizer.param_groups is\n{}".format(optimizer.param_groups))

    w2 = torch.randn((3, 3), requires_grad=True)
    optimizer.add_param_group({"params": w2, 'lr': 0.0001})

    print("optimizer.param_groups is\n{}".format(optimizer.param_groups))

# ----------------------------------- state_dict -----------------------------------
flag = 0
# flag = 1
if flag:
    optimizer = optim.SGD([weight], lr=0.1, momentum=0.9)
    opt_state_dict = optimizer.state_dict()

    print("state_dict before step:\n", opt_state_dict)

    for i in range(10):
        optimizer.step()

    print("state_dict after step:\n", optimizer.state_dict())

    torch.save(optimizer.state_dict(), os.path.join(BASE_DIR, "optimizer_state_dict.pkl"))

# ----------------------------------- load state_dict -----------------------------------
flag = 0
# flag = 1
if flag:
    optimizer = optim.SGD([weight], lr=0.1, momentum=0.9)
    state_dict = torch.load(os.path.join(BASE_DIR, "optimizer_state_dict.pkl"))

    print("state_dict before load state:\n", optimizer.state_dict())
    optimizer.load_state_dict(state_dict)
    print("state_dict after load state:\n", optimizer.state_dict())
TensorBoard usage

If the event files live in a runs subfolder of the current directory: tensorboard --logdir=./runs

If the event files are directly in the current directory: tensorboard --logdir=./
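A minimal sketch of writing event files that the commands above can then visualize; the ./runs/demo log directory and the scalar tag are illustrative:

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='./runs/demo')   # event files are written under ./runs

for step in range(100):
    # tag, scalar value, global step
    writer.add_scalar('train/loss', 1.0 / (step + 1), step)

writer.close()
# then launch: tensorboard --logdir=./runs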

