Two days ago I learned the basics of linear models, backpropagation, and gradient descent, and implemented them in plain Python.
Today I will walk through the complete implementation in PyTorch.
The whole process is divided into four parts:
1. Prepare the dataset.
2. Design the model: the model computes the predicted value of y.
3. Construct the loss function and optimizer.
4. Write the training loop.
The example model trained in this section is the linear model:
ŷ = w * x + b
The hat over y marks it as the predicted value of y. It is written y (pre) in the rest of this article.
x = [[1.0], [2.0], [3.0]]
Using the mini-batch style, the results for all three samples are computed in one pass.
Write the formula in matrix form:
[ŷ1, ŷ2, ŷ3]^T = w * [x1, x2, x3]^T + b
w and b are expanded to 3 × 1 matrices by the broadcasting mechanism.
If broadcasting is unclear, see my other article on the topic.
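A minimal sketch of the broadcasting described above, assuming scalar tensors for w and b. The values 2.0 and 0.5 are chosen only for illustration and are not from the original example.

```python
import torch

x = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1): three samples, one feature each
w = torch.tensor(2.0)                    # scalar weight (illustrative value)
b = torch.tensor(0.5)                    # scalar bias (illustrative value)

# w and b are broadcast to shape (3, 1), so all three samples are computed at once
y_pre = w * x + b

print(y_pre.shape)  # torch.Size([3, 1])
print(y_pre)        # values 2.5, 4.5, 6.5
```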
Then calculate the loss:
loss = (ŷ - y)^2
Write it in matrix form:
[loss1, loss2, loss3]^T = ([ŷ1, ŷ2, ŷ3]^T - [y1, y2, y3]^T)^2
Since y (pre) and loss are both 3 × 1 matrices, the given sample data y must also be a 3 × 1 matrix. (The square here squares each element of the matrix.)
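As a small sketch of the element-wise loss, assuming some made-up predictions (the y_pre values here are illustrative only):

```python
import torch

y_pre = torch.tensor([[1.5], [3.0], [6.0]])  # illustrative predictions, shape (3, 1)
y = torch.tensor([[2.0], [4.0], [6.0]])      # targets, shape (3, 1)

# The subtraction and the square are both applied element by element,
# so the loss keeps the same (3, 1) shape as its inputs
loss = (y_pre - y) ** 2

print(loss.shape)  # torch.Size([3, 1])
print(loss)        # per-sample losses 0.25, 1.0, 0.0
```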
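So when we initialize the data, x has to be written as a 3 × 1 matrix rather than a flat list. Compare:
x = [1.0, 2.0, 3.0]
x = [[1.0], [2.0], [3.0]]
The former is one-dimensional (3 elements) and the latter is two-dimensional (3 × 1). The latter is the shape we need.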
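The difference between the two forms is easy to check by printing the tensor shapes. This is only a sketch; the variable names x_1d and x_2d are made up for the illustration.

```python
import torch

x_1d = torch.tensor([1.0, 2.0, 3.0])        # one-dimensional: 3 elements
x_2d = torch.tensor([[1.0], [2.0], [3.0]])  # two-dimensional: 3 rows, 1 column

print(x_1d.shape)  # torch.Size([3])
print(x_2d.shape)  # torch.Size([3, 1])

# A flat tensor can be turned into a column with view(-1, 1)
x_col = x_1d.view(-1, 1)
print(torch.equal(x_col, x_2d))  # True
```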
z = w * x + b is also called a linear unit.
In PyTorch, you no longer need to work out gradients and partial derivatives by hand. The key point is to construct the computational graph.
Once we know the dimensions of the input sample x and of y (pre), we can work out the tensor dimensions of w and b.
Basic process: feed in the samples x, run them through the computational graph to get y (pre), calculate the loss, and then call backward.
The loss obtained at this point is a tensor with one entry per sample, but backward needs to be called on a scalar.
So the loss tensor is summed (or averaged) into a scalar first.
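A minimal sketch of this flow with hand-made w and b tensors, assuming starting values of w = 1 and b = 0 (chosen only so the gradients are easy to check by hand):

```python
import torch

x = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[2.0], [4.0], [6.0]])

# Illustrative starting values; requires_grad=True makes autograd track them
w = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)

loss = (w * x + b - y) ** 2  # per-sample loss, shape (3, 1)
loss.sum().backward()        # reduce to a scalar, then backpropagate

print(w.grad)  # tensor(-28.)
print(b.grad)  # tensor(-12.)
```

By hand: the residuals are -1, -2, -3, so the gradient for w is 2 * (-1*1 - 2*2 - 3*3) = -28 and the gradient for b is 2 * (-1 - 2 - 3) = -12.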
Model design in PyTorch:
The model class must inherit from torch.nn.Module. At least two methods must be implemented in the class, __init__ and forward, and their names cannot be changed. In particular, forward must be named forward!
If you need to implement other pieces yourself, such as a hand-written backward, you can inherit from a different class instead (for example torch.autograd.Function).
Call the parent class constructor with super(LinearModel, self).__init__(). The first argument to super is the name of the current class.
Construct a Linear object, which holds the weight w and bias b as tensors.
This is done by calling the torch.nn.Linear class provided by PyTorch.
torch.nn.Linear(in_features, out_features, bias=True): the size of each input sample, the size of each output sample, and whether a bias term is learned (defaults to True).
The input and the output must contain the same number of samples (rows). Only the number of features per sample may differ.
For example, in a matrix with the rows [1, 2], [2, 3], [2, 2], [1, 1], each row is one sample, and the number of columns is the number of features per sample, that is, in_features. So the meaning of this matrix is: four input samples, each represented by two features.
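To make the shapes concrete, here is a small sketch that feeds that four-sample, two-feature matrix through a Linear layer. The choice of out_features=1 is an assumption for the illustration.

```python
import torch

# Four samples (rows), two features per sample (columns)
x = torch.tensor([[1.0, 2.0],
                  [2.0, 3.0],
                  [2.0, 2.0],
                  [1.0, 1.0]])

linear = torch.nn.Linear(in_features=2, out_features=1)  # out_features=1 is assumed here
y_pre = linear(x)

print(x.shape)              # torch.Size([4, 2])
print(linear.weight.shape)  # torch.Size([1, 2])
print(linear.bias.shape)    # torch.Size([1])
print(y_pre.shape)          # torch.Size([4, 1]): same number of rows as the input
```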
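When an instance of a class is called like a function, Python invokes its __call__ method. torch.nn.Module defines __call__ so that it runs forward.
So calling the linear object actually performs the operation y = w * x + b and returns y (pre).
The following small example shows how __call__ works:
    class Fol:
        def __init__(self):
            pass

        def __call__(self, *args, **kwargs):
            print("call Called")

    fol = Fol()
    fol()  # calling the instance runs __call__ and prints "call Called"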
The model is then used directly:
After creating a model object, we can call model(x). The input x is passed straight to the forward method of the LinearModel class.
So y (pre) = model(x)
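A short sketch showing that calling the model object gives the same result as calling forward directly. The variable name same is made up for the check.

```python
import torch

class LinearModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

model = LinearModel()
x = torch.tensor([[1.0], [2.0], [3.0]])

y_pre = model(x)  # Module.__call__ dispatches to forward
same = torch.equal(y_pre, model.forward(x))

print(y_pre.shape)  # torch.Size([3, 1])
print(same)         # True
```

In normal use, prefer model(x) over model.forward(x), since the call form is what PyTorch expects.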
Construct the loss function and optimizer:
To construct the MSE loss, y and y (pre) are required. The MSE formula gives a vector (tensor) of per-sample losses, and that tensor is then summed to get a scalar.
In PyTorch, you can use torch.nn.MSELoss, which takes y (pre) and y as arguments. Calling it completes the calculation steps above.
size_average controls whether the summed loss is averaged over the elements. In recent PyTorch versions this argument is deprecated in favor of reduction ('sum' or 'mean'), which is what the full code below uses.
For the optimizer, params receives model.parameters(), which collects all the weights of the linear layer and registers them as the parameters to be trained. lr is the learning rate.
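A minimal sketch of both pieces, assuming made-up predictions so the loss values can be checked by hand:

```python
import torch

y_pre = torch.tensor([[1.5], [3.0], [6.0]])  # illustrative predictions
y = torch.tensor([[2.0], [4.0], [6.0]])

# Squared errors are 0.25, 1.0 and 0.0
loss_sum = torch.nn.MSELoss(reduction='sum')(y_pre, y)    # 1.25
loss_mean = torch.nn.MSELoss(reduction='mean')(y_pre, y)  # 1.25 / 3

linear = torch.nn.Linear(1, 1)
params = list(linear.parameters())  # the weight and the bias
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)

print(loss_sum.item(), loss_mean.item())
print(len(params))  # 2
```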
Finally, loop over the data for training:
The ultimate goal is to get the weight w and bias b we want.
1. Calculate the predicted value:
2. Calculate the loss:
3. Clear the gradients:
4. Backpropagate and update the weights:
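The four steps can be sketched as a single training step. This sketch uses a bare torch.nn.Linear in place of the LinearModel class and fixes the random seed, both only for the sake of a short, repeatable illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[2.0], [4.0], [6.0]])

model = torch.nn.Linear(1, 1)  # stands in for LinearModel in this sketch
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

y_pre = model(x_data)                   # 1. predicted value
loss_before = criterion(y_pre, y_data)  # 2. loss
optimizer.zero_grad()                   # 3. clear old gradients
loss_before.backward()                  # 4. backpropagate ...
optimizer.step()                        #    ... and update the weights

loss_after = criterion(model(x_data), y_data)
print(loss_before.item(), loss_after.item())  # the loss should go down after the step
```

Clearing the gradients before backward matters because PyTorch accumulates gradients across calls by default.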
Finally, print the weight and bias:
Both are stored as tensors during computation, so use .item() to print them as plain Python numbers.
After training, we have the weight w and bias b we wanted.
At this point, you can feed in test data and check the prediction made with w and b.
The complete code:
    import torch
    import matplotlib.pyplot as plt


    class LinearModel(torch.nn.Module):
        def __init__(self):
            super(LinearModel, self).__init__()
            self.linear = torch.nn.Linear(1, 1)  # one input feature, one output feature

        def forward(self, x):
            y_pre = self.linear(x)
            return y_pre


    model = LinearModel()
    criterion = torch.nn.MSELoss(reduction='sum')  # loss function
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # optimizer
    w_ll = []     # weight after each step
    loss_ll = []  # loss at each step

    if __name__ == '__main__':
        x_data = torch.Tensor([[1.0], [2.0], [3.0]])
        y_data = torch.Tensor([[2.0], [4.0], [6.0]])
        for i in range(1000):
            y_pre = model(x_data)            # compute the predicted y
            loss = criterion(y_pre, y_data)  # compute the loss
            loss_ll.append(loss.item())
            optimizer.zero_grad()            # clear the gradients
            loss.backward()                  # backpropagation
            optimizer.step()                 # update the weights
            w_ll.append(model.linear.weight.item())
        print(w_ll)
        print(loss_ll)

        # Plot the loss against the weight
        # plt.rcParams['font.sans-serif'] = ['KaiTi']  # only needed for Chinese labels
        plt.plot(w_ll, loss_ll)
        plt.xlabel("weight w")
        plt.ylabel("loss")
        plt.show()

        # Test data
        print("Current w:", model.linear.weight.item())
        print("Current b:", model.linear.bias.item())
        x_test = torch.Tensor([4.0])
        y_test = model(x_test)
        print(y_test.item())
The results:
The weight w is very close to 2.
The bias b is very close to 0.
For the test input x = 4, the predicted y is very close to 8.