[Flower Book Notes | Torch Edition] Hands-on Deep Learning, Chapter 2: Preparatory Knowledge

2021.11.20 Starting a new series!!!
This article covers Chapter 2, Preparatory Knowledge: 2.1 Data Operations, 2.2 Data Preprocessing, 2.3 Linear Algebra

2 Preparatory knowledge

2.1 Data Operations

2.1.1 Getting Started

  • Creating tensor data with torch
import torch
x = torch.arange(12)
x
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
  • Using reshape() to change the shape
x.reshape(3,4)
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
x
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
  • We find that the operation is temporary (it does not change x in place); to keep the new shape, assign the result to a name (which can be the same name), as in the sketch just below
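  • A quick sketch (added for illustration; X2 is just a hypothetical name):
X2 = x.reshape(3, 4)  # assign the reshaped result to a name so the new shape persists
X2.shape
torch.Size([3, 4])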

  • torch works with tensors; a tensor can be understood as a generalization of vectors to more axes (described later)

  • View the shape of a tensor with .shape

x.shape
torch.Size([12])
  • Get the number of elements with .numel()
x.numel()
12
  • Create tensors filled with 0s or 1s
torch.zeros((2, 3, 4))
tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])
torch.ones((2, 3, 4))
tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])
  • randn(): random initialization
  • Each of these elements is randomly sampled from a standard Gaussian (normal) distribution with a mean of 0 and a standard deviation of 1.
torch.randn(3, 4)
tensor([[ 2.5246,  0.2407, -1.5535, -0.1901],
        [-0.1046, -0.9095, -0.0242,  2.6962],
        [ 1.3211,  2.3679, -0.0362,  0.2089]])
  • Create Tensor Directly
torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
tensor([[2, 1, 4, 3],
        [1, 2, 3, 4],
        [4, 3, 2, 1]])

2.1.2 Operations

  • Generally speaking, arithmetic operations are performed elementwise
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y  # ** Operator is exponentiation
(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))
  • Base-e exponentiation with torch.exp()
torch.exp(x)
tensor([2.7183e+00, 7.3891e+00, 5.4598e+01, 2.9810e+03])
  • Concatenate two tensors with torch.cat(); both arguments must be tensors
  • dim=0 concatenates along the first axis (stacking rows)
  • dim=1 concatenates along the second axis (stacking columns)
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [ 2.,  1.,  4.,  3.],
         [ 1.,  2.,  3.,  4.],
         [ 4.,  3.,  2.,  1.]]),
 tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
         [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
         [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]]))
  • We can also perform boolean comparisons, which are applied between corresponding elements
X == Y
tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])

2.1.3 Broadcast Mechanism

  • When two tensors have different shapes, we can still perform elementwise operations by invoking the broadcasting mechanism
a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b
(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))
a + b
tensor([[0, 1],
        [1, 2],
        [2, 3]])
  • Broadcasting expands one or both arrays by copying elements along the appropriate axes so that, after the expansion, the two tensors have the same shape; see the added sketch below
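  • As an added sketch (not part of the original notes), the copying can be made explicit with expand(); the result matches the broadcast sum above:
a.expand(3, 2), b.expand(3, 2), a.expand(3, 2) + b.expand(3, 2)
(tensor([[0, 0],
         [1, 1],
         [2, 2]]),
 tensor([[0, 1],
         [0, 1],
         [0, 1]]),
 tensor([[0, 1],
         [1, 2],
         [2, 3]]))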

2.1.4 Index and Slice

X
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

This is similar to standard Python indexing, so there is not much to explain. Here is an example:

  • If only one index (or slice) is given, it applies to the first axis (rows) by default
X[-1], X[1:3]
(tensor([ 8.,  9., 10., 11.]),
 tensor([[ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]))
  • With two indices, the row index comes before the column index
X[1, 2] = 9
X
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  9.,  7.],
        [ 8.,  9., 10., 11.]])
X[0:2, :]
tensor([[0., 1., 2., 3.],
        [4., 5., 9., 7.]])
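  • We can also assign to a whole slice at once; here is an added sketch that works on a copy (Z is a hypothetical name) so X itself stays unchanged:
Z = X.clone()
Z[0:2, :] = 12  # write 12 into every element of the first two rows
Z
tensor([[12., 12., 12., 12.],
        [12., 12., 12., 12.],
        [ 8.,  9., 10., 11.]])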

2.1.5 Format Conversion

  • numpy and tensor
A = X.numpy()
B = torch.tensor(A)
type(A), type(B)
(numpy.ndarray, torch.Tensor)
  • Tensors are converted to NumPy array form with .numpy()

  • Arrays are converted to tensor format by torch.tensor()

  • **Special case:** a tensor of size 1 (i.e., a single number) can be converted to a Python scalar with the item() function or with Python's built-in functions (float(), int())

a = torch.tensor([3.5])
a, a.item(), float(a), int(a)
(tensor([3.5000]), 3.5, 3.5, 3)

2.2 Data Preprocessing

2.2.1 Read Datasets

  • os.makedirs() creates a folder, ensuring that the directory ../data exists
  • exist_ok=True means the directory is created only if it does not exist, and no exception is thrown if it already exists.
import os

os.makedirs(os.path.join('..', 'data'), exist_ok=True)
  • Create a file called house_tiny.csv in that directory; its path is stored in data_file
  • Open this file and write to it with f.write()
data_file = os.path.join('..', 'data', 'house_tiny.csv')
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column Name
    f.write('NA,Pave,127500\n')  # Each row represents a data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
  • Use read_csv from the pandas library to read the file
import pandas as pd

data = pd.read_csv(data_file)
data
   NumRooms Alley   Price
0       NaN  Pave  127500
1       2.0   NaN  106000
2       4.0   NaN  178100
3       NaN   NaN  140000
  • The result is in pandas DataFrame format
type(data)
pandas.core.frame.DataFrame
  • Processing data:

    1. Divide data into inputs (x) and outputs (y)

inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]
inputs
   NumRooms Alley
0       NaN  Pave
1       2.0   NaN
2       4.0   NaN
3       NaN   NaN

The .iloc operation indexes by position: row numbers first, then column numbers

outputs
0    127500
1    106000
2    178100
3    140000
Name: Price, dtype: int64

(2) To fill in the NaN values, we use .fillna() with the column mean for numeric columns, and pd.get_dummies() for non-numeric columns, which implements one-hot encoding

dummy_na=True (bool, default False): adds a column for NaN values; if False, NaN is ignored

inputs = inputs.fillna(inputs.mean())
print(inputs)
   NumRooms Alley
0       3.0  Pave
1       2.0   NaN
2       4.0   NaN
3       3.0   NaN
inputs = pd.get_dummies(inputs, dummy_na=True)
print(inputs)
   NumRooms  Alley_Pave  Alley_nan
0       3.0           1          0
1       2.0           0          1
2       4.0           0          1
3       3.0           0          1

(3) Now that all entries are numeric, they can be converted to tensors

x, y = torch.tensor(inputs.values), torch.tensor(outputs.values)
x, y
(tensor([[3., 1., 0.],
         [2., 0., 1.],
         [4., 0., 1.],
         [3., 0., 1.]], dtype=torch.float64),
 tensor([127500, 106000, 178100, 140000]))
  • .values returns the underlying NumPy array, dropping the row and column indexes; see the added illustration below
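  • As a small added illustration, calling .values on the processed inputs returns a plain NumPy array:
type(inputs.values), inputs.values.shape
(numpy.ndarray, (4, 3))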

2.3 Linear Algebra

2.3.1 Understanding Tensors

Just as vectors are generalizations of scalars and matrices are generalizations of vectors, we can construct data structures with more axes. Tensors (the "tensor" in this section refers to algebraic objects) provide a general method for describing n-dimensional arrays with any number of axes. For example, a vector is a first-order tensor and a matrix is a second-order tensor.

  • A tensor with a single element: a scalar
x = torch.tensor([3.0])
y = torch.tensor([2.0])

x + y, x * y, x / y, x**y
(tensor([5.]), tensor([6.]), tensor([1.5000]), tensor([9.]))
  • A one-dimensional tensor: a vector
x = torch.arange(4)
x
tensor([0, 1, 2, 3])
  • A two-dimensional tensor: a matrix
A = torch.arange(20).reshape(5, 4)
A
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])
  • Matrix transpose
A.T
tensor([[ 0,  4,  8, 12, 16],
        [ 1,  5,  9, 13, 17],
        [ 2,  6, 10, 14, 18],
        [ 3,  7, 11, 15, 19]])
  • Determine if it is a symmetric matrix
B = torch.tensor([[1, 2, 3], [2, 0, 4], [3, 4, 5]])
B == B.T
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])

2.3.2 Basic Properties of Tensors

Elementwise operations act on the corresponding elements of tensors with the same shape

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = A.clone()  # Allocate a copy of A to B by allocating new memory
A, A + B
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([[ 0.,  2.,  4.,  6.],
         [ 8., 10., 12., 14.],
         [16., 18., 20., 22.],
         [24., 26., 28., 30.],
         [32., 34., 36., 38.]]))
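Elementwise (Hadamard) multiplication works the same way; a small added example (since B is a copy of A, every entry gets squared):

A * B
tensor([[  0.,   1.,   4.,   9.],
        [ 16.,  25.,  36.,  49.],
        [ 64.,  81., 100., 121.],
        [144., 169., 196., 225.],
        [256., 289., 324., 361.]])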

2.3.3 Dimension Reduction

What this really means is performing the operation along a chosen axis, collapsing that dimension

  • The sum operation, when no axis is specified, sums over all elements
  • If axis=0, the first dimension (rows) is reduced: elements at the same position across the rows are added together
  • If axis=1, the second dimension (columns) is reduced in the same way
A, A.shape, A.sum()
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 torch.Size([5, 4]),
 tensor(190.))
A_sum_axis0 = A.sum(axis=0)
A_sum_axis0, A_sum_axis0.shape
(tensor([40., 45., 50., 55.]), torch.Size([4]))
A_sum_axis1 = A.sum(axis=1)
A_sum_axis1, A_sum_axis1.shape
(tensor([ 6., 22., 38., 54., 70.]), torch.Size([5]))
  • So summing all elements is equivalent to summing along both axes:
A.sum(axis=[0, 1])  # Same as `A.sum()`
tensor(190.)
  • Other operations: mean of all elements = sum of all elements / number of elements
A.mean(), A.sum() / A.numel()
(tensor(9.5000), tensor(9.5000))
  • The mean along an axis == the sum along that axis / the size of that axis
A.mean(axis=0), A.sum(axis=0) / A.shape[0]
(tensor([ 8.,  9., 10., 11.]), tensor([ 8.,  9., 10., 11.]))

2.3.4 Non-Reduction Sum

We saw that the operations above turned a two-dimensional tensor into a one-dimensional one, i.e., they reduced the dimension

However, sometimes it is useful to keep the number of axes constant when calling a function to calculate a sum or mean.

  • We use keepdims=True to keep the original number of axes
sum_A = A.sum(axis=1, keepdims=True)
sum_A
tensor([[ 6.],
        [22.],
        [38.],
        [54.],
        [70.]])
  • The sum that previously collapsed the columns now keeps a single column of per-row sums, so the result is still a two-dimensional tensor

  • We can then use it for things like dividing each row by its sum (via broadcasting):

A / sum_A
tensor([[0.0000, 0.1667, 0.3333, 0.5000],
        [0.1818, 0.2273, 0.2727, 0.3182],
        [0.2105, 0.2368, 0.2632, 0.2895],
        [0.2222, 0.2407, 0.2593, 0.2778],
        [0.2286, 0.2429, 0.2571, 0.2714]])
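  • Another operation that does not reduce the dimension (an added note) is the cumulative sum along an axis:
A.cumsum(dim=0)  # running sum down the rows; the shape stays (5, 4)
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  6.,  8., 10.],
        [12., 15., 18., 21.],
        [24., 28., 32., 36.],
        [40., 45., 50., 55.]])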

2.3.5 Dot Product

Dot product: the sum of the products of the elements at corresponding positions

  • Implement with torch.dot
x = torch.arange(4, dtype = torch.float32)
y = torch.ones(4, dtype = torch.float32)
x, y, torch.dot(x, y)
(tensor([0., 1., 2., 3.]), tensor([1., 1., 1., 1.]), tensor(6.))
  • Equivalent implementation via elementwise multiplication followed by a sum
torch.sum(x * y)
tensor(6.)

2.3.6 Matrix-Vector Product

  • Multiply a matrix by a vector using torch.mv
A, A.shape, x, x.shape, torch.mv(A, x)
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 torch.Size([5, 4]),
 tensor([0., 1., 2., 3.]),
 torch.Size([4]),
 tensor([ 14.,  38.,  62.,  86., 110.]))
  • The column dimension of A (the length along axis 1) must be the same as the dimension of x (its length)
k = torch.arange(5, dtype = torch.float32)
torch.mv(A, k)
---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-76-b5e96cf0ea5c> in <module>
      1 k = torch.arange(5, dtype = torch.float32)
----> 2 torch.mv(A, k)


RuntimeError: size mismatch, got 5, 5x4,5

2.3.7 Matrix Multiplication

  • Using torch.mm(A, B)
B = torch.ones(4, 3)
A,B,torch.mm(A, B)
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]]),
 tensor([[ 6.,  6.,  6.],
         [22., 22., 22.],
         [38., 38., 38.],
         [54., 54., 54.],
         [70., 70., 70.]]))
C = torch.ones(5, 3)
A,B,torch.mm(A, C)
---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-78-2f52d201a633> in <module>
      1 C = torch.ones(5, 3)
----> 2 A,B,torch.mm(A, C)


RuntimeError: mat1 and mat2 shapes cannot be multiplied (5x4 and 5x3)
  • This shows that the matrix shapes must be compatible (the inner dimensions must match) for matrix multiplication

2.3.8 Norms

  • The $L_2$ norm is the square root of the sum of the squares of the vector elements: $\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$

  • Implement with torch.norm()

u = torch.tensor([3.0, -4.0])
torch.norm(u)
tensor(5.)
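  • As a sanity check (added here), the same value follows directly from the definition:
torch.sqrt(torch.sum(u ** 2))
tensor(5.)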
  • The $L_1$ norm is the sum of the absolute values of the vector elements: $\|\mathbf{x}\|_1 = \sum_{i=1}^{n} |x_i|$
torch.abs(u).sum()
tensor(7.)
  • The Frobenius norm is the square root of the sum of the squares of the matrix elements: $\|\mathbf{X}\|_F = \sqrt{\sum_{i}\sum_{j} x_{ij}^2}$

  • It is the $L_2$ norm extended to two dimensions, so torch.norm() is also used.

torch.norm(torch.ones((4, 9)))
tensor(6.)

Although we don't want to get too far ahead of ourselves, we can already build some intuition about why these concepts are useful. In deep learning, we often try to solve optimization problems: maximize the probability assigned to the observed data, or minimize the distance between predictions and the true observations. We use vectors to represent items such as words, products, or news articles so that the distance between similar items is minimized and the distance between different items is maximized. The objective, perhaps the most important ingredient of a deep learning algorithm (besides the data), is often expressed as a norm.
