Python Artificial Intelligence IV. Sessions, variables, placeholders and activation functions in TensorFlow

Starting from this article, the author formally begins to explain Python deep learning, neural networks and artificial intelligence. I hope you like it.

The previous article covered TensorFlow basics and a univariate linear prediction example. This article introduces Sessions, variables, placeholders and activation functions in detail, drawing on the author's earlier blog posts and the video tutorials of Mofan Python; concrete projects and applications will be explained later.

This is a basics article, and I hope it helps you. If there are errors or shortcomings, please forgive me~ I am also a beginner in artificial intelligence, and I hope we can grow together through these blog posts.

Article directory:

  • 1, Tensors
  • 2, Session
  • 3, Constants and variables
  • 4, Placeholders
  • 5, Activation functions
  • 6, Summary

Code download address:

  • AI-for-TensorFlow
  • AI-for-Keras

1, Tensors

The name TensorFlow combines Tensor (a multidimensional array) and Flow (the flow of data through a computation graph), which is also the basic idea behind the framework. TensorFlow performs numerical computation using data flow graphs. A data flow graph is a directed graph whose nodes (generally drawn as circles or squares) represent mathematical operations, data inputs or data output endpoints, and whose edges represent the scalars, matrices or tensors flowing between them.

Data flow graph can easily assign each node to different computing devices to complete asynchronous parallel computing, which is very suitable for large-scale machine learning applications. As shown in the figure below, we continuously learn and improve our weight W and offset b through Gradients, so as to improve the accuracy.

A tensor is the basic data structure of the TensorFlow framework: a multidimensional array, which can be understood as a nested multidimensional list in Python. The dimension of a tensor is called its rank (or order). A rank-0 tensor is a scalar, a rank-1 tensor is a vector, and a rank-2 tensor is a matrix.

# Rank-0 tensor (scalar)
3
# Rank-1 tensor (vector) of shape [3]
[1., 2., 3.]
# Rank-2 tensor: a 2 * 3 matrix
[[1., 2., 3.],
 [4., 5., 6.]]
# Rank-3 tensor of shape 2 * 3 * 2
[[[1., 2.],[3., 4.],[5., 6.]],
 [[7., 8.],[9., 10.],[11., 12.]]]
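Since a tensor can be viewed as a nested Python list, its rank can be checked with a small helper. This is a plain-Python illustrative sketch (the `rank` function is our own, not a TensorFlow API, and it assumes rectangular, non-ragged nesting):

```python
def rank(value):
    """Return the rank (nesting depth) of a scalar or nested list."""
    if isinstance(value, list):
        return 1 + rank(value[0])  # assumes rectangular (non-ragged) nesting
    return 0  # a bare number is a rank-0 tensor (a scalar)

print(rank(3.0))                                  # 0: scalar
print(rank([1., 2., 3.]))                         # 1: vector
print(rank([[1., 2., 3.], [4., 5., 6.]]))         # 2: matrix
print(rank([[[1., 2.], [3., 4.], [5., 6.]],
            [[7., 8.], [9., 10.], [11., 12.]]]))  # 3
```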

The code is as follows:

# -*- coding: utf-8 -*-
import tensorflow as tf 

# Define constants
a = tf.constant([1, 2, 3], name="a")
b = tf.constant([[1, 2, 3],[4, 5, 6]])

# Create tensors filled with 0s and 1s
c = tf.zeros([2, 3])
d = tf.ones([2, 3])

# Randomly generate values from a normal distribution
e = tf.random_normal([5, 3])

The output results are shown in the figure below:

2, Session

A tensor is essentially a multidimensional array, and it is the main data structure of TensorFlow. Tensors flow through one or more graphs composed of nodes and edges: edges represent tensors, and nodes represent operations on tensors. A tensor flows from one node to another in the graph, and each time it passes through a node it undergoes an operation.

In addition, the graph must be started in the session. The session distributes the graph's operations to devices such as CPU or GPU, and provides methods to perform operations (op). After these methods are executed, the generated tensor will be returned. TensorFlow programs are usually organized into a build phase and an execution phase:

  • In the build phase, the execution steps of the ops are described as a graph
  • In the execution phase, a session is used to execute the ops in the graph

For example, a graph representing and training a neural network is created in the build phase, and the training op in that graph is then executed repeatedly in the execution phase. All TensorFlow operations must be placed in a graph, and a graph only runs inside a session. Once the session is opened, nodes can be fed with data and computed; once it is closed, no computation can be performed. A session provides the environment in which operations run and tensors are evaluated.

Here is a simple example: we define two matrices, matrix1 and matrix2, multiply them, and use the run() method of a Session object to execute the multiplication.

# -*- coding: utf-8 -*-
"""
Created on Sat Nov 30 16:38:31 2019
@author: Eastmount CSDN YXZ
"""
import tensorflow as tf

# Build two matrices
matrix1 = tf.constant([[3, 3]])  # constant, 1 row and 2 columns
matrix2 = tf.constant([[2],
                       [2]])     # constant, 2 rows and 1 column

# Matrix multiplication: tf.matmul is similar to numpy's dot
product = tf.matmul(matrix1, matrix2)

# Two ways of controlling execution with a Session
# Method 1
sess = tf.Session()
output = sess.run(product)  # TensorFlow executes the operation each time run() is called
print(output)
sess.close()

# Method 2
with tf.Session() as sess:  # open a Session as sess; it closes automatically when the block ends
    output = sess.run(product)
    print(output)

The output in both cases is [[12]], since 3 * 2 + 3 * 2 = 12.
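As a sanity check, the same 1x2 by 2x1 matrix product can be computed by hand in plain Python (no TensorFlow required; this naive loop is only an illustration of what tf.matmul does):

```python
matrix1 = [[3, 3]]        # 1 row, 2 columns
matrix2 = [[2],
           [2]]           # 2 rows, 1 column

# Naive matrix multiplication: result[i][j] = sum over k of matrix1[i][k] * matrix2[k][j]
rows, cols, inner = len(matrix1), len(matrix2[0]), len(matrix2)
product = [[sum(matrix1[i][k] * matrix2[k][j] for k in range(inner))
            for j in range(cols)] for i in range(rows)]

print(product)  # [[12]]
```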


3, Constants and variables

In TensorFlow, use tf.constant to create constants.

# Create 1 * 2 Matrix constant
c1 = tf.constant([[1., 1.]]) 
# Create 2 * 1 matrix constant
c2 = tf.constant([[2.],[2.]]) 

In TensorFlow, tf.Variable is used to create variables. A variable is a special tensor whose value can be a tensor of any type and shape. Note that defining a variable differs from Python: something only becomes a TensorFlow variable when it is explicitly declared as one, for example state = tf.Variable().

# Create a variable of order 0 and initialize it to 0
state = tf.Variable(0, name='counter')

When creating a variable, you must pass a tensor as the initial value into the constructor Variable(). TensorFlow provides a series of operators to initialize the tensor, such as tf.random_normal and tf.zeros.

# A normal distribution with a standard deviation of 0.35 initializes a shape (10,20) variable
w = tf.Variable(tf.random_normal([10, 20], stddev=0.35), name="w")

Then let us implement an example that updates and prints a variable in a loop. Two points require attention:

  • Variables must be initialized after being defined, using tf.global_variables_initializer() (the older initialize_all_variables() is deprecated)
  • Inside the Session, be sure to run the initialization op before using any variables
# -*- coding: utf-8 -*-
"""
Created on Sun Dec  1 16:52:18 2019
@author: Eastmount CSDN YXZ
"""
import tensorflow as tf

# Define a variable with initial value 0 and the name counter (used for counting)
state = tf.Variable(0, name='counter')

# Define a constant
one = tf.constant(1)

# New value
result = tf.add(state, one)

# Update: assign result back to state, so the current value of state becomes result
update = tf.assign(state, result)

# All variables in TensorFlow must be initialized to be activated
init = tf.global_variables_initializer()  # required whenever variables are defined

# Session
with tf.Session() as sess:
    sess.run(init)  # run the initialization op first
    # Update the variable three times in a loop
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))  # printing state directly is useless; it must be run

In each pass of the loop, sess.run(update) is executed: state increases by 1 and the value 1 is printed, and the next two passes print 2 and 3.
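The update logic can be mimicked in plain Python to make the control flow explicit. This is only a conceptual sketch: TensorFlow actually re-executes the assign op inside the graph, whereas here we update an ordinary Python variable:

```python
state = 0          # counterpart of tf.Variable(0, name='counter')
one = 1            # counterpart of tf.constant(1)

outputs = []
for _ in range(3):
    result = state + one   # counterpart of tf.add(state, one)
    state = result         # counterpart of tf.assign(state, result)
    outputs.append(state)

print(outputs)  # [1, 2, 3]
```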


Let us add one more example:

import tensorflow as tf

# Define a constant
a = tf.constant([5, 3], name='input_a')

# calculation
b = tf.reduce_prod(a, name='prod_b')
c = tf.reduce_sum(a, name='sum_c')
d = tf.add(b, c, name='add_d')

# Session
with tf.Session() as sess:
    print("a:", sess.run(a))
    print("b:", sess.run(b))
    print("c:", sess.run(c))
    print("d:", sess.run(d))

The output is shown below. Node a receives a tensor, which flows from node a to nodes b and c. Node b performs the prod operation 5 * 3 and node c performs the sum operation 5 + 3. The tensor leaving node b is 15 and the tensor leaving node c is 8. These two tensors then flow into node d together, which applies the add operation 15 + 8, so the tensor flowing out of node d is 23.

a: [5 3]
b: 15
c: 8
d: 23
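The four values can be verified in plain Python, mirroring the graph node by node (a stand-in sketch for tf.reduce_prod, tf.reduce_sum and tf.add):

```python
from functools import reduce

a = [5, 3]                              # node a: the input tensor
b = reduce(lambda x, y: x * y, a)       # node b: product of elements -> 15
c = sum(a)                              # node c: sum of elements     -> 8
d = b + c                               # node d: add                 -> 23

print(a, b, c, d)  # [5, 3] 15 8 23
```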

This is the process of a tensor flowing through the graph, as shown in the figure below. When we pass a node of the graph to sess.run(), we are effectively telling TensorFlow: "Hi, I want the output of this node, please run the corresponding operations to get it, thank you!" The Session then finds all the operations this node depends on and computes them in order, from front to back, until the required result is obtained.

4, Placeholders

The examples above introduced tensors into the computation graph stored as constants or variables. TensorFlow also provides another mechanism, the placeholder: define a placeholder first, and fill in (or update) its value with concrete data only when the graph is actually executed.

TensorFlow uses tf.placeholder() to create a placeholder: the variable is declared first and its value is passed in from outside later. The value is supplied through the feed_dict parameter of sess.run().

# -*- coding: utf-8 -*-
"""
Created on Sun Dec  1 18:21:29 2019
@author: Eastmount CSDN YXZ
"""
import tensorflow as tf

# Placeholders with a given type
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)

# Output: multiplication
output = tf.multiply(input1, input2)

# Session
with tf.Session() as sess:
    # a placeholder needs values passed in; a dictionary is supplied at run time
    print(sess.run(output, feed_dict={input1: [7.], input2: [2.0]}))

The output is [14.]. Using a placeholder means the values are supplied only at the moment the result is run.
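Conceptually, a placeholder defers supplying a value until run time, much like a plain-Python function whose arguments are bound only when it is called. The sketch below is an analogy of our own (the function name and dictionary keys are illustrative, not TensorFlow APIs):

```python
def run_multiply_graph(feed_dict):
    """Evaluate output = input1 * input2 with values supplied at call time."""
    input1 = feed_dict['input1']   # value "fed" into the first placeholder
    input2 = feed_dict['input2']   # value "fed" into the second placeholder
    return [x * y for x, y in zip(input1, input2)]

# As with sess.run(output, feed_dict=...), nothing is computed until values are fed
print(run_multiply_graph({'input1': [7.], 'input2': [2.0]}))  # [14.0]
```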


5, Activation functions

An activation function causes some neurons to activate first and pass the activated information on to the next layer of the network. For example, when some neurons see a picture of a cat, they may be particularly interested in the cat's eyes; when they see the cat's eyes, they become excited and their output value increases.

An activation function acts like a filter or exciter that activates particular information or features. Common activation functions include softplus, sigmoid, relu, softmax, elu and tanh. For hidden layers, nonlinear functions such as relu, tanh and softplus can be used. For classification problems, sigmoid (the smaller the value, the closer to 0; the larger, the closer to 1) or softmax can compute a probability for each class, with the class of highest probability taken as the result. For regression problems, a linear function can be used.
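For instance, softmax turns a vector of raw class scores into probabilities that sum to 1, and the largest probability is taken as the prediction. A minimal plain-Python sketch (our own helper, not the TensorFlow implementation):

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                            # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)                      # roughly [0.659, 0.242, 0.099]
print(probs.index(max(probs)))    # class 0 is the predicted class
```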

For common activation functions, see Wikipedia:


The TensorFlow structure is as follows: the input passes through hidden layers layer1 and layer2 and then produces a prediction; cross_entropy is the difference between the predicted value and the true value.

Opening layer2, you can see its activation function. The value passed in from layer1 is processed, and layer2 outputs the value Wx_plus_b. This value passes through the relu activation function, some parts are activated, and the result is passed on to predictions as the predicted value.

You can also search "TensorFlow activation" on Google or Baidu. The activation functions are shown in the figure below:

  • api_docs/python/nn.html

A simple activation function example follows. Later we will use activation functions to solve practical problems in concrete cases.

import tensorflow as tf

a = tf.constant([-1.0, 2.0])

# Activation functions
b = tf.nn.relu(a)
c = tf.sigmoid(a)

with tf.Session() as sess:
    print(sess.run(b))
    print(sess.run(c))

The output result is:

[0. 2.]
[0.26894143 0.880797  ]

The Sigmoid function is f(x) = 1 / (1 + e^(-x)).

This is one of the most commonly used activation functions in traditional neural networks. Its advantages: the output maps into (0, 1), it is monotonic and continuous, it is well suited for an output layer, and its derivative is easy to compute. Its disadvantage is soft saturation: once the input falls into a saturation region, the first derivative approaches 0, which easily causes vanishing gradients.
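The saturation problem is easy to see numerically: the sigmoid derivative is f'(x) = f(x)(1 - f(x)), which peaks at 0.25 at x = 0 and collapses toward 0 for large |x|. A plain-Python illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # the derivative, expressed in terms of the output itself

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid(x), sigmoid_grad(x))
# At x = 0 the gradient is 0.25; by x = 10 it is about 4.5e-5 -- effectively vanished
```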

The relu function, f(x) = max(0, x), is currently the most widely used and popular activation function. The formula and function image are as follows:

As the figure shows, relu is hard-saturated for x < 0 (its output and gradient are 0 there), while for x > 0 its first derivative is 1. Therefore, for x > 0 relu keeps the gradient from decaying, which alleviates the vanishing gradient problem and speeds up convergence. However, as training proceeds some inputs can fall into the hard-saturation region, so the corresponding weights can no longer be updated.
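The contrast with sigmoid follows from relu's piecewise definition: the gradient is exactly 1 for x > 0 and exactly 0 for x < 0 (the "dead" units stuck on the zero side are what the last sentence describes). A plain-Python sketch:

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0   # no decay for positive inputs, hard zero otherwise

print([relu(x) for x in [-1.0, 2.0]])       # [0.0, 2.0]
print([relu_grad(x) for x in [-1.0, 2.0]])  # [0.0, 1.0]
```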

6, Summary

This concludes this basic TensorFlow article. It is a very introductory deep learning piece; if there are errors or shortcomings, please forgive me~

References (thank you all for your articles and videos; I especially recommend following Mofan, my first teacher in artificial intelligence):

  • [1] Introduction to neural networks and machine learning - author's article
  • [2] Stanford machine learning videos by Professor Andrew Ng
  • [3] Book "Artificial intelligence in game development"
  • [4] NetEase Cloud Classroom videos by Mofan (highly recommended)
  • [5] Neural network activation functions - deep learning
  • [6] tensorflow Architecture - NoMorningstar
  • [7] Tensorflow 2.0 introduction to low level api - GumKey
  • [8] Fundamentals of tensorflow - kk123k
  • [9] Tensorflow basic knowledge sorting - sinat_36190649
  • [10] Deep learning (II): TensorFlow Basics - the sea of hichri
  • [11] tensorflow basic concept - lusic01
  • [12] tensorflow: activation function - haoji007
  • [13] AI => tensorflow 2.0 syntax - tensor & basic functions (I)
