Python syntax: multithreading and multiprocessing

1. Basic concepts

Thread

  • A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process and is the unit that actually does the process's work.
  • A thread is a single sequential flow of control within a process. A process can run several threads concurrently, each carrying out a different task.
  • A thread is an execution context: the information a CPU needs in order to carry on executing a stream of instructions.
  • How threads work:

    Suppose you are reading a book and want to take a break, but you also want to pick up exactly where you left off when you come back. One way is to write down the page number, the line number and the word number. These three values are your execution context. If your roommate reads the same book in the same way while you are resting, each of you only needs to note down these three numbers, and the two of you can share the book by reading at alternate times.

    Threads work similarly. The CPU gives you the illusion that it is doing several things at the same time, when in fact it spends only a very short time on each one before switching to the next. In essence, a single CPU core does only one thing at any given moment. It can switch between tasks because it keeps an execution context for each of them. Just as you can share one book with your roommate, many tasks can share one CPU.
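
The book analogy can be sketched in code. This is only an illustration under stated assumptions: the names read_book, PAGES and log are hypothetical, and the local variable bookmark stands in for the execution context that each thread keeps for itself while both threads share one "book".

```python
import threading
import time

PAGES = 5                     # hypothetical book length for this illustration
log = []                      # shared record of (reader, page) in reading order
log_lock = threading.Lock()   # protects the shared log


def read_book(reader):
    bookmark = 0              # per-thread state: this thread's own "execution context"
    while bookmark < PAGES:
        bookmark += 1
        with log_lock:
            log.append((reader, bookmark))
        time.sleep(0.01)      # put the book down so the other reader can take a turn


readers = [threading.Thread(target=read_book, args=(name,))
           for name in ("you", "roommate")]
for r in readers:
    r.start()
for r in readers:
    r.join()

print(log)  # the two readers' entries are usually interleaved
```

The exact interleaving varies from run to run, but each reader always sees its own pages in order, because its bookmark is never touched by the other thread.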

Process

  • A running instance of a program is a process. A process is essentially a collection of resources.
  • A process has a virtual address space, executable code, open handles to operating system objects, a security context (recording the user who started the process, its permissions, and so on), a unique process ID, environment variables, a priority class, minimum and maximum working set sizes (memory), and at least one thread.
  • When a process starts, it first creates one thread, the main thread. The main thread can then create other child threads.
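
A small sketch of the last point, assuming nothing beyond the standard library: a child thread reports the same process ID as the main thread, but it is not the main thread. The names report and seen are hypothetical.

```python
import os
import threading

seen = {}  # filled in by the child thread


def report():
    # Runs in the child thread: record which process and thread we are in.
    seen["child_pid"] = os.getpid()
    seen["child_is_main"] = threading.current_thread() is threading.main_thread()


main_pid = os.getpid()
child = threading.Thread(target=report)
child.start()
child.join()

print("same process:", seen["child_pid"] == main_pid)
print("child is the main thread:", seen["child_is_main"])
print("main thread name:", threading.main_thread().name)
```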

Resources that belong to a process include:

  • Memory pages (shared by all threads in the same process)
  • File descriptors
  • Security credentials

Difference between process and thread:

  • Threads in the same process share the same memory space, whereas each process has its own independent memory space.
  • Threads in the same process can exchange data directly through that shared memory. Data in different processes is independent, so processes must communicate through an intermediary, that is, an inter-process communication (IPC) mechanism such as a pipe, queue or socket.
  • Changes to the main thread may affect the behavior of the other threads in the process, whereas changes to a parent process (other than terminating it) do not affect its child processes.
  • A thread is a stream of instructions executing in a context; a process is a collection of resources associated with that execution.
  • Creating a new thread is cheap, but creating a new process typically requires copying the parent process.
  • A thread can control other threads of the same process, but a process can only control its child processes.
  • Threads start quickly and processes start slowly (startup cost says nothing about how fast either one runs once it has started).
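
The memory-sharing difference can be observed directly. The sketch below is a minimal illustration, not a benchmark: data, append_item, thread_demo and process_demo are hypothetical names, and the `if __name__ == "__main__":` guard is there because multiprocessing may re-import the main module in the child on platforms that do not fork.

```python
import multiprocessing
import threading

data = []  # module-level list living in this process's memory


def append_item(tag):
    data.append(tag)


def thread_demo():
    """Append from a child thread and return what the parent sees afterwards."""
    data.clear()
    t = threading.Thread(target=append_item, args=("from-thread",))
    t.start()
    t.join()
    return list(data)


def process_demo():
    """Append from a child process and return what the parent sees afterwards."""
    data.clear()
    p = multiprocessing.Process(target=append_item, args=("from-process",))
    p.start()
    p.join()
    return list(data)


if __name__ == "__main__":
    # The thread's append is visible, because it modified the same list object.
    print("after thread: ", thread_demo())
    # The child process appended to its own copy, so the parent's list stays empty.
    print("after process:", process_demo())
```

To get a value back from the child process, it would have to be sent through an IPC channel such as multiprocessing.Queue or multiprocessing.Pipe.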

Multithreading

Common thread methods

Method            Notes
start()           Starts the thread, making it ready for CPU scheduling
setName()         Sets the thread's name (deprecated since Python 3.10; assign to the name attribute instead)
getName()         Returns the thread's name (deprecated since Python 3.10; read the name attribute instead)
setDaemon(True)   Marks the thread as a daemon thread; must be called before start() (deprecated since Python 3.10; assign to the daemon attribute instead)
join()            Blocks the calling thread until this thread has finished
run()             Called automatically in the new thread once it has been started; override it in a Thread subclass to define the thread's work
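
A short sketch exercising these methods. It uses the name and daemon attributes, which are the attribute forms of setName()/getName() and setDaemon(), plus is_alive() to observe the thread's state; the function work and the name "worker-1" are made up for the example.

```python
import threading
import time


def work():
    time.sleep(0.1)  # stand-in for real work


t = threading.Thread(target=work, name="worker-1")  # name given at construction
t.daemon = True             # attribute form of setDaemon(True); must precede start()
print(t.name, t.daemon)     # attribute forms of getName() and isDaemon()

t.start()                   # the thread becomes eligible for scheduling
alive_while_running = t.is_alive()

t.join()                    # block the caller until work() has returned
alive_after_join = t.is_alive()

print(alive_while_running, alive_after_join)
```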

Thread class

import threading
import time

def run(n):
    print("task", n)
    time.sleep(1)
    print('2s')
    time.sleep(1)
    print('1s')
    time.sleep(1)
    print('0s')
    time.sleep(1)

t1 = threading.Thread(target=run, args=("t1",))
t2 = threading.Thread(target=run, args=("t2",))
t1.start()
t2.start()

Output:

"""
task t1
task t2
2s
2s
1s
1s
0s
0s
"""

Inherit threading.Thread to customize thread class

In essence, this overrides the run method of the Thread class.

import threading
import time


class MyThread(threading.Thread):
    def __init__(self, n):
        super(MyThread, self).__init__()  # Required: initialize the Thread base class before the thread is started
        self.n = n

    def run(self):
        print("task", self.n)
        time.sleep(1)
        print('2s')
        time.sleep(1)
        print('1s')
        time.sleep(1)
        print('0s')
        time.sleep(1)


if __name__ == "__main__":
    t1 = MyThread("t1")
    t2 = MyThread("t2")

    t1.start()
    t2.start()
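
One detail worth seeing in isolation: calling run() directly is an ordinary method call that executes in the calling thread, while start() creates a new thread and has Python call run() inside it. The class NameRecorder below is a hypothetical helper that simply records which thread executed run().

```python
import threading


class NameRecorder(threading.Thread):
    def __init__(self):
        super().__init__(name="recorder")
        self.ran_in = None  # name of the thread that executed run()

    def run(self):
        self.ran_in = threading.current_thread().name


direct = NameRecorder()
direct.run()      # plain method call: executes in the calling thread

started = NameRecorder()
started.start()   # new thread: run() is invoked inside it
started.join()

print("run():  ", direct.ran_in)
print("start():", started.ran_in)
```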

Calculate the execution time of the child thread

Note: a sleeping thread does not occupy the CPU. The operating system suspends the thread for the duration of the sleep.
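
One way to check this, sketched with the standard time module: time.perf_counter() measures wall-clock time, while time.process_time() counts only CPU time used by the process and does not include time spent sleeping.

```python
import time

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # CPU time of this process, excluding sleep

time.sleep(0.2)

wall_elapsed = time.perf_counter() - wall_start
cpu_used = time.process_time() - cpu_start

# Wall-clock time is about 0.2 s; CPU time stays close to zero.
print(f"wall: {wall_elapsed:.3f}s  cpu: {cpu_used:.3f}s")
```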

# t.join() blocks the calling thread until thread t has finished
# threading.current_thread() returns the Thread object for the calling thread
import threading
import time

def run(n):
    print("task", n,threading.current_thread())    #Output current thread
    time.sleep(1)
    print('3s')
    time.sleep(1)
    print('2s')
    time.sleep(1)
    print('1s')

start_time = time.time()

t_obj = []   # List that stores the child thread instances

for i in range(3):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.start()
    t_obj.append(t)

"""
The three child threads created by the main thread
task t-0 <Thread(Thread-1, started 44828)>
task t-1 <Thread(Thread-2, started 42804)>
task t-2 <Thread(Thread-3, started 41384)>
"""

for tmp in t_obj:
    tmp.join()          # Join every child thread so the main thread waits for all of them to finish

print("cost:", time.time() - start_time)  # Runs in the main thread

print(threading.current_thread())       #Output current thread
"""
<_MainThread(MainThread, started 43740)>
"""

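Because the child threads sleep concurrently, the total time measured after joining all of them is close to the longest single thread, not the sum. The sketch below uses shorter, made-up durations (0.1 s, 0.2 s, 0.3 s) and a hypothetical function nap to make that visible.

```python
import threading
import time


def nap(seconds):
    time.sleep(seconds)


durations = [0.1, 0.2, 0.3]   # illustrative values only
start_time = time.perf_counter()

threads = [threading.Thread(target=nap, args=(d,)) for d in durations]
for t in threads:
    t.start()
for t in threads:
    t.join()                  # wait for every child, not just the last one

elapsed = time.perf_counter() - start_time
print(f"elapsed {elapsed:.2f}s, sum of sleeps {sum(durations):.2f}s")
```
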
Count the number of currently active threads

When the main thread is much faster than the child threads, the child threads have not yet finished when the main thread calls active_count(), so the number of active threads it reports is num = sub_num (number of child threads) + 1 (the main thread itself).

import threading
import time

def run(n):
    print("task", n)    
    time.sleep(1)       # The child thread sleeps for 1 s

for i in range(3):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.start()

time.sleep(0.5)     # The main thread sleeps for 0.5 s
print(threading.active_count()) # Print the number of currently active threads

"""
task t-0
task t-1
task t-2
4
"""

When the main thread is much slower than the child threads, the child threads have already finished when the main thread calls active_count(), so the number of active threads it reports is num = 1 (the main thread itself).

import threading
import time


def run(n):
    print("task", n)
    time.sleep(0.5)       # The child thread sleeps for 0.5 s


for i in range(3):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.start()

time.sleep(1)     # The main thread sleeps for 1 s
print(threading.active_count()) # Print the number of currently active threads
"""
task t-0
task t-1
task t-2
1
"""

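The two cases above rely on sleep timing. As a sketch of the same idea without timing assumptions, the example below holds the child threads with a threading.Event and uses threading.enumerate() to list which threads are alive; release and wait_for_release are hypothetical names. It compares counts before, during and after, rather than assuming the program has no other threads.

```python
import threading

release = threading.Event()


def wait_for_release():
    release.wait()   # stay alive until the main thread says otherwise


before = threading.active_count()

workers = [threading.Thread(target=wait_for_release, name=f"t-{i}")
           for i in range(3)]
for w in workers:
    w.start()

during = threading.active_count()   # the three workers are still alive here
names = sorted(t.name for t in threading.enumerate() if t in workers)

release.set()                        # let the workers finish
for w in workers:
    w.join()
after = threading.active_count()

print(before, during, after)
print(names)
```
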
Daemon

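Calling setDaemon(True) marks a child thread as a daemon thread. The program exits as soon as all non-daemon threads (here, only the main thread) have finished, and any daemon threads that are still running are stopped at that point. Since Python 3.10, setDaemon() is deprecated in favor of assigning t.daemon = True.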

import threading
import time

def run(n):
    print("task", n)
    time.sleep(1)       #At this time, the sub thread stops for 1s
    print('3')
    time.sleep(1)
    print('2')
    time.sleep(1)
    print('1')

for i in range(3):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.setDaemon(True)   # Mark the child thread as a daemon thread; this must be done before start()
    t.start()

time.sleep(0.5)     # The main thread sleeps for 0.5 s, then exits while the daemon threads are still running
print(threading.active_count()) # Print the number of currently active threads
"""
task t-0
task t-1
task t-2
4

Process finished with exit code 0
"""

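The rule that the daemon flag must be set before start() can be checked directly: changing it on a running thread raises RuntimeError. The sketch below uses the daemon attribute; stop, background and changed_after_start are hypothetical names.

```python
import threading

stop = threading.Event()


def background():
    stop.wait()   # keep the thread alive until the main thread releases it


t = threading.Thread(target=background)
default_flag = t.daemon   # inherited from the thread that created t
t.daemon = True           # allowed: the thread has not been started yet
t.start()

try:
    t.daemon = False      # too late: the thread is already running
    changed_after_start = True
except RuntimeError:
    changed_after_start = False

stop.set()
t.join()

print(default_flag, t.daemon, changed_after_start)
```
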
Tags: Python Multithreading

Posted on Fri, 22 Oct 2021 01:12:13 -0400 by onthespot