Python multiprocessing: os.fork()

Introduction

Each time a program runs, the operating system creates a new process to execute its instructions. A process can call os.fork() to ask the operating system to create a new child process. [On Windows, the os module does not provide os.fork.]

Each process has a unique process ID (PID) that identifies it. The child process starts as a copy of the parent process: it inherits copies of the parent's state, such as global variables and environment variables. After fork, the child process receives a return value of 0, while the parent process receives the PID of the child process as the return value.
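As a minimal sketch of that inheritance, assuming a Unix-like system, the child below reports a global variable and an environment variable that were set before the fork, and sends them back to the parent through a pipe. The names GREETING, FORK_DEMO, and child_sees_parent_state are made up for this illustration.

```python
import os

GREETING = "set before fork"  # module-level global; the child gets its own copy


def child_sees_parent_state():
    """Fork once and return what the child saw, as [global, env_value]."""
    os.environ["FORK_DEMO"] = "inherited"  # hypothetical variable name
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the inherited values, then exit without returning.
        try:
            os.close(read_fd)
            message = "{}|{}".format(GREETING, os.environ.get("FORK_DEMO"))
            os.write(write_fd, message.encode())
        finally:
            os._exit(0)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data.split("|")


if __name__ == "__main__":
    print(child_sees_parent_state())
```

The child leaves through os._exit() so that it never continues into the parent's code path.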

os.fork()

Fork a child process. Return 0 in the child and the child's process id in the parent. If an error occurs OSError is raised.

Note that some platforms including FreeBSD <= 6.3 and Cygwin have known issues when using fork() from a thread.

Availability: Unix (only Unix-like systems are supported)

(Source: Python documentation)
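The two caveats in the documentation (fork may be missing on the platform, and it may raise OSError) can be handled defensively. The helper below is only a sketch: run_in_child is a hypothetical name, and returning None on failure is one possible convention, not something the os module prescribes.

```python
import os


def run_in_child(task):
    """Run task() in a forked child; return the child's PID, or None."""
    if not hasattr(os, "fork"):
        # For example on Windows, where the os module has no fork function.
        return None
    try:
        pid = os.fork()
    except OSError as exc:
        # Raised when the system cannot create the new process.
        print("fork failed: {}".format(exc))
        return None
    if pid == 0:
        # Child: do the work, then exit without returning to the caller.
        try:
            task()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)  # parent: wait for the child and reap it
    return pid


if __name__ == "__main__":
    child_pid = run_in_child(lambda: os.write(1, b"hello from the child\n"))
    print("child pid: {}".format(child_pid))
```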

(1) The fork function returns 0 in the child process and the PID of the child process in the parent process:

  • os.getpid() returns the PID of the current process

  • os.getppid() returns the PID of the parent process

# -*-coding:utf-8-*-
import os

print('before calling')

p = os.fork()  # From here on, the parent and the child both continue executing

print('after calling')

if p == 0:
    print('Execute subprocess, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
else:
    print('Execute main process, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))

[root@192 ~]# python fork.py
before calling
after calling
Execute main process, pid=1629 ppid=1572 p=1630
after calling
Execute subprocess, pid=1630 ppid=1629 p=0

Conclusion: After os.fork() returns, the main process and the child process both continue executing the code below the call, concurrently. os.fork() returns 0 in the child process; in the parent process it returns 1630, which is the PID of the child process.

Now look at the output of the following code:

# -*-coding:utf-8-*-
import os
import time

print('before calling')

p = os.fork()  # From here on, the parent and the child both continue executing

print('after calling')

if p == 0:
    print('Execute subprocess, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
    time.sleep(1)
    print('Execute subprocess, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
else:
    print('Execute main process, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))

[root@192 ~]# python fork.py
before calling
after calling
Execute main process, pid=1648 ppid=1572 p=1649
after calling
Execute subprocess, pid=1649 ppid=1648 p=0
[root@192 ~]# Execute subprocess, pid=1649 ppid=1 p=0

The child process prints one message, sleeps for one second, and then prints another. In the two messages printed by the child process, the ppid values are 1648 and 1, respectively.

Question 1: Why did the ppid change?

Next, we will discuss:

(2) After fork(), the main process does not wait for the child process when it finishes executing:

Practice: Run the following code in the background. The main process sleeps for five seconds and the child process for ten seconds:

# -*-coding:utf-8-*-

import os
import time

p = os.fork()

if p == 0:
    time.sleep(10)
    print('Execute subprocess, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
else:
    time.sleep(5)
    print('Execute main process, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))

[root@192 ~]# python fork.py &   ### Run the Python code in the background
[1] 1693

### Within the first five seconds, view the process information: the main process is 1693 and the child process is 1694

[root@192 ~]# ps aux | grep fork.py
root       1693  0.0  0.1 125432  4592 pts/0    S    21:23   0:00 python fork.py
root       1694  0.0  0.0 125432  2748 pts/0    S    21:23   0:00 python fork.py
root       1696  0.0  0.0 112704   980 pts/0    S+   21:23   0:00 grep --color=auto fork.py
[root@192 ~]# Execute main process, pid=1693 ppid=1572 p=1694   (program output: the main process has finished executing)

[1]+  Done                    python fork.py

### After five seconds, the main process has finished; view the process information again: only the child process 1694 remains

[root@192 ~]# ps aux | grep fork.py
root       1694  0.0  0.0 125432  2748 pts/0    S    21:23   0:00 python fork.py
root       1698  0.0  0.0 112704   980 pts/0    S+   21:23   0:00 grep --color=auto fork.py
[root@192 ~]# Execute subprocess, pid=1694 ppid=1 p=0   (program output: the child process has finished executing. Note that ppid is 1 here)

### After ten seconds, the child process has finished as well

[root@192 ~]# ps aux | grep fork.py
root       1708  0.0  0.0 112704   980 pts/0    S+   21:23   0:00 grep --color=auto fork.py

Phenomenon: During the first five seconds, both processes were running. After five seconds, the main process ended and only the child process remained (showing that the parent process did not wait for the child process). After ten seconds, the child process ended.

Conclusion: The parent process does not wait for the child process when it finishes executing.

Interpretation of Question 1: When the child process printed for the first time, the parent process had not finished yet, so os.getppid() returned the PID of the parent process. While the child process slept for one second, the parent process finished and exited without waiting for the child process. The orphaned child process was then adopted by the init process, so its ppid became 1.
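The change of ppid can also be observed from inside a single script. The sketch below is an illustration with hypothetical names (observe_reparenting and the pipe variables): a middle process plays the parent that exits early, and a grandchild plays the orphan. Pipes are used for synchronization instead of sleep() so that the order of events does not depend on timing. On some systems the adopting process is not PID 1 but a designated "subreaper" process, so the sketch only reports the new ppid and does not assume its value.

```python
import os
import time


def observe_reparenting():
    """Return (middle_pid, ppid_before, ppid_after) as seen by a grandchild."""
    result_r, result_w = os.pipe()  # grandchild -> caller: the two ppids
    ready_r, ready_w = os.pipe()    # grandchild -> middle: "ppid recorded"
    middle = os.fork()
    if middle == 0:
        # Middle process: plays the parent that exits without waiting.
        try:
            if os.fork() == 0:
                # Grandchild: plays the child that gets orphaned.
                before = os.getppid()
                os.write(ready_w, b"x")
                deadline = time.time() + 5
                while os.getppid() == before and time.time() < deadline:
                    time.sleep(0.01)
                report = "{} {}".format(before, os.getppid())
                os.write(result_w, report.encode())
            else:
                os.read(ready_r, 1)  # wait until the grandchild saw our PID
        finally:
            os._exit(0)
    # Caller: reap the middle process, then read the grandchild's report.
    os.close(result_w)
    os.close(ready_w)
    os.waitpid(middle, 0)
    before, after = os.read(result_r, 64).decode().split()
    os.close(result_r)
    os.close(ready_r)
    return middle, int(before), int(after)


if __name__ == "__main__":
    print("middle={} ppid before={} ppid after={}".format(*observe_reparenting()))
```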

(3) Zombie Processes:

If the child process ends before the parent process and the parent process does not reap it (that is, collect its exit status so that the kernel can release the child's process-table entry), the child process becomes a zombie process.

What's the harm? If a large number of zombie processes accumulate, the system may be unable to create new processes because no process IDs are available. This is the harm of zombie processes, and it should be avoided.
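A zombie can be observed directly on Linux. The sketch below assumes a Linux system with /proc mounted and Python 3 (for os.waitid); read_state and zombie_states are hypothetical helper names. It uses the WNOWAIT flag to wait until the child has exited without reaping it, reads the child's state letter, and then reaps the child.

```python
import os


def read_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if absent."""
    try:
        with open("/proc/{}/stat".format(pid)) as stat_file:
            stat = stat_file.read()
    except OSError:
        return None
    # Format: "pid (command) state ..."; the command may contain spaces.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_states():
    """Return the child's state before and after it is reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = read_state(pid)  # the child is a zombie at this point
    os.waitpid(pid, 0)        # reap it: the kernel drops its entry
    after = read_state(pid)
    return before, after


if __name__ == "__main__":
    print(zombie_states())
```

On such a system the first value is the state letter Z, and the second is None because the entry is gone once the child has been reaped.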

Ways to avoid zombie processes:

  1. The parent process waits for the child process to finish with functions such as os.wait() and os.waitpid(). This blocks the parent process until the child process exits.
  2. If the parent process is busy, it can install a handler for SIGCHLD with the signal function. The parent process receives this signal when a child process ends, and the handler can call wait to reap it.
  3. If the parent process does not care when the child process ends, it can tell the kernel so with signal(SIGCHLD, SIG_IGN) (in Python, signal.signal(signal.SIGCHLD, signal.SIG_IGN)). When the child process ends, the kernel reaps it and does not signal the parent process.
  4. Another trick is to fork twice. The parent process forks a child process and then continues working. The child process forks a grandchild process and exits immediately; the grandchild process is then adopted by init, and init reaps it when it ends. However, the parent process still has to reap the child process itself.
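The second approach in the list above could look roughly like this. It is a sketch, assuming Python 3 on a Unix-like system and a call from the main thread; reaped, reap_children, and fork_and_keep_working are hypothetical names. The handler loops with WNOHANG because several children may exit while only one signal is delivered.

```python
import os
import signal
import time

reaped = {}  # pid -> exit code, filled in by the handler


def reap_children(signum, frame):
    """SIGCHLD handler: reap every child that has already exited."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            return  # no child processes left
        if pid == 0:
            return  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped[pid] = os.WEXITSTATUS(status)


def fork_and_keep_working(exit_code):
    """Fork a child, keep 'working', and let the handler reap it."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)  # child: exit at once with the given code
        deadline = time.time() + 5
        while pid not in reaped and time.time() < deadline:
            time.sleep(0.01)  # stand-in for the parent's real work
        return pid, reaped.get(pid)
    finally:
        # Restore whatever handler was installed before the demo.
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print(fork_and_keep_working(7))
```

The parent never calls a blocking wait here; the child is reaped inside the handler while the parent's loop keeps running.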

A child process becomes a zombie process when it exits first and the parent process does not collect its exit status. wait() does both jobs: it blocks the parent process until a child process exits, and it reaps that child process. If the parent process exits first instead, the child process becomes an orphan rather than a zombie; it is adopted by the init process (PID 1), which reaps it when it ends.

The main process waits for the child process to end by calling os.wait():

# -*-coding:utf-8-*-

import os
import time

p = os.fork()

if p == 0:
    time.sleep(10)
    print('Execute subprocess, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
else:
    time.sleep(5)
    print('Execute main process, pid={} ppid={} p={}'.format(os.getpid(), os.getppid(), p))
    os.wait()

[root@192 ~]# python fork.py &   # Run the Python code in the background
[1] 1751

### Within the first five seconds, view the process information: the main process is 1751 and the child process is 1752

[root@192 ~]# ps aux | grep fork.py
root       1751  0.5  0.1 125432  4588 pts/0    S    21:29   0:00 python fork.py
root       1752  0.0  0.0 125432  2748 pts/0    S    21:29   0:00 python fork.py
root       1754  0.0  0.0 112704   980 pts/0    S+   21:29   0:00 grep --color=auto fork.py
[root@192 ~]# Execute main process, pid=1751 ppid=1572 p=1752   (program output: the main process has reached os.wait())

### After five seconds, the main process has printed its message and called os.wait(); view the process information: the main process 1751 and the child process 1752 are both still running

[root@192 ~]# ps aux | grep fork.py
root       1751  0.1  0.1 125436  4588 pts/0    S    21:29   0:00 python fork.py
root       1752  0.0  0.0 125432  2748 pts/0    S    21:29   0:00 python fork.py
root       1756  0.0  0.0 112704   980 pts/0    S+   21:29   0:00 grep --color=auto fork.py
[root@192 ~]# Execute subprocess, pid=1752 ppid=1751 p=0   (program output: the child process has finished executing. Note that ppid is not 1 here)

[1]+  Done                    python fork.py

### After ten seconds, the child process has finished, and the parent process has finished along with it

[root@192 ~]# ps aux | grep fork.py
root       1758  0.0  0.0 112704   980 pts/0    S+   21:29   0:00 grep --color=auto fork.py

Phenomenon: During the first five seconds, both processes were running. After five seconds, the main process printed its message and called os.wait() to wait for the child process to finish. After ten seconds, the child process ended, and then the parent process ended.

Conclusion: The parent process can call os.wait() to wait for the child process to finish.

Note: Calling os.wait() when there is no child process raises an exception: OSError: [Errno 10] No child processes (in Python 3 this is ChildProcessError, a subclass of OSError).
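Besides blocking, the wait functions also return the child's exit status. The sketch below uses os.waitpid(), which waits for one specific child; wait_twice is a hypothetical name. It waits a second time for the same PID to show the "No child processes" error, which is safer in a demo than calling os.wait() with no arguments, since that could block on some unrelated child.

```python
import errno
import os


def wait_twice(exit_code):
    """Return (child's exit code, errno raised by a second wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit at once with the given code
    _, status = os.waitpid(pid, 0)  # blocks until the child exits, then reaps it
    code = os.WEXITSTATUS(status)
    try:
        os.waitpid(pid, 0)  # the child is already reaped
        second = None
    except OSError as exc:  # ChildProcessError in Python 3
        second = exc.errno
    return code, second


if __name__ == "__main__":
    code, second = wait_twice(3)
    print("exit code: {}".format(code))
    print("second wait failed with ECHILD: {}".format(second == errno.ECHILD))
```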

(4) Independent inter-process resources:

Practice: Define a variable before calling fork(), modify it in the child process, and then check whether it has changed in the main process:

# -*-coding:utf-8-*-

import os
import time

variable = []
p = os.fork()
if p == 0:
    variable.append(1)
    print('Subprocess variable_id={}'.format(id(variable)))
    print('Subprocess variable={}'.format(variable))
else:
    time.sleep(1)  # Sleep for a second and let the child process change the value of the variable first
    print('Main Process variable_id={}'.format(id(variable)))
    print('Main Process variable={}'.format(variable))
    os.wait()

[root@192 ~]# python fork.py
Subprocess variable_id=140426199897224
Subprocess variable=[1]
Main Process variable_id=140426199897224
Main Process variable=[]

Conclusion: The variable is changed in the child process but not in the parent process, which shows that global variables are not shared between processes.

Question 2: But why is the variable's id the same?

Interpretation of Question 2: Copy-on-write. For a newly created child process, the kernel only creates the virtual address space structures, which are copied from the parent process. No physical memory is allocated for these segments at first; they share the physical pages of the parent process. Only when the parent process or the child process modifies a segment does the kernel allocate physical memory for the child's copy of that segment. The virtual addresses are therefore the same in both processes, and since id() in CPython returns an object's (virtual) memory address, the two processes report the same id whether or not the child process modifies the variable.
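The same experiment can be written so that the parent collects both views itself instead of relying on printed output. This is a sketch with hypothetical names (compare_after_fork and the dictionary keys); the child sends its id and value to the parent as JSON over a pipe. The equal ids are a consequence of CPython returning memory addresses from id() and are not guaranteed by other Python implementations.

```python
import json
import os


def compare_after_fork():
    """Modify a list in the child only; return both processes' views of it."""
    variable = []
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: change its own copy and report id and value to the parent.
        try:
            os.close(read_fd)
            variable.append(1)
            report = {"id": id(variable), "value": variable}
            os.write(write_fd, json.dumps(report).encode())
        finally:
            os._exit(0)
    # Parent: its copy of the list is untouched by the child's append.
    os.close(write_fd)
    child = json.loads(os.read(read_fd, 4096).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return {
        "parent_id": id(variable),
        "parent_value": variable,
        "child_id": child["id"],
        "child_value": child["value"],
    }


if __name__ == "__main__":
    print(compare_after_fork())
```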

 


 


Posted on Sun, 15 Sep 2019 20:42:01 -0400 by Christopher