Python full stack advanced programming skills

1, Generator send method

1. Synchronous and asynchronous

  • Synchronous:
    A calling mode in which the code that starts an IO operation must wait for the operation to complete before the call returns.
  • Asynchronous:
    A calling mode in which the code that starts an IO operation does not wait for it to complete; the call returns immediately and the result is picked up later.
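
The difference can be sketched with a stand-in for an IO operation. This is a minimal illustration rather than one of the original examples: the function names are hypothetical, time.sleep() plays the part of a blocking IO call, and asyncio.sleep() plays its asynchronous counterpart.

```python
import asyncio
import time


def sync_io(delay):
    # Stand-in for a synchronous IO call: the caller waits until it is done
    time.sleep(delay)
    return delay


async def async_io(delay):
    # Stand-in for an asynchronous IO call: other work can run while it waits
    await asyncio.sleep(delay)
    return delay


def run_sync(delays):
    start = time.perf_counter()
    results = [sync_io(d) for d in delays]
    return results, time.perf_counter() - start


def run_async(delays):
    async def gather_all():
        return await asyncio.gather(*(async_io(d) for d in delays))

    start = time.perf_counter()
    results = asyncio.run(gather_all())
    return list(results), time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
sync_results, sync_elapsed = run_sync(delays)
async_results, async_elapsed = run_async(delays)
print('sync : %.2fs' % sync_elapsed)
print('async: %.2fs' % async_elapsed)
```

The synchronous version takes roughly the sum of the delays, while the asynchronous version takes roughly the longest single delay, because the waits overlap.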

2. Blocking and non-blocking

  • Blocking:
    From the caller's point of view, a call is blocking if it gets stuck and the caller cannot continue until the call finishes.
    Examples of blocking calls:
    • Database operations that wait on a lock when several users access the same data at the same time
    • The accept() method of a socket
    • input()
  • Non-blocking:
    From the caller's point of view, a call is non-blocking if it returns without getting stuck, so the caller can continue without waiting.
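
As a small illustration of the non-blocking case, the sketch below puts a listening socket into non-blocking mode. It assumes a local loopback interface is available, and the function name try_accept_nonblocking is hypothetical. With no client connecting, accept() returns control at once by raising BlockingIOError instead of waiting.

```python
import socket


def try_accept_nonblocking():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
        server.listen()
        server.setblocking(False)      # accept() will no longer wait
        try:
            server.accept()
            return 'accepted'
        except BlockingIOError:
            # No pending connection: the call comes back immediately
            return 'would block'
    finally:
        server.close()


print(try_accept_nonblocking())
```

In the default blocking mode the same accept() call would not return until a client connected.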

3. The send() method of a generator

Generators were covered earlier; as a reminder:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1


g = create_fib(5)
print(next(g))
print()
for i in g:
    print(i)

Printing

0

1
1
2
3

If the generator has a return value, it can be retrieved from the StopIteration exception that is raised when the generator finishes:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


while True:
    try:
        ret = next(g)
        print(ret)
    except StopIteration as e:
        # The generator's return value is carried by StopIteration.value
        print(e.value)
        break

Printing

0
1
1
2
3
hello

The return value hello is obtained from the exception in the except clause and printed there.
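
The same pattern can be wrapped in a helper. This is a minimal sketch with hypothetical names (countdown, drain); it collects the yielded values and reads the return value from the StopIteration exception.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return 'done'


def drain(gen):
    # Collect every yielded value, then pick up the return value
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            return items, stop.value


items, result = drain(countdown(3))
print(items, result)  # [3, 2, 1] done
```

A plain for loop would also consume the generator, but it swallows StopIteration, so the return value would be lost.
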
send() takes one argument, which becomes the value of the yield expression at which the generator is currently paused. send() then resumes the generator and returns the next value it yields.
Starting the generator with send():

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(g.send(None))
print(g.send('hello'))

Printing

0
1

If send() is the first call made on a new generator, its argument must be None; anything else raises an error:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(g.send('hello'))
print(g.send('world'))

Printing

Traceback (most recent call last):
  File "xxx/demo.py", line 14, in <module>
    print(g.send('hello'))
TypeError: can't send non-None value to a just-started generator

The first call on a freshly created generator must therefore be either next() or send(None), because the generator has not yet reached a yield expression that could receive a value.
send() can be combined with next():

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Printing

0
1
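
Because a generator has to be advanced to its first yield before it can receive a value, the priming step is sometimes wrapped in a decorator. The sketch below is one possible way to do that; the names primed and echo are hypothetical.

```python
import functools


def primed(func):
    # Advance the new generator to its first yield so that
    # the caller can use send() with a real value right away
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return wrapper


@primed
def echo():
    received = None
    while True:
        # Hand back whatever was sent in last time
        received = yield received


e = echo()
print(e.send('hello'))  # hello
print(e.send('world'))  # world
```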

When the value of yield a is assigned to a variable:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print(result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Printing

0
hello
1

hello is printed. To confirm that it comes from print(result), label the output:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Printing

0
result--> hello
1

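The labelled line shows that hello was printed by print(result).
Explanation:
next(g) runs the generator up to result = yield a, where it pauses and hands back a, so 0 is printed. The call g.send('hello') resumes the generator and makes 'hello' the value of the whole yield a expression, so it is assigned to result and printed. Execution then continues into the second iteration of the loop, which yields 1; that is the return value of send() and is printed by the caller.
Test again with a second send():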

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))
print(g.send('world'))

Printing

0
result--> hello
1
result--> world
1
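
The two-way exchange shown above is enough to build a small stateful consumer. The following sketch uses a hypothetical running_average generator: each send() passes in a new number and gets back the average so far.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand back the current average, then wait for the next number
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # advance to the first yield
results = [avg.send(v) for v in (10, 20, 60)]
print(results)  # [10.0, 15.0, 30.0]
```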

Using the generator's close() method:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
# Close generator
g.close()
print(g.send('hello'))
print(g.send('world'))

Printing

0
Traceback (most recent call last):
  File "xxx/demo.py", line 18, in <module>
    print(g.send('hello'))
StopIteration

That is, once close() has been called the generator is finished: any later call to next() or send() raises StopIteration.
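
close() works by raising GeneratorExit inside the generator at the point where it is paused, so a try/finally block in the generator still runs. A minimal sketch, with hypothetical names reader and log:

```python
def reader(log):
    try:
        while True:
            yield 'line'
    finally:
        # Runs when close() raises GeneratorExit at the paused yield
        log.append('cleaned up')


log = []
g = reader(log)
print(next(g))  # line
g.close()
print(log)      # ['cleaned up']

try:
    next(g)
except StopIteration:
    log.append('exhausted')
print(log)      # ['cleaned up', 'exhausted']
```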

2, Multitasking with yield and yield from

1. Use yield to complete multiple tasks

A test of multitasking implemented with yield:

import time


def task1():
    while True:
        print('---1---')
        time.sleep(0.1)
        yield


def task2():
    while True:
        print('---2---')
        time.sleep(0.1)
        yield


def main():
    t1 = task1()
    t2 = task2()
    while True:
        next(t1)
        next(t2)


if __name__ == '__main__':
    main()

Running it prints ---1--- and ---2--- alternately until the program is interrupted.

The two tasks take turns, which gives the effect of multitasking while consuming fewer resources than threads or processes.
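
The loop in main() assumes that both tasks run forever. A slightly more general variant, sketched here with hypothetical names (task, run_round_robin), keeps the generators in a queue and drops each one when it finishes.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # give the other tasks a turn


def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue           # finished: do not put it back
        queue.append(current)  # still running: back of the queue


log = []
run_round_robin([task('a', 2, log), task('b', 3, log)])
print(log)
# [('a', 0), ('b', 0), ('a', 1), ('b', 1), ('b', 2)]
```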

2. Use of yield from

itertools.chain iterates over several iterable objects one after another, as if they were a single sequence:

from itertools import chain


lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}

for value in chain(lis, dic, range(5,10)):
    print(value)

Printing

1
2
3
name
age
5
6
7
8
9

The itertools.chain object can also be converted to a list:

from itertools import chain


lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}

print(list(chain(lis, dic, range(5,10))))
for value in chain(lis, dic, range(5,10)):
    print(value)

Printing

[1, 2, 3, 'name', 'age', 5, 6, 7, 8, 9]
1
2
3
name
age
5
6
7
8
9

You can use yield to do the same:

lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}


def my_chain(*args, **kwargs):
    for my_iterable in args:
        for value in my_iterable:
            yield value


for value in my_chain(lis, dic, range(5,10)):
    print(value)

Printing

1
2
3
name
age
5
6
7
8
9

Python 3.3 added the yield from syntax.
Use yield from to achieve the same effect:

lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}


def my_chain(*args, **kwargs):
    for my_iterable in args:
        yield from my_iterable


for value in my_chain(lis, dic, range(5,10)):
    print(value)

The result is the same as before: for plain iteration, yield from my_iterable behaves like a for loop that yields each value in turn.
Comparing yield and yield from:

def generator1(lis):
    yield lis


def generator2(lis):
    yield from lis


lis = [1, 2, 3, 4, 5]

for i in generator1(lis):
    print(i)

for i in generator2(lis):
    print(i)

Printing

[1, 2, 3, 4, 5]
1
2
3
4
5
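
Because yield from delegates to any iterable, including another generator, it is convenient for recursion. The sketch below uses a hypothetical flatten function that yields the items of arbitrarily nested lists and tuples.

```python
def flatten(items):
    for item in items:
        if isinstance(item, (list, tuple)):
            # Delegate to a nested generator for the inner sequence
            yield from flatten(item)
        else:
            yield item


print(list(flatten([1, [2, [3, 4]], (5,), 6])))  # [1, 2, 3, 4, 5, 6]
```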

Summing the values sent into a generator:

# Child generator
def generator_1():
    total = 0
    while True:
        x = yield
        print('add --', x)
        if not x:
            break
        total += x
    return total


# Delegate generator
def generator_2():
    while True:
        total = yield from generator_1()  # Child generator
        print('sum is --', total)


# Caller
def main():
    g1 = generator_1()
    g1.send(None)
    g1.send(2)
    g1.send(3)
    g1.send(None)


if __name__ == '__main__':
    main()

Printing

Traceback (most recent call last):
add -- 2
add -- 3
add -- None
  File "xxx/demo.py", line 121, in <module>
    main()
  File "xxx/demo.py", line 112, in main
    g1.send(None)
StopIteration: 5

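The values are added up, but when generator_1 returns, the total only surfaces as the value of an uncaught StopIteration, so driving generator_1 directly does not deliver the result cleanly.
Try again through generator_2: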

# Child generator
def generator_1():
    total = 0
    while True:
        x = yield
        print('add --', x)
        if not x:
            break
        total += x
    return total


# Delegate generator
def generator_2():
    while True:
        # yield from establishes a channel between the caller and the child generator
        total = yield from generator_1()  # Child generator
        print('sum is --', total)


# Caller
def main():
    g2 = generator_2()
    g2.send(None)
    g2.send(2)
    g2.send(3)
    g2.send(None)


if __name__ == '__main__':
    main()

Printing

add -- 2
add -- 3
add -- None
sum is -- 5

This time the sum is delivered.
Explanation:
Sub generator: generator_1(), the generator function that follows yield from, is the sub generator;
Delegate generator: generator_2() is the delegate generator, which delegates the actual work to the sub generator;
Caller: main() is the caller, which drives the delegate generator.
yield from establishes a channel between the caller and the sub generator: values passed to send() on the delegate generator are forwarded through yield from to the sub generator, and the sub generator's return value becomes the value of the yield from expression;
yield from also handles the sub generator's StopIteration internally, which saves a lot of exception handling.
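
To make the saved exception handling concrete, the sketch below writes a delegate generator twice: once by hand and once with yield from. All names (summer, delegate_manual, delegate_yield_from, drive) are hypothetical, and the manual version covers only send(), not the other cases that yield from also handles.

```python
def summer():
    total = 0
    while True:
        x = yield
        if x is None:
            return total
        total += x


def delegate_manual(results):
    sub = summer()
    next(sub)  # prime the sub generator by hand
    while True:
        value = yield
        try:
            sub.send(value)
        except StopIteration as stop:
            # The sub generator returned: pick up its result
            results.append(stop.value)
            return


def delegate_yield_from(results):
    # Priming, forwarding and StopIteration handling are all implicit
    results.append((yield from summer()))


def drive(delegate_factory):
    results = []
    g = delegate_factory(results)
    next(g)
    for value in (2, 3, None):
        try:
            g.send(value)
        except StopIteration:
            break
    return results


print(drive(delegate_manual))      # [5]
print(drive(delegate_yield_from))  # [5]
```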

3, Multitasking with Greenlet & gevent

1. The concept of coroutines

A coroutine, also known as a micro-thread, is another way to implement multitasking in Python. It is an execution unit smaller than a thread, in the sense that it needs fewer resources.
Coroutines in Python have gone through roughly three stages:

  • (1) The original generator-based form using yield/send
  • (2) yield from
  • (3) The async/await keywords, introduced in Python 3.5

A coroutine has its own execution context. When it is suspended, for example at a yield, its running state is saved so that the context can be restored and execution resumed later.
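
For the third stage, a minimal sketch of the async/await form is given here using the standard asyncio module. The names worker and main_demo are hypothetical. Each await asyncio.sleep(0) suspends the coroutine and lets the event loop resume another one, much as yield did in the generator version.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        # Suspend here so that the event loop can run the other worker
        await asyncio.sleep(0)


async def main_demo():
    log = []
    await asyncio.gather(worker('a', 2, log), worker('b', 2, log))
    return log


log = asyncio.run(main_demo())
print(log)  # entries from 'a' and 'b' are interleaved
```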

2. Use greenlet to complete multiple tasks

Installation module:

pip install greenlet

Using greenlet:

from greenlet import greenlet
import time


def demo1():
    while True:
        print('---demo1---')
        gr2.switch()
        time.sleep(0.5)


def demo2():
    while True:
        print('---demo2---')
        gr1.switch()
        time.sleep(0.5)


gr1 = greenlet(demo1)
gr2 = greenlet(demo2)
gr1.switch()

Running it prints ---demo1--- and ---demo2--- alternately.

Coroutines are meant to switch tasks while the program would otherwise be waiting on IO, but with the greenlet module every switch has to be written by hand.

3. Use gevent to accomplish multiple tasks

Installation module:

pip install gevent

Try using gevent:

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


g1 = gevent.spawn(f1, 5)
g2 = gevent.spawn(f2, 5)
g3 = gevent.spawn(f3, 5)

g1.join()
g2.join()
g3.join()

Printing

<Greenlet at 0x25f41185378: f1(5)> 0
<Greenlet at 0x25f41185378: f1(5)> 1
<Greenlet at 0x25f41185378: f1(5)> 2
<Greenlet at 0x25f41185378: f1(5)> 3
<Greenlet at 0x25f41185378: f1(5)> 4
<Greenlet at 0x25f41185598: f2(5)> 0
<Greenlet at 0x25f41185598: f2(5)> 1
<Greenlet at 0x25f41185598: f2(5)> 2
<Greenlet at 0x25f41185598: f2(5)> 3
<Greenlet at 0x25f41185598: f2(5)> 4
<Greenlet at 0x25f411856a8: f3(5)> 0
<Greenlet at 0x25f411856a8: f3(5)> 1
<Greenlet at 0x25f411856a8: f3(5)> 2
<Greenlet at 0x25f411856a8: f3(5)> 3
<Greenlet at 0x25f411856a8: f3(5)> 4

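This clearly does not achieve multitasking: the three greenlets run one after another, because time.sleep() blocks the whole thread and never gives gevent a chance to switch.
An improvement, using gevent.sleep():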

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


g1 = gevent.spawn(f1, 5)
g2 = gevent.spawn(f2, 5)
g3 = gevent.spawn(f3, 5)

g1.join()
g2.join()
g3.join()

Printing

<Greenlet at 0x2518fda5378: f1(5)> 0
<Greenlet at 0x2518fda5598: f2(5)> 0
<Greenlet at 0x2518fda56a8: f3(5)> 0
<Greenlet at 0x2518fda5378: f1(5)> 1
<Greenlet at 0x2518fda5598: f2(5)> 1
<Greenlet at 0x2518fda56a8: f3(5)> 1
<Greenlet at 0x2518fda5378: f1(5)> 2
<Greenlet at 0x2518fda5598: f2(5)> 2
<Greenlet at 0x2518fda56a8: f3(5)> 2
<Greenlet at 0x2518fda5378: f1(5)> 3
<Greenlet at 0x2518fda5598: f2(5)> 3
<Greenlet at 0x2518fda56a8: f3(5)> 3
<Greenlet at 0x2518fda5378: f1(5)> 4
<Greenlet at 0x2518fda5598: f2(5)> 4
<Greenlet at 0x2518fda56a8: f3(5)> 4

Now the three greenlets take turns, so multitasking is achieved.
Test again, this time calling time.sleep(2) in the main program:

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
time.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

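Running it prints --1-- and --2--, pauses for two seconds, then prints --3-- and --4--; only after that do the three greenlets start and take turns.

time.sleep() does not hand control to gevent, so g1 does not start during the sleep; the greenlets only begin running once join() is called.
Changing it to gevent.sleep() gives a different result: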

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
gevent.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

This time f1 starts printing as soon as the main program reaches gevent.sleep(2), before g2 and g3 have been spawned.

In other words, gevent.sleep() hands control over to the waiting greenlets. In real programs, the role of gevent.sleep() is played by actual time-consuming IO operations.
If the code contains many blocking calls such as time.sleep(), there is no need to change each one to gevent.sleep() by hand; gevent's monkey module can patch them instead:

import gevent
import time
from gevent import monkey


# Replace the blocking calls used in the program with gevent's cooperative versions
monkey.patch_all()  # Apply the monkey patches


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
time.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

The result is the same as with gevent.sleep().

4. A simple gevent application

import gevent
from gevent import monkey
monkey.patch_all()
import requests


def download(url):
    print('to get:%s' % url)
    res = requests.get(url)
    data = res.text
    print('Got:', len(data), url)


g1 = gevent.spawn(download, 'http://www.baidu.com')
g2 = gevent.spawn(download, 'https://www.csdn.net/')
g3 = gevent.spawn(download, 'https://stackoverflow.com')

g1.join()
g2.join()
g3.join()

import requests must come after monkey.patch_all(), otherwise a warning is issued.
The code can be simplified further:

import gevent
from gevent import monkey
monkey.patch_all()
import requests


def download(url):
    print('to get:%s' % url)
    res = requests.get(url)
    data = res.text
    print('Got:', len(data), url)


gevent.joinall([
    gevent.spawn(download, 'http://www.baidu.com'),
    gevent.spawn(download, 'https://www.csdn.net/'),
    gevent.spawn(download, 'https://stackoverflow.com')
])

The result is the same as before.
Coroutines are concurrent rather than parallel, because a single thread carries out all the tasks.
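
The download() function above only prints. When the results are needed afterwards, the return value of each greenlet can be read from its value attribute once it has finished. The sketch below assumes gevent is installed, uses a hypothetical fetch function, and replaces the network request with gevent.sleep() so that it runs without network access.

```python
import gevent


def fetch(name, delay):
    # Stand-in for a network request; gevent.sleep() yields to other greenlets
    gevent.sleep(delay)
    return name.upper()


jobs = [
    gevent.spawn(fetch, 'a', 0.02),
    gevent.spawn(fetch, 'b', 0.01),
]
gevent.joinall(jobs, timeout=5)

# Results are read in the order the jobs were spawned,
# regardless of which one finished first
values = [job.value for job in jobs]
print(values)  # ['A', 'B']
```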

5. Comparing processes, threads and coroutines

  • A process is the unit of resource allocation;
  • A thread is the unit of operating system scheduling;
  • Switching between processes needs a lot of resources and is inefficient;
  • Switching between threads needs a moderate amount of resources and has moderate efficiency (leaving the GIL aside);
  • Switching between coroutines needs very few resources and is highly efficient;
  • Multiple processes and multiple threads may run in parallel, depending on the number of CPU cores, whereas coroutines all run inside one thread and are therefore concurrent rather than parallel.

Tags: Python pip Database socket

Posted on Fri, 07 Feb 2020 07:42:27 -0500 by giannis_athens