Python notes: Chapter 1 of python3-cookbook (data structures and algorithms)

Each section of python3-cookbook takes one kind of problem and works through it in three parts (problem, solution, and discussion), showing the best way to solve it in Python 3 or how to make better use of Python 3's data structures, functions, classes, and other features. The book is a great help for deepening your command of Python 3 and improving your programming skills, especially when it comes to the performance of Python programs. If you have the time, it is strongly recommended reading.

These are study notes, so this article covers only the parts of the book that are relevant to my own work and everyday use. Most of the sample code is pasted directly from the book, and all of it has been verified under Python 3.6. If you are interested, please read the full text.

python3-cookbook: https://python3-cookbook.readthedocs.io/zh_CN/latest/index.html

 

1.2 unpacking iterables into multiple variables

This section is about the star expression (*name). When you want to assign some of the elements of an iterable to a variable, especially when the number of those elements is not known in advance, the star expression is a good fit.

Scenario 1: ordinary unpacking assignment. Note that the starred variable is always a list, even when it receives zero elements.

>>> record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
>>> name, email, *phone_numbers = record
>>> name
'Dave'
>>> email
'dave@example.com'
>>> phone_numbers
['773-555-1212', '847-555-1212']
>>> name, email, phone_number1, phone_number2, *others = record
>>> others
[]
>>> 

 

Scenario 2: when the items being iterated over are sequences of varying length, the star expression is just as convenient.

records = [
    ('foo', 1, 2),
    ('bar', 'hello'),
    ('foo', 3, 4),
]


def do_foo(x, y):
    print('foo', x, y)


def do_bar(s):
    print('bar', s)


for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)
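
The starred name does not have to come last. As a minimal sketch (the grades list and the passwd-style line are sample data made up for illustration, not part of the examples above), it can sit in the middle of the targets or be used to throw away the fields you do not need:

```python
# Hypothetical sample data, used only for this illustration.
grades = [52, 91, 78, 85, 60, 99]

# The starred name can sit in the middle: it absorbs everything between
# the first and last elements and is always a list.
first, *middle, last = grades

# A common idiom: split a sequence into its head and the remaining tail.
head, *tail = grades

# Use a throwaway name such as _ for the parts you do not need.
line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
uname, *_, homedir, shell = line.split(':')

print(first, middle, last)
print(head, tail)
print(uname, homedir, shell)
```

Only one starred name is allowed per unpacking target list, so the split between the fixed and variable parts is never ambiguous.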

 

 

1.3 retain the last N elements

When iterating or doing other processing, if you want to keep a fixed number of the most recent elements, consider collections.deque. Given a maxlen argument, it builds a fixed-length queue: once the queue is full, appending a new element at one end automatically discards the element at the opposite end. A deque also has methods for adding and popping elements at both ends. A list can do the same job, but appending or popping at either end of a deque is O(1), whereas inserting or removing at the front of a list is O(N). Which one to use depends on the situation and on personal preference; this section simply offers another solution worth considering.

>>> from collections import deque
>>> q = deque(maxlen=3)
>>> q.append(1)
>>> q.append(2)
>>> q.append(3)
>>> q
deque([1, 2, 3], maxlen=3)
>>> q.append(4)
>>> q
deque([2, 3, 4], maxlen=3)
>>> q.appendleft(5)
>>> q
deque([5, 2, 3], maxlen=3)
>>> q.pop()
3
>>> q
deque([5, 2], maxlen=3)
>>> q.popleft()
5
>>> q
deque([2], maxlen=3)
>>> 
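
A typical use of a bounded deque is keeping a short history while scanning input. The following is only a sketch under my own assumptions: the search() helper, its parameters, and the sample text are illustrative names, not code verified from the book.

```python
from collections import deque


def search(lines, pattern, history=3):
    """Yield (line, previous_lines) for every line containing pattern."""
    previous_lines = deque(maxlen=history)  # keeps only the last `history` lines
    for line in lines:
        if pattern in line:
            # Copy to a list so later appends do not change what was yielded.
            yield line, list(previous_lines)
        previous_lines.append(line)


# Hypothetical input; any iterable of strings (such as an open file) works.
text = ['alpha', 'beta', 'gamma', 'python one', 'delta', 'python two']
matches = list(search(text, 'python', history=2))
for line, context in matches:
    print(context, '->', line)
```

Because the deque drops old lines by itself, the loop never has to trim the history manually.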

 

 

1.4 find the largest or smallest N elements

To find the N largest or smallest elements in a large collection when N is small relative to the size of the collection, consider the nlargest and nsmallest functions in the heapq module. If N is close to the size of the collection, it is usually better to sort first and then slice, as in sorted(items)[:N].

heapq can also arrange a list as a heap, in which the first element is always the smallest. If you need to take the smallest element out of a collection repeatedly, consider heapq.heapify() together with heapq.heappop(). If you only need the single largest or smallest element, use max() or min() directly.

>>> import heapq
>>> nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
>>> heapq.nlargest(3, nums)
[42, 37, 23]
>>> heapq.nsmallest(3, nums)
[-4, 1, 2]
>>> portfolio = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'FB', 'shares': 200, 'price': 21.09},
    {'name': 'HPQ', 'shares': 35, 'price': 31.75},
    {'name': 'YHOO', 'shares': 45, 'price': 16.35},
    {'name': 'ACME', 'shares': 75, 'price': 115.65}
]
>>> heapq.nlargest(3, portfolio, key=lambda s: s['price'])
[{'name': 'AAPL', 'shares': 50, 'price': 543.22}, {'name': 'ACME', 'shares': 75, 'price': 115.65}, {'name': 'IBM', 'shares': 100, 'price': 91.1}]
>>> 
>>> heap = list(nums)
>>> heapq.heapify(heap)  # Rearrange the list in place into heap order
>>> heap
[-4, 2, 1, 23, 7, 2, 18, 23, 42, 37, 8]
>>> heapq.heappop(heap)
-4
>>> heapq.heappop(heap)
1
>>> heapq.heappop(heap)
2
>>> 
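
To make the alternatives mentioned above concrete, here is a small sketch that reuses the nums list and checks that the heap helpers, sort-then-slice, and min()/max() agree. The variable names are mine; which approach is fastest depends on N and the size of the data, so this only shows that the results match.

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
n = 3

# For the smallest n items, the heap-based helper and sort-then-slice agree.
smallest_heap = heapq.nsmallest(n, nums)
smallest_sorted = sorted(nums)[:n]

# For the largest n items, sort in descending order before slicing.
largest_heap = heapq.nlargest(n, nums)
largest_sorted = sorted(nums, reverse=True)[:n]

# For a single extreme value, min() and max() are the simplest choice.
lowest, highest = min(nums), max(nums)

# A heap keeps its smallest item at index 0, so popping repeatedly
# yields the items in ascending order.
heap = list(nums)
heapq.heapify(heap)
in_order = [heapq.heappop(heap) for _ in range(len(heap))]
```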

 

 

1.8 calculating with dictionaries

To find the largest or smallest value in a dictionary together with its key, first use zip() to invert the dictionary into (value, key) pairs.

>>> prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}
>>> max(zip(prices.values(), prices.keys()))
(612.78, 'AAPL')
>>> min(zip(prices.values(), prices.keys()))
(10.75, 'FB')
>>> 
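
A few related variations, sketched with the same prices dictionary (the variable names are my own): sorted() ranks every entry by value, min()/max() with key=prices.get return only the key, and the iterator returned by zip() can be consumed just once.

```python
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75,
}

# sorted() on the (value, key) pairs ranks the whole dictionary by value.
ranked = sorted(zip(prices.values(), prices.keys()))

# Alternative: ask min()/max() for the key whose value is extreme,
# then look the value up separately if it is needed.
cheapest = min(prices, key=prices.get)
dearest = max(prices, key=prices.get)

# zip() returns an iterator that can be consumed only once.
pairs = zip(prices.values(), prices.keys())
lowest = min(pairs)
leftover = list(pairs)  # the iterator is already exhausted here
```

If you need both min() and max() from the zipped pairs, call zip() again for each one rather than reusing the same iterator.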

 

 

1.11 named slices

Programs often slice with hard-coded indices, which makes the code hard to maintain later. In such cases, consider naming the slice with the built-in slice() function; a slice object can be used anywhere a subscript slice is allowed.

>>> items = [0, 1, 2, 3, 4, 5, 6]
>>> a = slice(2, 4)
>>> items[2:4]
[2, 3]
>>> items[a]
[2, 3]
>>> items[a] = [10, 11]
>>> items
[0, 1, 10, 11, 4, 5, 6]
>>> a.start
2
>>> a.stop
4
>>> a.step
>>> 
>>> record = '....................100 .......513.25 ..........'
>>> cost = int(record[20:23]) * float(record[31:37])  # Hard-coded slice indices: not recommended
>>> cost
51325.0
>>> SHARES = slice(20, 23)
>>> PRICE = slice(31, 37)
>>> cost = int(record[SHARES]) * float(record[PRICE])
>>> cost
51325.0
>>> 
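
A slice object can also be fitted to a sequence of a particular length with its indices() method. This is a short sketch with a sample string of my own choosing: the stop value 50 is deliberately larger than the string, and indices() clips it.

```python
s = 'HelloWorld'
a = slice(5, 50, 2)

# indices(length) returns a (start, stop, step) tuple clipped to the length,
# so the values are safe to use with range() on a sequence of that size.
start, stop, step = a.indices(len(s))

picked = [s[i] for i in range(start, stop, step)]
print(start, stop, step)
print(picked)
```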

 

 

1.12 the most frequent elements in a sequence

To count how many times each element occurs in a sequence, you could keep a tally in a dictionary by hand, but the best choice is collections.Counter, which is designed for exactly this kind of problem. A Counter is in fact a dictionary underneath (it is a subclass of dict); see the sample code for how to use it.

>>> from collections import Counter
>>> words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
    'my', 'eyes', "you're", 'under'
]
>>> word_counts = Counter(words)
>>> word_counts.most_common(3)  # 3 words with the most occurrences
[('eyes', 8), ('the', 5), ('look', 4)]
>>> word_counts['eyes']
8
>>> morewords = ['why','are','you','not','looking','in','my','eyes']
>>> for word in morewords:
...     word_counts[word] += 1  # A Counter can be updated like a dictionary
...
>>> word_counts['eyes']
9
>>> word_counts.update(morewords)  # Or use the update() method to add the counts
>>> a = Counter(words)
>>> b = Counter(morewords)
>>> c = a + b  # Counters can be combined with the + and - operators
>>> c
Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2, 'around': 2, "don't": 1, "you're": 1, 'under': 1, 'why': 1, 'are': 1, 'you': 1, 'looking': 1, 'in': 1})
>>> 
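
For comparison, this sketch puts the manual dictionary tally next to Counter and also shows subtraction and missing keys. The shortened word list and variable names are mine, chosen only to keep the example small.

```python
from collections import Counter

words = ['look', 'into', 'my', 'eyes', 'look', 'eyes']

# Manual tally with a plain dictionary.
manual = {}
for word in words:
    manual[word] = manual.get(word, 0) + 1

# Counter produces the same tallies in one call.
counted = Counter(words)

# Subtraction with the - operator keeps only the positive counts.
remaining = counted - Counter(['look', 'eyes', 'eyes', 'eyes'])

# A missing key reads as 0 instead of raising KeyError,
# and reading it does not add the key to the Counter.
missing = counted['absent']
```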

 

 

1.13 sorting a list of dictionaries by a common key

To sort a list whose elements are dictionaries, or to apply similar functions such as max or min to it, you can usually pass a lambda expression as the key. If performance matters, however, operator.itemgetter is recommended instead, since it typically runs a little faster.

If the elements being sorted are class instances rather than dictionaries, use operator.attrgetter, which is covered in section 1.14. Its usage is similar to that of itemgetter, so the book's sample code is not pasted here.

>>> from operator import itemgetter
>>> rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]
>>> sorted(rows, key=itemgetter('fname'))
[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004}, {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}, {'fname': 'David', 'lname': 'Beazley', 'uid': 1002}, {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}]
>>> sorted(rows, key=itemgetter('lname','fname'))
[{'fname': 'David', 'lname': 'Beazley', 'uid': 1002}, {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}, {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}, {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}]
>>> 
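
For the attrgetter case, here is a minimal sketch of my own rather than the book's code; the User class, its fields, and the sample users are hypothetical.

```python
from operator import attrgetter


class User:
    """Hypothetical record type used only for this illustration."""

    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name

    def __repr__(self):
        return 'User({}, {!r})'.format(self.user_id, self.name)


users = [User(23, 'Carol'), User(3, 'Zed'), User(99, 'Bob'), User(3, 'Alice')]

# Sort by one attribute; sorted() is stable, so ties keep their input order.
by_id = sorted(users, key=attrgetter('user_id'))

# Several attribute names give a tuple key, just like itemgetter.
by_id_then_name = sorted(users, key=attrgetter('user_id', 'name'))

# The same key function works with min() and max().
lowest = min(users, key=attrgetter('user_id'))
```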

 

 

1.20 merging multiple dictionaries or mappings

When you want to look up keys or values across several dictionaries, you can use collections.ChainMap, which logically combines them into a single mapping without copying. If the same key appears in more than one dictionary, a lookup returns the value from the first dictionary that contains it. Note that changes made through the ChainMap (assigning or deleting keys) always act on the first dictionary in the chain.

>>> from collections import ChainMap
>>> a = {'x': 1, 'z': 3 }
>>> b = {'y': 2, 'z': 4 }
>>> c = ChainMap(a,b)
>>> c['x']
1
>>> c['y']
2
>>> c['z']
3
>>> len(c)
3
>>> list(c.keys())
['x', 'z', 'y']
>>> list(c.values())
[1, 3, 2]
>>> c['z']
3
>>> c['z'] = 10
>>> c['z']
10
>>> c['w'] = 40
>>> c['w']
40
>>> a
{'x': 1, 'z': 10, 'w': 40}
>>> 
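
The write behaviour is easy to get wrong, so here is a small sketch (variable names are mine) showing that assignments and deletions touch only the first dictionary, while the ChainMap still reflects later changes to the underlying dictionaries.

```python
from collections import ChainMap

a = {'x': 1, 'z': 3}
b = {'y': 2, 'z': 4}
c = ChainMap(a, b)

# Lookups search the mappings in order, so a's 'z' hides b's 'z'.
z_seen = c['z']

# Writes go to the first mapping even when the key lives only in b.
c['y'] = 20
a_after_write = dict(a)

# Deleting works only on keys held by the first mapping.
try:
    del c['y']  # removes the 'y' that was just written to a
    del c['y']  # 'y' is now only in b, so this raises KeyError
    deleted_twice = True
except KeyError:
    deleted_twice = False

# ChainMap keeps references, so changes to b show through immediately.
b['y'] = 200
y_seen = c['y']
```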


Posted on Wed, 19 Feb 2020 08:48:31 -0500 by idevlin