Chapter 3: Dictionaries and Sets

The dict type is not only widely used in all kinds of programs, it is also a cornerstone of the Python language. Dictionaries appear in module namespaces, instance attributes, and function keyword arguments. The built-in functions themselves live in a dictionary, __builtins__.__dict__.


All mapping types in the standard library are built on dict, so they share one limitation: only hashable objects can be used as keys (values have no such requirement).
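The hashability requirement is easy to check in an interpreter. The snippet below is a minimal sketch (the variable names are illustrative and not part of the original notes): a tuple of immutable items works as a key, a list does not, and an unhashable object is still fine as a value.

```python
# Keys must be hashable; values can be any object.
d = {}
d[(1, 2)] = ['a', 'list', 'is', 'a', 'valid', 'value']  # tuple key: accepted

try:
    d[[1, 2]] = 'x'  # list key: rejected because lists are unhashable
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # the exact wording depends on the Python version
```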
Dictionary comprehension

from collections.abc import Mapping, MutableMapping

my_dict = {}

print(isinstance(my_dict, Mapping))

# Dictionary comprehension
DIAL_CODES = [
    (86, 'China'),
    (91, 'India'),
    (1, 'United States'),
    (62, 'Indonesia'),
    (55, 'Brazil'),
    (92, 'Pakistan'),
    (880, 'Bangladesh'),
    (234, 'Nigeria'),
    (7, 'Russia'),
    (81, 'Japan'),
]

# Map upper-cased country names to dial codes, keeping only codes below 66
n_dic = {country.upper(): code for code, country in DIAL_CODES if code < 66}

print(n_dic)

Common mapping methods

Handling missing keys with setdefault

  • Python raises an exception when d[k] cannot find the key, which is in line with its "fail fast" philosophy. Most Python programmers know that d.get(k, default) can be used instead of d[k] to supply a default for a missing key, which is often more convenient than handling KeyError. However, when the goal is to update the value stored under a key, both __getitem__ and get are awkward and inefficient.
"""Create a mapping from a word to its occurrence"""
import sys
import re
WORD_RE = re.compile(r'\w+')
index = {}
with open(sys.argv[1], encoding='utf-8') as fp:
    for line_no, line in enumerate(fp, 1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start()+1
            location = (line_no, column_no)
            # A deliberately clumsy implementation, written only to illustrate the point
            occurrences = index.get(word, [])
            occurrences.append(location)
            index[word] = occurrences
# Print the results in alphabetical order
for word in sorted(index, key=str.upper):
    print(word, index[word])

# ===========================================================================

"""Create a mapping from a word to its occurrence""" 
import sys 
import re 
WORD_RE = re.compile(r'\w+') 
index = {} 
with open(sys.argv[1], encoding='utf-8') as fp: 
    for line_no, line in enumerate(fp, 1): 
        for match in WORD_RE.finditer(line): 
            word = match.group() 
            column_no = match.start()+1 
            location = (line_no, column_no) 
            index.setdefault(word, []).append(location)  # get the list for word (inserting [] if absent) and append to it
# Print the results in alphabetical order
for word in sorted(index, key=str.upper): 
    print(word, index[word])

# dict.setdefault modifies the original dictionary in place: it inserts the default when the key is missing and returns the stored value
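Put differently, a single setdefault call does the work of a membership test plus an assignment. The sketch below compares the two spellings on made-up data (the names pairs, index_a and index_b are illustrative only):

```python
pairs = [('a', 1), ('b', 2), ('a', 3)]

# Version 1: explicit check, two or three key lookups per item.
index_a = {}
for key, value in pairs:
    if key not in index_a:
        index_a[key] = []
    index_a[key].append(value)

# Version 2: setdefault returns the stored list, inserting [] first if needed.
index_b = {}
for key, value in pairs:
    index_b.setdefault(key, []).append(value)

print(index_a == index_b)  # True
print(index_b)             # {'a': [1, 3], 'b': [2]}
```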

Mappings with flexible key lookup

  • Sometimes it is convenient to get a default value when reading a key that does not exist in the mapping. There are two ways to achieve this. One is to use a defaultdict instead of a plain dict; the other is to define a subclass of dict and implement the __missing__ method in it.

defaultdict: another way to handle missing keys


 - defaultdict
 	When you create a collections.defaultdict, you configure how it produces a default value for a key that cannot be found.
 	Specifically, you pass the constructor a callable, and __getitem__ calls it whenever it meets a missing key so that it can return a default value.
 	-- For example, given dd = defaultdict(list), if the key 'new-key' is not in dd, the expression dd['new-key'] goes through the following steps:
(1) Call list() to create a new list.
(2) Insert that list into dd with 'new-key' as its key.
(3) Return a reference to that list.
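The three steps can be observed directly. This is a small sketch rather than part of the original notes; the variable name value is illustrative.

```python
from collections import defaultdict

dd = defaultdict(list)
print('new-key' in dd)         # False: nothing is stored yet

value = dd['new-key']          # steps (1)-(3): list() is called, stored, returned
print(value)                   # []
print('new-key' in dd)         # True: the lookup itself inserted the key
print(value is dd['new-key'])  # True: the returned object is the stored list

value.append(1)
print(dd['new-key'])           # [1]
```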

"""Create a mapping from a word to its occurrence"""
import sys
import re
import collections
WORD_RE = re.compile(r'\w+')
index = collections.defaultdict(list)
with open(sys.argv[1], encoding='utf-8') as fp:
    for line_no, line in enumerate(fp, 1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start()+1
            location = (line_no, column_no)
            index[word].append(location)

for word in sorted(index, key=str.upper):
    print(word, index[word])
'''
The list constructor is passed as default_factory to create the defaultdict.
If index has no entry for word, default_factory is called to produce a value for the missing key.
Here that value is an empty list, which is assigned to index[word] and then returned,
so the append(location) call always succeeds.
'''
# If no default_factory is given when creating a defaultdict, looking up a missing key raises KeyError.

'''
The mechanism that makes all this work is the special method __missing__. defaultdict's __missing__
calls default_factory when a key cannot be found. In fact, any mapping type can choose to support __missing__.
'''
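One consequence worth checking by hand: default_factory only runs for lookups that go through __getitem__. The sketch below (variable names are illustrative) shows that get and the in operator leave the defaultdict untouched, and that a defaultdict created without a factory behaves like a plain dict for missing keys.

```python
from collections import defaultdict

dd = defaultdict(list)

# get() and `in` do not go through __getitem__, so no default is created.
print(dd.get('a'))  # None
print('a' in dd)    # False
print(len(dd))      # 0

# d[k] goes through __getitem__, then __missing__, then default_factory.
print(dd['a'])      # []
print(len(dd))      # 1

# Without a default_factory, a missing key raises KeyError.
plain = defaultdict()
try:
    plain['a']
    raised = False
except KeyError:
    raised = True
print(raised)       # True
```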

The special method __missing__

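  • Mapping types deal with missing keys through the __missing__ method. The base dict class does not define it, but dict.__getitem__ calls it when a subclass provides one. __missing__ is called only by __getitem__ (for example, in the expression d[k]); methods such as get and __contains__ are not affected by it.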
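# MyDict converts non-string keys to str when looking them up

class MyDict(dict):
    def __missing__(self, key):
        # A str key that is missing really is absent; raising here also prevents infinite recursion
        if isinstance(key, str):
            raise KeyError(key)
        # Retry the lookup with the str form of the key
        return self[str(key)]

    def get(self, key, default=None):
        # Delegate to __getitem__ so that __missing__ gets a chance to run
        try:
            return self[key]
        except KeyError:
            return default

    def __contains__(self, key):
        # Search self.keys() rather than "key in self", which would call __contains__ recursively
        return key in self.keys() or str(key) in self.keys()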
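A short usage sketch of this kind of class follows. The class is repeated in compact form under the hypothetical name StrKeyDict so that the snippet runs on its own; the sample keys and values are made up.

```python
class StrKeyDict(dict):  # hypothetical name used only for this sketch
    def __missing__(self, key):
        if isinstance(key, str):
            raise KeyError(key)   # already a str: the key is really absent
        return self[str(key)]     # retry the lookup with the str form

    def get(self, key, default=None):
        try:
            return self[key]      # __getitem__ triggers __missing__ if needed
        except KeyError:
            return default

    def __contains__(self, key):
        return key in self.keys() or str(key) in self.keys()


d = StrKeyDict([('2', 'two'), ('4', 'four')])
print(d['2'])           # two
print(d[4])             # four (falls back to d['4'] through __missing__)
print(d.get(1, 'N/A'))  # N/A
print(2 in d)           # True
print(1 in d)           # False
```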

Variants of dict

  • collections.OrderedDict
    - This type keeps keys in insertion order, so the iteration order of the keys is always the same. By default, the popitem method of OrderedDict removes and returns the last item; when called as my_odict.popitem(last=False), it removes and returns the first item added.
  • collections.ChainMap
    - This type holds several mappings and searches them as one: a key lookup tries each mapping in turn until the key is found. This is useful when writing interpreters for languages with nested scopes, where one mapping represents each scope context. The ChainMap section of the collections documentation (https://docs.python.org/3/library/collections.html#collections.ChainMap) has several usage examples, including this snippet modeled on Python's variable lookup rules:
import builtins
from collections import ChainMap
pylookup = ChainMap(locals(), globals(), vars(builtins))
  • collections.Counter
    - This mapping type keeps an integer count for each key; updating an existing key adds to its count. It can be used to count hashable objects or as a multiset, that is, a set whose elements can appear more than once. Counter implements the + and - operators to combine tallies, and offers other useful methods such as most_common([n]), which returns the n most common keys and their counts in descending order of count. See the documentation for details (https://docs.python.org/3/library/collections.html#collections.Counter). The small example below uses Counter to count the letters in a word:
>>> import collections
>>> ct = collections.Counter('abracadabra')
>>> ct 
Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1}) 
>>> ct.update('aaaaazzz') 
>>> ct 
Counter({'a': 10, 'z': 3, 'b': 2, 'r': 2, 'c': 1, 'd': 1}) 
>>> ct.most_common(2) 
[('a', 10), ('z', 3)]
  • collections.UserDict
    - This class is essentially a pure-Python reimplementation of the standard dict. Unlike the other types above, it is meant to be subclassed.
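The behaviors described in the list above can be tried out with a short sketch. The data and the names od, defaults, overrides, settings, c1 and c2 are made up for illustration.

```python
from collections import ChainMap, Counter, OrderedDict

# OrderedDict.popitem: last item by default, first item with last=False.
od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
print(od.popitem())            # ('c', 3)
print(od.popitem(last=False))  # ('a', 1)

# ChainMap: a lookup searches the mappings from left to right.
defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}
settings = ChainMap(overrides, defaults)
print(settings['user'])        # admin
print(settings['color'])       # red

# Counter: + adds counts; - subtracts and drops results that are not positive.
c1 = Counter('aab')
c2 = Counter('abc')
print(c1 + c2)                 # Counter({'a': 3, 'b': 2, 'c': 1})
print(c1 - c2)                 # Counter({'a': 1})
```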

Immutable mappings
Set theory
Set literals
Set comprehensions
Set operations
dict and set under the hood
Hash tables in dictionaries
The implementation of dict and its consequences
The implementation of set and its consequences


Posted on Sun, 19 Sep 2021 11:50:31 -0400 by lprocks