Redis list [4] - the quicklist data structure

1. Overview

As mentioned earlier, before Redis 3.2 a list was stored as a ziplist as long as it held fewer than 512 elements and every element was a small integer or a string shorter than 64 bytes. All other lists were stored as a linkedlist.

However, a linked list carries relatively high overhead: the prev and next pointers alone take 16 bytes per node (a pointer is 8 bytes on a 64-bit system). In addition, each node is allocated separately, which aggravates memory fragmentation and makes memory management less efficient. Starting with Redis 3.2, the list implementation was changed to use quicklist in place of both ziplist and linkedlist.

2. Data structure

quicklist is a hybrid of ziplist and linkedlist. It is still a doubly linked list, but the list is split into segments. Each segment is a quicklist node that stores its data items compactly in a ziplist, and the quicklist nodes are chained together with prev and next pointers.
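To make the segmenting idea concrete, here is a minimal sketch of a segmented doubly linked list. It is not the Redis implementation: a small fixed int array stands in for the ziplist, and the names toy_node, toy_list, toy_push_tail, toy_free and the limit TOY_FILL are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_FILL 3  /* hypothetical per-node item limit */

/* One segment: a small contiguous array standing in for a ziplist. */
typedef struct toy_node {
    struct toy_node *prev, *next;
    int items[TOY_FILL];
    int used;               /* number of slots filled in this node */
} toy_node;

typedef struct toy_list {
    toy_node *head, *tail;
    unsigned long count;    /* total items across all nodes */
    unsigned long len;      /* number of nodes */
} toy_list;

/* Append a value, opening a new node only when the tail node is full.
 * Returns 0 on success, -1 if allocation fails. */
int toy_push_tail(toy_list *l, int value) {
    assert(l != NULL);
    if (l->tail == NULL || l->tail->used == TOY_FILL) {
        toy_node *n = calloc(1, sizeof(*n));
        if (n == NULL) return -1;
        n->prev = l->tail;
        if (l->tail != NULL) l->tail->next = n;
        else l->head = n;
        l->tail = n;
        l->len++;
    }
    l->tail->items[l->tail->used++] = value;
    l->count++;
    return 0;
}

/* Release every node and reset the list to empty. */
void toy_free(toy_list *l) {
    assert(l != NULL);
    toy_node *n = l->head;
    while (n != NULL) {
        toy_node *next = n->next;
        free(n);
        n = next;
    }
    l->head = l->tail = NULL;
    l->count = l->len = 0;
}
```

Pushing seven values into an empty toy_list with TOY_FILL set to 3 produces three nodes holding 3, 3 and 1 items, so only three allocations and three pairs of pointers are needed instead of seven.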

The structure is defined in quicklist.c and quicklist.h, whose header comments describe it as follows:

quicklist.c - A doubly linked list of ziplists
quicklist.h - A generic doubly linked quicklist implementation

2.2.1 Overall

/* quicklist is a 32 byte struct (on 64-bit systems) describing a quicklist.
 * 'count' is the number of total entries.
 * 'len' is the number of quicklist nodes.
 * 'compress' is: -1 if compression disabled, otherwise it's the number
 *                of quicklistNodes to leave uncompressed at ends of quicklist.
 * 'fill' is the user-requested (or default) fill factor. */
typedef struct quicklist {
    quicklistNode *head;        // Points to the head node of the quicklist
    quicklistNode *tail;        // Points to the tail node of the quicklist
    unsigned long count;        // Total number of data items across all ziplists
    unsigned int len;           // Number of quicklist nodes, i.e. the number of ziplists
    int fill : 16;              // Per-node ziplist size limit, set by list-max-ziplist-size
    unsigned int compress : 16; // Node compression depth, set by list-compress-depth; 0 means no compression
} quicklist;

Note that there are two counters here: count is the total number of data items, and len is the number of quicklist nodes. They differ because each node's ziplist can hold multiple data items.

The source comment states that this struct occupies 32 bytes on 64-bit systems. How is that computed? The answer involves bit fields. A bit field lets a struct member occupy a specified number of bits rather than a whole storage unit, and each such member is still accessed by name in the program. For example, "int fill : 16" means that fill uses only 16 bits instead of a full int. fill and compress therefore share a single 4-byte unit, and the total is 8 (head) + 8 (tail) + 8 (count) + 4 (len) + 4 (fill and compress) = 32 bytes.
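The size calculation can be checked with sizeof. The sketch below declares the struct with the same field order and, for comparison, a hypothetical variant named quicklist_wide without the bit-field widths. The helper names quicklist_size and quicklist_wide_size are illustrative. Bit-field packing is implementation-defined, so the figures in the comments assume a typical 64-bit LP64 platform with GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Only pointers to nodes are stored here, so a forward declaration is enough. */
typedef struct quicklistNode quicklistNode;

/* Same field order as the quicklist struct in the article. */
typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;
    unsigned int len;
    int fill : 16;              /* shares one 4-byte unit ... */
    unsigned int compress : 16; /* ... with this field */
} quicklist;

/* Hypothetical comparison: the same fields as full-width integers. */
typedef struct quicklist_wide {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;
    unsigned int len;
    int fill;
    unsigned int compress;
} quicklist_wide;

/* Typically 32 on LP64: 8 + 8 + 8 + 4 + 4. */
size_t quicklist_size(void) { return sizeof(quicklist); }

/* Typically 40 on LP64: 36 bytes of fields, padded to a multiple of 8. */
size_t quicklist_wide_size(void) { return sizeof(quicklist_wide); }
```

On such a platform the two 16-bit fields save one 4-byte member, which is enough to avoid the extra padding and keep the struct at 32 bytes.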

2.2.1.1 fill

This parameter can be positive or negative. A positive value limits the ziplist on each quicklist node by the number of data items. For example, a value of 3 means that the ziplist of each quicklist node can hold at most 3 data items. A negative value limits the ziplist on each quicklist node by the number of bytes it occupies. In that case only the values -1 to -5 are valid, with the following meanings:

-5: the ziplist on each quicklist node cannot exceed 64KB (1KB = 1024 bytes)
-4: the ziplist on each quicklist node cannot exceed 32KB
-3: the ziplist on each quicklist node cannot exceed 16KB
-2: the ziplist on each quicklist node cannot exceed 8KB (-2 is the Redis default)
-1: the ziplist on each quicklist node cannot exceed 4KB
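The two interpretations of the parameter can be sketched as a single check. The helper node_within_fill and the table fill_byte_limit are hypothetical names for illustration, not the Redis functions. The text does not describe a value of 0, so this sketch simply rejects it.

```c
#include <assert.h>
#include <stddef.h>

/* Byte limits for fill = -1, -2, -3, -4, -5, as listed above. */
static const size_t fill_byte_limit[] = {4096, 8192, 16384, 32768, 65536};

/* Hypothetical helper: returns 1 if a node whose ziplist holds `count` items
 * in `bytes` bytes is within the limit described by `fill`, 0 otherwise. */
int node_within_fill(int fill, size_t bytes, unsigned int count) {
    if (fill > 0) {
        /* Positive: limit by number of data items. */
        return count <= (unsigned int)fill;
    }
    if (fill == 0) {
        /* Not covered by the text; this sketch treats it as invalid. */
        return 0;
    }
    /* Negative: limit by bytes. Map -1 -> index 0, ..., -5 -> index 4. */
    size_t idx = (size_t)(-fill) - 1;
    assert(idx < sizeof(fill_byte_limit) / sizeof(fill_byte_limit[0]));
    return bytes <= fill_byte_limit[idx];
}
```

With the default of -2, a node of exactly 8192 bytes is still within the limit, while one of 8193 bytes is not, regardless of how many items it holds.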

Why the length matters:

The shorter the ziplist on each quicklist node, the more nodes are needed and the more memory fragmentation results. Many small fragments may be left that cannot be reused, which lowers storage efficiency. In the extreme case, the ziplist on each quicklist node holds a single data item, and the quicklist degenerates into an ordinary doubly linked list.
The longer the ziplist on each quicklist node, the harder it is to allocate a large contiguous block of memory for it. There may be plenty of free memory in total, scattered across many small pieces, with no single piece large enough for the ziplist. This also lowers storage efficiency. In the extreme case, the whole quicklist has a single node whose ziplist holds every data item, and the quicklist degenerates into a ziplist.

The ziplist on a quicklist node should therefore be kept at a reasonable length. What counts as reasonable depends on the workload, so Redis provides the configuration parameter list-max-ziplist-size to let users tune it for their actual application scenario.

2.2.1.2 list-compress-depth

This parameter sets the number of nodes at each end of the list that are left uncompressed:

0: special value meaning no compression at all
1: one node at each end of the quicklist is left uncompressed; the nodes in between are compressed
2: two nodes at each end of the quicklist are left uncompressed; the nodes in between are compressed
3: three nodes at each end of the quicklist are left uncompressed; the nodes in between are compressed

Rationale: when a list holds a large amount of data, the items most likely to be accessed are those at the two ends, while items in the middle are accessed less often (and access to them is slow in any case). For workloads that fit this pattern, the list can compress the nodes in the middle to save more memory. The Redis configuration parameter list-compress-depth controls this setting.
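The depth rule can be expressed as a small predicate over a node's position. The helper name node_is_compressed is hypothetical and the sketch models only the depth rule described above. Redis may leave a node uncompressed for other reasons that are not modeled here.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the node at 0-based position `index`,
 * in a quicklist of `len` nodes, falls in the compressed middle for the
 * given depth, and 0 if it is left uncompressed. */
int node_is_compressed(unsigned long index, unsigned long len,
                       unsigned int depth) {
    assert(index < len);
    if (depth == 0) return 0;               /* 0 disables compression */
    if (index < depth) return 0;            /* among the first `depth` nodes */
    if (len - 1 - index < depth) return 0;  /* among the last `depth` nodes */
    return 1;
}
```

For a quicklist of five nodes and a depth of 1, positions 0 and 4 stay uncompressed and positions 1 to 3 are compressed. With a depth of 3 the two uncompressed ends overlap, so no node in that list is compressed.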

2.2.2 Nodes

Each quicklist node is represented by a quicklistNode structure, defined in quicklist.h:

/* quicklistNode is a 32 byte struct describing a ziplist for a quicklist.
 * We use bit fields keep the quicklistNode at 32 bytes.
 * count: 16 bits, max 65536 (max zl bytes is 65k, so max count actually < 32k).
 * encoding: 2 bits, RAW=1, LZF=2.
 * container: 2 bits, NONE=1, ZIPLIST=2.
 * recompress: 1 bit, bool, true if node is temporarily decompressed for usage.
 * attempted_compress: 1 bit, boolean, used for verifying during testing.
 * extra: 10 bits, free for future use; pads out the remainder of 32 bits */
typedef struct quicklistNode {
    struct quicklistNode *prev;          // Points to the previous quicklist node
    struct quicklistNode *next;          // Points to the next quicklist node
    unsigned char *zl;                   // Data pointer: points to a ziplist if the node is not compressed, otherwise to a quicklistLZF
    unsigned int sz;                     // Total size in bytes of the ziplist that zl refers to
    unsigned int count : 16;             // Number of data items in the ziplist
    unsigned int encoding : 2;           // Encoding: RAW = 1, LZF = 2
    unsigned int container : 2;          // Container type: NONE = 1, ZIPLIST = 2
    unsigned int recompress : 1;         // Set to 1 when a compressed node has been decompressed temporarily for reading and must be compressed again
    unsigned int attempted_compress : 1; // Used in testing: whether compression was attempted
    unsigned int extra : 10;             // Reserved for future use
} quicklistNode;

The figure shows the two forms a node's ziplist can take: one compressed and one uncompressed.

2.2.3 quicklistLZF

/* quicklistLZF is a 4+N byte struct holding 'sz' followed by 'compressed'.
 * 'sz' is byte length of 'compressed' field.
 * 'compressed' is LZF data with total (compressed) length 'sz'
 * NOTE: uncompressed length is stored in quicklistNode->sz.
 * When quicklistNode->zl is compressed, node->zl points to a quicklistLZF */
typedef struct quicklistLZF {
    unsigned int sz;   // Number of bytes occupied after LZF compression
    char compressed[]; // Flexible array member holding the compressed ziplist bytes
} quicklistLZF;
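The "4+N byte" layout comes from the flexible array member: the header and the payload live in one allocation. The sketch below shows how such a struct can be allocated and filled. The helper lzf_wrap is hypothetical, and it only copies bytes that are assumed to be compressed already. No LZF compression is performed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct quicklistLZF {
    unsigned int sz;    /* byte length of `compressed` */
    char compressed[];  /* flexible array member: payload follows the header */
} quicklistLZF;

/* Hypothetical helper: wrap an already-compressed buffer of `len` bytes in a
 * quicklistLZF. Header and payload share one allocation, so a single free()
 * releases both. Returns NULL if allocation fails. */
quicklistLZF *lzf_wrap(const void *data, unsigned int len) {
    assert(data != NULL);
    quicklistLZF *lzf = malloc(sizeof(*lzf) + len);
    if (lzf == NULL) return NULL;
    lzf->sz = len;
    memcpy(lzf->compressed, data, len);
    return lzf;
}
```

Because a flexible array member contributes no size of its own, sizeof(quicklistLZF) covers only the sz header, and the payload length has to be added explicitly when allocating.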

3. Summary

quicklist combines the advantages of the doubly linked list and the ziplist and strikes a balance between time and space: it avoids the per-item pointer and allocation overhead of a plain linked list, while push and pop at either end still take constant time with respect to the length of the list.

Reference articles:

https://blog.csdn.net/u012748735/java/article/details/82792478
https://blog.csdn.net/harleylau/java/article/details/80534159