An analysis of how Redis cache eviction works

Related configuration

To support its use as a cache, Redis implements cache eviction and provides the corresponding configuration options:

The maxmemory option sets the upper limit on memory usage. It should not be set to less than 1 MB.

The default value is 0, which means no memory limit on 64-bit systems; 32-bit builds use an implicit 3 GB limit.


Readers familiar with Redis will know that each database maintains two dictionaries:

  • db.dict: all key-value pairs in the database, also known as the keyspace of the database

  • db.expires: the keys that have a limited lifetime, together with their expiration time; it is also called the expire set.

When the maxmemory limit is reached, the maxmemory-policy option selects the strategy used to clean up the cache:

  • noeviction: when the memory limit is reached, commands that need more memory return an error; no data is overwritten or evicted
  • allkeys-lfu: evicts the least frequently used (LFU) keys in the entire keyspace (version 4.0 or later)
  • allkeys-lru: evicts the least recently used (LRU) keys in the entire keyspace
  • allkeys-random: evicts random keys in the entire keyspace
  • volatile-ttl: evicts the keys with the shortest remaining TTL in the expire set
  • volatile-lfu: evicts the least frequently used (LFU) keys in the expire set (version 4.0 or later)
  • volatile-lru: evicts the least recently used (LRU) keys in the expire set
  • volatile-random: evicts random keys in the expire set

When the expire set is empty, the volatile-* policies behave the same as noeviction.


To keep performance high, the LRU and LFU algorithms used in Redis are approximate implementations.

Simply put, when choosing a record to evict, the algorithm does not traverse all records. It picks a few records by random sampling and evicts the best candidate among them.

The maxmemory-samples option controls the number of samples taken in this process. Increasing the value costs more CPU, but brings the result closer to true LRU or LFU.
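The sampling idea can be illustrated with a minimal sketch. The entry_t type, the xorshift generator and the pick_victim_by_sampling function are hypothetical names invented for this illustration; Redis itself refines the idea with an eviction pool, described later in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry: only the field needed for LRU is shown. */
typedef struct {
    uint32_t last_access;   /* logical clock value of the last access */
} entry_t;

/* Small xorshift PRNG so the sketch does not depend on global rand() state. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Samples `samples` entries at random (with replacement) and returns the
 * index of the sampled entry with the oldest last_access.
 * Requires n > 0, samples > 0 and a non-zero *rng_state. */
size_t pick_victim_by_sampling(const entry_t *entries, size_t n,
                               size_t samples, uint32_t *rng_state) {
    size_t best = next_random(rng_state) % n;
    for (size_t i = 1; i < samples; i++) {
        size_t candidate = next_random(rng_state) % n;
        if (entries[candidate].last_access < entries[best].last_access)
            best = candidate;
    }
    return best;
}
```

The cost per eviction depends only on the number of samples, not on the number of keys, which is the trade-off that maxmemory-samples exposes.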


Evicting cache entries frees memory, but the deletion runs on the main thread and blocks the execution of other commands.

Deleting a huge record (such as a list with a very large number of elements) can cause performance problems and may even make the server appear to hang.

The lazy-free mechanism hands the memory release of such huge records to another thread for asynchronous processing, which improves the responsiveness of the system.

When the lazyfree-lazy-eviction option is turned on, memory usage may temporarily exceed the maxmemory limit.

Cache elimination mechanism

A complete cache elimination mechanism has to solve two problems:

Deciding which records to eliminate - elimination strategy

Deleting the eliminated records - delete policy

Elimination strategy

The memory available to a cache is limited. When space runs short, the cache should preferentially evict data that will not be accessed again and keep data that will be accessed frequently. Eviction algorithms are therefore designed around the principle of temporal locality: if a piece of data has just been accessed, it is likely to be accessed again in the near future.

Because a cache is read far more often than it is written, practical implementations store the cached data in a hash table. Implementing a specific eviction strategy on top of it requires an additional bookkeeping structure.

The three most common cache eviction strategies are reviewed below.

FIFO (first in first out)

The earlier a piece of data entered the cache, the less likely it is to be accessed again.

Therefore, when evicting, select the cache record that has stayed in memory the longest.

This policy can be implemented with a queue: new records are appended at the tail, and the record at the head is evicted first.

Advantages: simple to implement, suitable for linear access patterns

Disadvantages: cannot adapt to access hotspots, so the cache hit rate is poor

Bookkeeping cost: time O(1), space O(N)
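As a rough illustration of the queue-based bookkeeping, here is a minimal sketch using a circular buffer of integer keys. The fifo_cache_t type and the fifo_* functions are hypothetical names; the sketch assumes that callers never insert a key that is already cached, and it uses a linear scan for lookups where a real cache would use a hash table.

```c
#include <stddef.h>

#define FIFO_CAPACITY 3   /* illustrative capacity */

/* Hypothetical FIFO cache of integer keys, stored in a circular queue. */
typedef struct {
    int keys[FIFO_CAPACITY];
    size_t head;    /* index of the oldest key */
    size_t count;   /* number of keys currently cached */
} fifo_cache_t;

void fifo_init(fifo_cache_t *c) {
    c->head = 0;
    c->count = 0;
}

/* Returns 1 if key is cached. A hit does not change the queue order. */
int fifo_contains(const fifo_cache_t *c, int key) {
    for (size_t i = 0; i < c->count; i++)
        if (c->keys[(c->head + i) % FIFO_CAPACITY] == key) return 1;
    return 0;
}

/* Inserts key at the tail. If the cache is full, the oldest key is evicted
 * first, stored in *evicted, and the function returns 1; otherwise 0. */
int fifo_insert(fifo_cache_t *c, int key, int *evicted) {
    int did_evict = 0;
    if (c->count == FIFO_CAPACITY) {
        *evicted = c->keys[c->head];
        c->head = (c->head + 1) % FIFO_CAPACITY;
        c->count--;
        did_evict = 1;
    }
    c->keys[(c->head + c->count) % FIFO_CAPACITY] = key;
    c->count++;
    return did_evict;
}
```

Note that a hit never touches the queue, which is why FIFO bookkeeping is cheap but blind to hotspots.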

LRU (least recently used)

After a cache record is accessed, it is likely to be accessed again in the near future.

The last access time of each cache record can be tracked, and the record that has gone unaccessed the longest is evicted first.

This strategy can be implemented with a linked list ordered by last access time.

Updating the LRU information on a hit only requires adjusting a few pointers to move the record to the head of the list.

Advantages: simple to implement, and adapts to access hotspots

Disadvantages: sensitive to one-off accesses, which can hurt the hit rate

Bookkeeping cost: time O(1), space O(N)
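The linked-list bookkeeping can be sketched as follows. The lru_node_t and lru_list_t types and the lru_* functions are hypothetical names for this illustration; the hash table that would map a key to its node is left out.

```c
#include <stddef.h>

/* Hypothetical cache record linked into a doubly linked LRU list. */
typedef struct lru_node {
    int key;
    struct lru_node *prev, *next;
} lru_node_t;

typedef struct {
    lru_node_t *head;   /* most recently used */
    lru_node_t *tail;   /* least recently used: the next eviction victim */
} lru_list_t;

static void lru_unlink(lru_list_t *l, lru_node_t *n) {
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    n->prev = n->next = NULL;
}

/* Inserts an unlinked node at the head of the list. */
void lru_push_front(lru_list_t *l, lru_node_t *n) {
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
}

/* Called on a cache hit: a few pointer updates move the node to the head. */
void lru_touch(lru_list_t *l, lru_node_t *n) {
    if (l->head == n) return;
    lru_unlink(l, n);
    lru_push_front(l, n);
}

/* Removes and returns the least recently used node, or NULL if empty. */
lru_node_t *lru_evict(lru_list_t *l) {
    lru_node_t *victim = l->tail;
    if (victim) lru_unlink(l, victim);
    return victim;
}
```

Both the hit path and the eviction path are O(1), at the price of two pointers per record.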

LRU improvement

The original LRU algorithm caches any data that was accessed recently, even only once, so it cannot distinguish frequently referenced records from rarely referenced ones.

This means that unpopular, low-frequency data can enter the cache and push the existing hot records out of it.

To reduce the impact of one-off accesses on the cache, the later LRU-K algorithm makes the following improvements:

Add a history queue on top of the LRU bookkeeping

While a record has been accessed fewer than K times, it is tracked in the history queue (when the history queue is full, entries can be evicted from it with a FIFO or LRU strategy). Once a record has been accessed K times or more, it is removed from the history queue and recorded in the LRU cache. The larger K is, the higher the cache hit rate, but the worse the adaptability: it takes many accesses to evict hot records that have gone stale.

Weighing these factors, LRU-2 is the variant commonly used in practice.

Advantages: reduce the impact of accidental access on cache hit rate

Disadvantages: additional bookkeeping costs are required

Bookkeeping cost: time O(1), space O(N+M)
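The promotion rule of LRU-K can be shown with a deliberately simplified sketch. Everything here is an assumption made for illustration: keys are small integers, the history queue is reduced to a per-key access counter, the LRU cache is reduced to a membership flag, and eviction from either structure is omitted.

```c
#include <string.h>

#define LRUK_K        2    /* promotion threshold: LRU-2 */
#define LRUK_MAX_KEYS 16   /* illustrative: keys are integers 0..15 */

/* Hypothetical LRU-K admission filter. */
typedef struct {
    unsigned char history[LRUK_MAX_KEYS];  /* accesses seen while not cached */
    unsigned char cached[LRUK_MAX_KEYS];   /* 1 if promoted to the LRU cache */
} lruk_t;

void lruk_init(lruk_t *c) {
    memset(c, 0, sizeof(*c));
}

/* Records one access to key. Returns 1 if the key is in the LRU cache after
 * this access, 0 if it is still only tracked in the history. */
int lruk_access(lruk_t *c, int key) {
    if (c->cached[key]) return 1;        /* hit: LRU bookkeeping would go here */
    if (++c->history[key] >= LRUK_K) {   /* K-th access: promote to the cache */
        c->history[key] = 0;
        c->cached[key] = 1;
        return 1;
    }
    return 0;
}
```

A record accessed only once never reaches the LRU cache, which is how the history queue protects hot records from one-off accesses.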

LFU (least frequently used)

The more frequently a cache record has been accessed recently, the more likely it is to be accessed again.

The recent access frequency of each cache record can be tracked, and records with a low access frequency are evicted first.

A simple way to implement LFU is to keep an access counter in each cache record and place the records in a min-heap, so that the least frequently used record is always at the root.

To keep the data fresh, the counters should be decayed at regular intervals so that records that are no longer hot can be evicted in time.
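A minimal sketch of such a counter heap with periodic decay is shown below. The lfu_item_t type and the lfu_* functions are hypothetical, and halving the counters is just one possible decay rule chosen for the illustration. A real implementation would also need a map from key to heap index.

```c
#include <stddef.h>

/* Hypothetical LFU record: a key and its access counter. */
typedef struct {
    int key;
    unsigned count;
} lfu_item_t;

static void lfu_swap(lfu_item_t *a, lfu_item_t *b) {
    lfu_item_t t = *a;
    *a = *b;
    *b = t;
}

/* Restores the min-heap property below index i. */
static void lfu_sift_down(lfu_item_t *heap, size_t n, size_t i) {
    for (;;) {
        size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && heap[l].count < heap[smallest].count) smallest = l;
        if (r < n && heap[r].count < heap[smallest].count) smallest = r;
        if (smallest == i) return;
        lfu_swap(&heap[i], &heap[smallest]);
        i = smallest;
    }
}

/* Records a hit on the item at heap index i. The counter only grows, so the
 * item can only move down in a min-heap. */
void lfu_hit(lfu_item_t *heap, size_t n, size_t i) {
    heap[i].count++;
    lfu_sift_down(heap, n, i);
}

/* Periodic decay: halving every counter keeps the heap order intact,
 * because a <= b implies a/2 <= b/2. */
void lfu_decay(lfu_item_t *heap, size_t n) {
    for (size_t i = 0; i < n; i++) heap[i].count /= 2;
}

/* The eviction victim is always the root (the heap must not be empty). */
int lfu_victim_key(const lfu_item_t *heap) {
    return heap[0].key;
}
```

Each hit costs O(log N) in this scheme, compared with O(1) for the LRU list.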

Delete policy

Common deletion strategies can be divided into the following types:

  • Real-time deletion:

    Each time a new record is added, look for a record to evict right away; if one exists, delete it from the cache

    • Advantages: memory is reclaimed promptly, so this approach uses the least memory
    • Disadvantages: searching for the victim slows down writes, and an additional bookkeeping structure (such as the linked list in LRU) is needed to make the search efficient
  • Lazy deletion:

    Two counters are kept in the cache: one counts accesses to the cache, the other counts obsolete records
    After N accesses, or once the number of obsolete records exceeds M, a batch deletion is triggered (N and M are tunable)

    • Advantages: little impact on normal cache operations, and batching reduces maintenance overhead
    • Disadvantages: memory is reclaimed late, and the occasional batch deletion causes fluctuations in access time
  • Asynchronous deletion:

    A dedicated timer thread triggers a batch deletion at a fixed interval

    • Advantages: transparent to normal cache operations, with no extra overhead on the request path

    • Disadvantages: an extra maintenance thread is needed, and the cache load must be planned in advance to decide how to schedule it across multiple cache instances
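The two-counter trigger used by lazy deletion can be sketched as follows. The thresholds, the lazy_state_t type and the lazy_* functions are hypothetical; the actual batch deletion is replaced by a placeholder that only resets the counters.

```c
/* Hypothetical thresholds for lazy (batched) deletion. */
#define BATCH_ACCESS_LIMIT   100   /* N: accesses between forced sweeps */
#define BATCH_OBSOLETE_LIMIT 10    /* M: obsolete records tolerated */

typedef struct {
    unsigned accesses;   /* accesses since the last sweep */
    unsigned obsolete;   /* records currently marked obsolete */
    unsigned sweeps;     /* number of batch deletions performed so far */
} lazy_state_t;

/* Placeholder for the real batch deletion: it only resets the counters. */
static void lazy_sweep(lazy_state_t *s) {
    s->accesses = 0;
    s->obsolete = 0;
    s->sweeps++;
}

/* Call on every cache access. newly_obsolete is the number of records that
 * became obsolete because of this access. Returns 1 if a sweep ran. */
int lazy_on_access(lazy_state_t *s, unsigned newly_obsolete) {
    s->accesses++;
    s->obsolete += newly_obsolete;
    if (s->accesses >= BATCH_ACCESS_LIMIT ||
        s->obsolete > BATCH_OBSOLETE_LIMIT) {
        lazy_sweep(s);
        return 1;
    }
    return 0;
}
```

Most accesses pay only for two counter updates; the cost of deletion is concentrated in the occasional access that triggers the sweep, which explains the latency fluctuations mentioned above.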

Redis implementation

Redis implements both the LRU and the LFU eviction strategies.

To save space, Redis does not use the bookkeeping structures described above to implement LRU or LFU. Instead, it uses a 24-bit field in robj to record access information:

#define LRU_BITS 24

typedef struct redisObject {
    unsigned lru:LRU_BITS;  /* LRU Time (relative to global lru_clock) or
                             * LFU Data (8 bits record access frequency, 16 bits record access time) */
} robj;

Whenever a record is hit, Redis updates robj.lru, which later serves as the basis for the eviction algorithm:

robj *lookupKey(redisDb *db, robj *key, int flags) {
    // ...

    // Choose the update routine according to maxmemory_policy
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        updateLFU(val);
    } else {
        val->lru = LRU_CLOCK();
    }

    // ...
}

The actual work is done by the updateLFU function and the LRU_CLOCK macro, both of which are analyzed below.

Update LRU time

When an LRU policy is in use, robj.lru records the timestamp of the last access, which makes it possible to find records that have not been accessed for a long time.

To reduce system calls, Redis keeps a global clock, server.lruclock, which is refreshed by a background task:

#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1) /* Max value of obj->lru */
#define LRU_CLOCK_RESOLUTION 1000 /* LRU clock resolution in milliseconds */

/*
 * server.lruclock is refreshed every 1000/server.hz milliseconds.
 * If that period is no longer than the LRU clock resolution, use server.lruclock directly
 * and avoid the extra overhead of calling getLRUClock().
 */
#define LRU_CLOCK() ((1000/server.hz <= LRU_CLOCK_RESOLUTION) ? server.lruclock : getLRUClock())

unsigned int getLRUClock(void) {
    return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}

The idle time of an object is then computed as follows:

unsigned long long estimateObjectIdleTime(robj *o) {
    unsigned long long lruclock = LRU_CLOCK();
    if (lruclock >= o->lru) {
        return (lruclock - o->lru) * LRU_CLOCK_RESOLUTION;
    } else {
        // Handle wraparound of the LRU clock
        return (lruclock + (LRU_CLOCK_MAX - o->lru)) *
                    LRU_CLOCK_RESOLUTION;
    }
}

With LRU_CLOCK_RESOLUTION set to 1000 ms, the longest idle time that robj.lru can record is about 194 days (0xFFFFFF / 3600 / 24 ≈ 194).
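The wraparound arithmetic and the 194-day figure can be checked with a small standalone sketch. It copies the arithmetic of estimateObjectIdleTime, but takes the clock values as plain parameters; the SKETCH_* macros and both function names are made up for this example.

```c
#include <stdint.h>

#define SKETCH_LRU_BITS 24
#define SKETCH_LRU_CLOCK_MAX ((1u << SKETCH_LRU_BITS) - 1)
#define SKETCH_LRU_CLOCK_RESOLUTION 1000u   /* milliseconds per clock tick */

/* Idle time in milliseconds, given the current 24-bit clock value and the
 * value stored in the object. Handles a single wraparound of the clock. */
uint64_t idle_time_ms(uint32_t now_clock, uint32_t obj_lru) {
    uint64_t ticks;
    if (now_clock >= obj_lru)
        ticks = now_clock - obj_lru;
    else
        ticks = (uint64_t)now_clock + (SKETCH_LRU_CLOCK_MAX - obj_lru);
    return ticks * SKETCH_LRU_CLOCK_RESOLUTION;
}

/* Longest representable idle time in whole days, at one tick per second. */
unsigned max_idle_days(void) {
    return SKETCH_LRU_CLOCK_MAX / 3600 / 24;
}
```

An object that stays idle for longer than one full clock cycle is indistinguishable from a much younger one, which is an accepted limitation of a 24-bit timestamp.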

Update LFU count

When an LFU policy is in use, robj.lru is split into two parts: 16 bits record the last access time and 8 bits serve as a counter.
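The 24-bit layout can be made concrete with a few helper functions. These helpers are hypothetical and do not exist in Redis; they only mirror the shifts and masks used in the Redis excerpts of this section, with the timestamp in the high 16 bits and the counter in the low 8 bits.

```c
#include <stdint.h>

/* Packs a 16-bit minute timestamp and an 8-bit counter into 24 bits:
 * timestamp in the high bits, counter in the low bits. */
uint32_t lfu_pack(uint16_t minutes, uint8_t counter) {
    return ((uint32_t)minutes << 8) | counter;
}

/* Extracts the 16-bit minute timestamp. */
uint16_t lfu_unpack_minutes(uint32_t lru) {
    return (uint16_t)((lru >> 8) & 0xFFFF);
}

/* Extracts the 8-bit logarithmic counter. */
uint8_t lfu_unpack_counter(uint32_t lru) {
    return (uint8_t)(lru & 0xFF);
}
```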

void updateLFU(robj *val) {
    unsigned long counter = LFUDecrAndReturn(val); // Decay the counter
    counter = LFULogIncr(counter); // Increment the counter
    val->lru = (LFUGetTimeInMinutes()<<8) | counter; // Store the new time and counter
}

Update access time

The high 16 bits store the time of the last access:

/*
 * Get the UNIX timestamp in minutes, keeping only the lowest 16 bits.
 * It represents the last decrement time (LDT).
 */
unsigned long LFUGetTimeInMinutes(void) {
    return (server.unixtime/60) & 65535;
}

Increase access count

The low 8 bits hold a logarithmic counter, which stores roughly the logarithm of the number of accesses:

#define LFU_INIT_VAL 5

// Logarithmically increment the counter; the maximum value is 255
uint8_t LFULogIncr(uint8_t counter) {
    if (counter == 255) return 255;
    double r = (double)rand()/RAND_MAX;
    double baseval = counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    if (r < p) counter++;
    return counter;
}

With server.lfu_log_factor = 10, the increment probability is p = 1 / ((counter - LFU_INIT_VAL) * server.lfu_log_factor + 1), which falls as counter grows.

The value r = rand() / RAND_MAX is a floating-point number uniformly distributed between 0 and 1, so the increment succeeds with probability p. As counter grows, the probability of a successful increment drops quickly.

The number of accesses required for the counter to saturate (255) depends on lfu_log_factor: the larger the factor, the more accesses are needed.
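The effect of the factor can be estimated directly from the increment probability: a step that succeeds with probability p takes 1/p accesses on average, that is, baseval * factor + 1 accesses. The sketch below sums this over all steps. The expected_hits function and the SKETCH_LFU_INIT_VAL macro are hypothetical names, and the estimate ignores counter decay.

```c
/* Same value as LFU_INIT_VAL in the Redis excerpt. */
#define SKETCH_LFU_INIT_VAL 5

/* Expected number of accesses for the counter to grow from `from` to `to`
 * (from <= to <= 255). Each step succeeds with probability
 * p = 1 / (baseval * factor + 1), so it needs 1/p accesses on average.
 * Counter decay is ignored. */
unsigned long long expected_hits(unsigned from, unsigned to, unsigned factor) {
    unsigned long long total = 0;
    for (unsigned c = from; c < to; c++) {
        unsigned baseval = c > SKETCH_LFU_INIT_VAL ? c - SKETCH_LFU_INIT_VAL : 0;
        total += (unsigned long long)baseval * factor + 1;
    }
    return total;
}
```

For example, expected_hits(5, 255, 10) evaluates to 311500, so with a factor of 10 a counter starting at LFU_INIT_VAL needs on the order of three hundred thousand accesses to saturate, while with a factor of 0 it needs only 250.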

Decay the access count

Likewise, to make sure that hot data that has gone stale can be evicted in time, Redis uses the following decay function:

// Compute the time elapsed since the last decay, in minutes
unsigned long LFUTimeElapsed(unsigned long ldt) {
    unsigned long now = LFUGetTimeInMinutes();
    if (now >= ldt) return now-ldt;
    return 65535-ldt+now;
}

/*
 * Decay function: returns the LFU counter after decay, based on the LDT timestamp.
 * It does not update the counter stored in the object.
 */
unsigned long LFUDecrAndReturn(robj *o) {
    unsigned long ldt = o->lru >> 8;
    unsigned long counter = o->lru & 255;
    /*
     * server.lfu_decay_time controls how fast the counter decays:
     * the access count is decremented by 1 for every server.lfu_decay_time minutes elapsed.
     * The default value is 1.
     */
    unsigned long num_periods = server.lfu_decay_time ? LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
    if (num_periods)
        counter = (num_periods > counter) ? 0 : counter - num_periods;
    return counter;
}

A 16-bit minute counter covers about 45 days (65536 / 60 / 24 ≈ 45.5), so the LDT timestamp wraps around roughly every 45 days.
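The decay arithmetic can be tried out in isolation with the sketch below. It follows the same formulas as LFUTimeElapsed and LFUDecrAndReturn, but takes the clock and the configuration as plain parameters; the function names are invented for this example.

```c
#include <stdint.h>

/* Minutes elapsed since ldt on a 16-bit minute clock, using the same
 * wraparound formula as LFUTimeElapsed. */
uint32_t lfu_minutes_elapsed(uint16_t now, uint16_t ldt) {
    if (now >= ldt) return (uint32_t)(now - ldt);
    return 65535u - ldt + now;
}

/* Counter value after decay: one decrement per decay_time minutes elapsed,
 * never going below 0. A decay_time of 0 disables decay. */
uint8_t lfu_decayed(uint8_t counter, uint32_t elapsed, uint32_t decay_time) {
    uint32_t periods = decay_time ? elapsed / decay_time : 0;
    return periods > counter ? 0 : (uint8_t)(counter - periods);
}

/* Whole days before the 16-bit minute clock wraps around. */
unsigned lfu_wrap_days(void) {
    return 65536u / 60 / 24;
}
```

With a decay time of 1 minute, even a saturated counter of 255 drops to 0 after a little over four hours without access.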

Performing the deletion

Whenever a client issues a command, Redis checks, before executing it, whether memory usage exceeds maxmemory. If it does, Redis tries to evict data according to maxmemory_policy:

// Main entry point for command processing. Before a command is actually executed, various checks are performed, including the handling of OOM:
int processCommand(client *c) {

    // ...

    // When maxmemory is set, try to free memory (evict) if necessary
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = (performEvictions() == EVICT_FAIL);
        // ...

        // If freeing memory failed and the command about to run is not allowed under OOM (generally write commands)
        if (out_of_memory && reject_cmd_on_oom) {
            rejectCommand(c, shared.oomerr); // Return an OOM error to the client
            return C_OK;
        }
    }

    // ...
}

The actual deletion is performed by the performEvictions function:

int performEvictions(void) {
    // ...

    // Loop, trying to free enough memory
    while (mem_freed < (long long)mem_tofree) {
        // ...

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            /*
             * Redis uses approximate LRU / LFU algorithms:
             * instead of traversing all records to pick a victim, it samples records.
             * EvictionPoolLRU temporarily holds the sampled entries that should be evicted first.
             */
            struct evictionPoolEntry *pool = EvictionPoolLRU;
            // According to the configured maxmemory-policy, find a bestkey that can be released
            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                // Traverse all db instances
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    // Select the set to sample from (keyspace or expire set) according to the policy
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        // Sample and fill the pool
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                // ...

                // Traverse the pool from the best candidate down, looking for a key to evict
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict, pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires, pool[k].key);
                    }

                    // Remove the entry from the pool
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    if (de) {
                        // Extract the key of the record
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }
        // ...

        // Finally, a bestkey has been selected
        if (bestkey) {
            // ...

            // If lazyfree-lazy-eviction is configured, try asynchronous deletion
            if (server.lazyfree_lazy_eviction) {
                // ...
            }
            // ...
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    // ...
}

Sampling is handled by the evictionPoolPopulate function:

#define EVPOOL_SIZE 16
struct evictionPoolEntry {
    unsigned long long idle;    /* LRU idle time / inverse of LFU frequency (entries with larger values are evicted first) */
    sds key;                    /* Key name of the eviction candidate */
    sds cached;                 /* Cached SDS object for the key name */
    int dbid;                   /* Database ID */
};
// The eviction pool array, used to assist eviction operations
static struct evictionPoolEntry *EvictionPoolLRU;

/*
 * Sample the given sampledict
 * and record the entries that should be evicted in the eviction pool
 */
void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    // Randomly fetch up to maxmemory_samples entries from sampledict
    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);

    // Traverse the samples
    for (j = 0; j < count; j++) {
        // ...

        // Compute the sample's idle value according to maxmemory_policy
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            idle = 255-LFUDecrAndReturn(o);
        } else {
            // ...
        }

        // Find the sample's position in the pool (entries are sorted by ascending idle)
        k = 0;
        while (k < EVPOOL_SIZE && pool[k].key && pool[k].idle < idle) k++;
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            // The pool is full and the sample's idle value is smaller than every entry: skip it
            continue;
        } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            // The slot for the sample is empty, so it can be inserted directly
        } else {
            // The slot for the sample is occupied: shift other entries to make room
            // ...
        }

        // ...
        // Insert the sample at position k
        int klen = sdslen(key);
        if (klen > EVPOOL_CACHED_SDS_SIZE) {
            pool[k].key = sdsdup(key);
        } else {
            // If the key length does not exceed EVPOOL_CACHED_SDS_SIZE, the cached sds object is reused
            // ...
        }
        pool[k].idle = idle;
        pool[k].dbid = dbid;
    }
}
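The sorted insertion into the pool, including the shifting that the excerpt elides, can be sketched in a self-contained form. This is a simplified illustration, not the Redis code: pool_entry_t, pool_insert and SKETCH_POOL_SIZE are hypothetical, keys are plain integers, and a used flag replaces the NULL key check.

```c
#include <string.h>

#define SKETCH_POOL_SIZE 4   /* illustrative; the excerpt uses EVPOOL_SIZE 16 */

/* Hypothetical simplified pool entry. */
typedef struct {
    unsigned long long idle;
    int key;
    int used;   /* 0 marks an empty slot */
} pool_entry_t;

/* Inserts (key, idle) while keeping the pool sorted by ascending idle, with
 * used slots packed to the left. Returns 1 if inserted, 0 if the pool is
 * full and idle is not larger than the smallest entry. */
int pool_insert(pool_entry_t *pool, int key, unsigned long long idle) {
    int k = 0;
    while (k < SKETCH_POOL_SIZE && pool[k].used && pool[k].idle < idle) k++;

    if (k == 0 && pool[SKETCH_POOL_SIZE - 1].used) {
        return 0;   /* worse candidate than everything already in the pool */
    } else if (k < SKETCH_POOL_SIZE && !pool[k].used) {
        /* Empty slot: insert in place. */
    } else if (!pool[SKETCH_POOL_SIZE - 1].used) {
        /* Free space on the right: shift entries from k one slot right. */
        memmove(pool + k + 1, pool + k,
                sizeof(pool[0]) * (SKETCH_POOL_SIZE - k - 1));
    } else {
        /* Pool full: drop the smallest entry and shift the rest left. */
        k--;
        memmove(pool, pool + 1, sizeof(pool[0]) * k);
    }
    pool[k].idle = idle;
    pool[k].key = key;
    pool[k].used = 1;
    return 1;
}
```

Because the pool is sorted in ascending order of idle, the best eviction candidate always sits at the right end, which is why the eviction loop walks the pool from the last index down.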

Tags: Database Redis Cache

Posted on Sat, 09 Oct 2021 03:32:48 -0400 by mukunthan