How does Python manage memory?

Memory management is a cornerstone of any systems program written in C/C++ and has to be handled with care.

The reference Python interpreter is written in C and has its own memory management scheme.

This article explores the implementation details of that scheme. We will see how the Python runtime organizes its memory, how the memory needed to create an object is allocated, and how garbage memory that is no longer needed is reclaimed.

This article is based on Python 2.7.


If you think of all the memory in a computer as a cake, then cutting a piece off that cake and handing it to a Python object means passing the request down through several layers, one at a time.

The bottom layer, layer -2 in the figure, is the raw physical storage, including primary and secondary storage. On top of it, layer -1 is the operating system, whose kernel is responsible for managing and allocating physical storage.

Next comes layer 0, which is responsible for requesting memory from the operating system; the malloc library of the C language is an example. All the memory the Python runtime needs is ultimately requested from the operating system through this layer. Everything above layer 0 is the Python runtime's own memory management.

The PyMem-related APIs provided in layer 1 mainly hide the differences between malloc/free on different platforms. For example, for malloc(0) some systems return NULL, while others return a non-NULL pointer that refers to no usable memory.

The solution is simple and crude: allocating 0 bytes is not allowed, the minimum is 1, so PyMem_MALLOC(0) is converted to malloc(1).

#define PyMem_MALLOC(n)        ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)    ((size_t)(n) > (size_t)PY_SSIZE_T_MAX  ? NULL \
                : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE        free
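
To see what the macro buys us, here is a minimal sketch of the same idea written as a function. The name portable_malloc is made up for this illustration and is not part of CPython, and PTRDIFF_MAX is only a stand-in for PY_SSIZE_T_MAX (an assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper mirroring PyMem_MALLOC: reject oversized requests
 * and turn a 0-byte request into a 1-byte one, so that a successful call
 * always yields a real pointer that can be passed to free(). */
static void *portable_malloc(size_t n)
{
    if (n > (size_t)PTRDIFF_MAX)    /* stand-in for PY_SSIZE_T_MAX */
        return NULL;
    return malloc(n ? n : 1);
}
```

With this wrapper, a NULL result can only mean "request too large" or "out of memory", never "the platform chose to return NULL for size 0".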

Layer 2 obtains memory from layer 1. How it organizes that memory, how it hands memory out to the various objects in layer 3, and how it reclaims garbage memory are the main concerns of this article.


A pool can be thought of as a cake 4k in size (the size of a system memory page). When memory is allocated to an object, a block is cut from a pool and handed to the object. The restrictions are:

  • Every piece cut from a given pool has the same fixed size, one of 8, 16, 24, ..., 256 bytes; each piece cut off is a block;
  • Requests for more than 256 bytes are not cut from a pool; they are requested from the operating system directly through malloc;
  • Requests for 0 bytes are not cut from a pool either; the 0 is changed to 1 and the memory is requested from the operating system directly through malloc;

In this way, for [1, 256] bytes, the relationship between the memory request and the actual allocated memory (block) is as follows:

Request in bytes     Size of allocated block      Size class idx
        1-8                     8                       0
        9-16                   16                       1
       17-24                   24                       2
       25-32                   32                       3
       33-40                   40                       4
       41-48                   48                       5
       49-56                   56                       6
       57-64                   64                       7
       65-72                   72                       8
        ...                   ...                     ...
      241-248                 248                      30
      249-256                 256                      31
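
The table can be reproduced with two one-line functions. This is a small sketch; the function names and MAX_SMALL_REQUEST are made up for the illustration, while the shift by 3 is the ALIGNMENT_SHIFT used in the macros shown later in this article.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT_SHIFT    3
#define MAX_SMALL_REQUEST  256   /* hypothetical name for the 256-byte limit */

/* Size class index for a request of nbytes (1..256), as in the table. */
static unsigned size_class_index(size_t nbytes)
{
    assert(nbytes >= 1 && nbytes <= MAX_SMALL_REQUEST);
    return (unsigned)(nbytes - 1) >> ALIGNMENT_SHIFT;
}

/* Block size actually handed out for a given size class index. */
static unsigned block_size(unsigned idx)
{
    return (idx + 1) << ALIGNMENT_SHIFT;
}
```

For example, a 20-byte request falls into size class 2 and receives a 24-byte block, so 4 bytes are lost to rounding.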

Now let's turn to a single pool. At any moment, a pool is in one of three states:

  • used: some blocks have been cut off, but there is still cake left;
  • full: the whole cake has been cut into blocks and handed out;
  • empty: nothing has been cut from the cake yet;

Python stores the block size and the current status of each pool at the beginning of the pool itself, in a pool_header structure defined as follows:

struct pool_header {
    union { block *_padding;
        uint count; } ref;              /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool                 */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};

Suppose a 4k chunk of memory is to be initialized as a pool that can only be cut into 16-byte blocks. Here is the initialization process:

/*
 * Initialize the pool header, set up the free list to
 * contain just the second block, and return the first block.
 */
pool->szidx = size;
size = INDEX2SIZE(size);
bp = (block *)pool + POOL_OVERHEAD;
pool->nextoffset = POOL_OVERHEAD + (size << 1);
pool->maxnextoffset = POOL_SIZE - size;
pool->freeblock = bp + size;
*(block **)(pool->freeblock) = NULL;
return (void *)bp;

The szidx field in pool_header stores the size class index. Since only 16-byte blocks can be cut here, the size passed in, which is the size class index, must be 1.

INDEX2SIZE is a macro that converts a size class index to the corresponding block size. Here it converts 1 to 16 and the result is assigned back to size, so size becomes 16.

#define ALIGNMENT_SHIFT         3
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)

POOL_OVERHEAD is also a macro; it rounds the memory used by pool_header up to a multiple of 8. The aligned position is where the allocatable blocks start. That position is stored in bp and finally returned as the result of the current allocation request.

#define ALIGNMENT               8
#define ALIGNMENT_MASK          (ALIGNMENT - 1)
#define ROUNDUP(x)              (((x) + ALIGNMENT_MASK) & ~ALIGNMENT_MASK)
#define POOL_OVERHEAD           ROUNDUP(sizeof(struct pool_header))
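
The arithmetic of the initialization can be checked with a few lines of code. This is only a sketch: the helper names are made up, and the header size of 48 bytes used in the example is an assumption that holds for the struct above on a typical 64-bit build (4 pointer-sized fields plus 4 uints).

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT        8
#define ALIGNMENT_MASK   (ALIGNMENT - 1)
#define ROUNDUP(x)       (((x) + ALIGNMENT_MASK) & ~ALIGNMENT_MASK)
#define POOL_SIZE        4096

/* Number of blocks of block_size bytes that fit in one pool once
 * `overhead` bytes (already rounded up) are reserved for the header. */
static unsigned blocks_per_pool(unsigned overhead, unsigned block_size)
{
    return (POOL_SIZE - overhead) / block_size;
}

/* Value of nextoffset right after pool initialization: the first block is
 * returned to the caller, the second becomes freeblock, so the third block
 * is the first one that is still untouched. */
static unsigned initial_nextoffset(unsigned overhead, unsigned block_size)
{
    return overhead + (block_size << 1);
}
```

With an overhead of 48 and 16-byte blocks, the pool holds 253 blocks, nextoffset starts at 80 and maxnextoffset is 4096 - 16 = 4080.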

nextoffset is the offset of the next virgin block in the pool, that is, a block that has never been allocated to any object (a block that is released after use goes onto the freeblock list instead). maxnextoffset is the offset of the last block in the pool. freeblock is initialized to the next available block, and through a pointer-to-pointer cast the first word of that block is set to NULL. When a block in the pool is recycled, it is added to freeblock through a similar pointer-to-pointer operation:

/* Recycle block pointed by p */
*(block **)p = lastfree = pool->freeblock;
pool->freeblock = (block *)p;

In this way, freeblock forms a special linked list.

A possible memory layout of a used pool at some point in time:


All pools in the used state are managed through usedpools, a special array. This is where szidx, introduced above, comes into play.

We already know that szidx ranges over [0, 31]. Python initializes a special array to hold all pools in the used state. The array is defined as follows:

#define PTA(x)  ((poolp )((uchar *)&(usedpools[2*(x)]) - 2*sizeof(block *)))
#define PT(x)   PTA(x), PTA(x)
static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
    PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
#if NB_SMALL_SIZE_CLASSES > 8
    , PT(8), PT(9), PT(10), PT(11), PT(12), PT(13), PT(14), PT(15)
#if NB_SMALL_SIZE_CLASSES > 16
    , PT(16), PT(17), PT(18), PT(19), PT(20), PT(21), PT(22), PT(23)
#if NB_SMALL_SIZE_CLASSES > 24
    , PT(24), PT(25), PT(26), PT(27), PT(28), PT(29), PT(30), PT(31)
#endif /* NB_SMALL_SIZE_CLASSES > 24 */
#endif /* NB_SMALL_SIZE_CLASSES > 16 */
#endif /* NB_SMALL_SIZE_CLASSES >  8 */
};

The design of this array is very clever. Referring to the figure, the following is the memory layout of usedpools after initialization:

Each szidx owns a pair of adjacent slots, usedpools[2 * szidx] and usedpools[2 * szidx + 1]. Both are initialized to the address 2 * sizeof(block *) bytes before the first slot of the pair, that is, two squares to the left in the figure.

To check whether there is a pool with szidx 1 in the used state, look at usedpools[1 + 1], which is p in the figure. If you treat p as a pool_header pointer, then p->nextpool is exactly the slot usedpools[1 + 1] itself, because the two fields that precede nextpool in pool_header,

union { block *_padding; uint count; } ref;
block *freeblock;

occupy exactly 2 * sizeof(block *) bytes, and the value stored in that slot is p! So p->nextpool and usedpools[1 + 1] are equal, which means there is no pool in the used state for szidx 1.

Suppose a pool with szidx 1 has just been obtained. Before its header is initialized, the first thing Python does is put it into usedpools:

    /* Frontlink to used pools. */
    next = usedpools[size + size]; /* == prev */
    pool->nextpool = next;
    pool->prevpool = next;
    next->nextpool = pool;
    next->prevpool = pool;
    pool->ref.count = 1;

The layout of usedpools is as follows:

At this point, with pool = usedpools[1 + 1], we have pool != pool->nextpool, which means a usable pool with szidx 1 exists.
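
The trick is easier to follow in its plain form first. The sketch below is not the usedpools layout itself: it uses a full sentinel node per list, which is the textbook version of the same idea, whereas usedpools shrinks each sentinel down to its two link fields to save space. All names here are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down pool header: only the list links. */
struct pool_link {
    struct pool_link *nextpool;
    struct pool_link *prevpool;
};

/* An empty list is a sentinel that points at itself in both directions. */
static void list_init(struct pool_link *sentinel)
{
    sentinel->nextpool = sentinel;
    sentinel->prevpool = sentinel;
}

static int list_is_empty(const struct pool_link *sentinel)
{
    return sentinel->nextpool == sentinel;
}

/* Link pool right after the sentinel; works for empty and non-empty lists. */
static void list_push_front(struct pool_link *sentinel, struct pool_link *pool)
{
    pool->nextpool = sentinel->nextpool;
    pool->prevpool = sentinel;
    sentinel->nextpool->prevpool = pool;
    sentinel->nextpool = pool;
}

/* Unlink pool; no special case for the first, last or only element. */
static void list_unlink(struct pool_link *pool)
{
    pool->prevpool->nextpool = pool->nextpool;
    pool->nextpool->prevpool = pool->prevpool;
}
```

Because the sentinel is always there, neither insertion nor removal ever has to test for NULL, which is what keeps the corresponding code in obmalloc.c down to four assignments.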

Now let's see how a memory request is served from a pool found through usedpools. This is also the most common case in memory allocation:

/* Convert the request for nbytes bytes to szidx */
size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
/* Find the corresponding slot in usedpools */
pool = usedpools[size + size];
/* The slot holds a usable pool only if it does not point back at itself */
if (pool != pool->nextpool) {
    /* One more block is allocated from this pool */
    ++pool->ref.count;
    bp = pool->freeblock;
    /* A pool in the used state must have a freeblock */
    assert(bp != NULL);
    /* If the freeblock list has more than one element, return the current
     * block and point freeblock to the next one */
    if ((pool->freeblock = *(block **)bp) != NULL) {
        return (void *)bp;
    }
    /* The freeblock list had only one element, which means no released
     * block is waiting for reuse. freeblock must now point to a new block,
     * so we open up the next piece of virgin land at nextoffset, after
     * checking that we have not gone past the last block */
    if (pool->nextoffset <= pool->maxnextoffset) {
        /* freeblock points to the virgin land */
        pool->freeblock = (block*)pool + pool->nextoffset;
        /* nextoffset moves to the next piece of virgin land */
        pool->nextoffset += INDEX2SIZE(size);
        *(block **)(pool->freeblock) = NULL;
        return (void *)bp;
    }
    /* Getting here means bp was the last available block. Once it is
     * allocated, the pool changes from used to full and has to be removed
     * from usedpools (if it was the only pool, the slot is back to its
     * initial state) */
    next = pool->nextpool;
    pool = pool->prevpool;
    next->prevpool = pool;
    pool->nextpool = next;
    return (void *)bp;
}
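
The interplay of freeblock and nextoffset can be tried out in isolation. The following is a simplified sketch of a single pool, not the code from obmalloc.c: the header is trimmed down, all names are made up, and initialization puts the first block on the free list instead of returning it.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE        4096
#define ALIGNMENT_SHIFT  3
#define INDEX2SIZE(i)    (((unsigned)(i) + 1) << ALIGNMENT_SHIFT)

typedef unsigned char block;

/* Hypothetical, trimmed-down header; the real pool_header has more fields. */
struct mini_pool {
    block   *freeblock;      /* head of the free list, NULL when full */
    unsigned count;          /* number of allocated blocks            */
    unsigned szidx;          /* size class index                      */
    unsigned nextoffset;     /* offset of the next virgin block       */
    unsigned maxnextoffset;  /* largest valid nextoffset              */
};

/* Header size rounded up to a multiple of 8. */
#define MINI_OVERHEAD ((unsigned)((sizeof(struct mini_pool) + 7) & ~(size_t)7))

/* One pool's worth of memory; the union gives it the header's alignment. */
union pool_memory {
    struct mini_pool header;
    block bytes[POOL_SIZE];
};

static void pool_init(struct mini_pool *pool, unsigned szidx)
{
    unsigned size = INDEX2SIZE(szidx);
    pool->count = 0;
    pool->szidx = szidx;
    pool->freeblock = (block *)pool + MINI_OVERHEAD;
    *(block **)pool->freeblock = NULL;          /* free list has one entry */
    pool->nextoffset = MINI_OVERHEAD + size;
    pool->maxnextoffset = POOL_SIZE - size;
}

static void *pool_alloc(struct mini_pool *pool)
{
    block *bp = pool->freeblock;
    if (bp == NULL)
        return NULL;                            /* pool is full */
    ++pool->count;
    pool->freeblock = *(block **)bp;            /* pop the list head */
    if (pool->freeblock == NULL && pool->nextoffset <= pool->maxnextoffset) {
        /* the list ran dry: open up the next virgin block */
        pool->freeblock = (block *)pool + pool->nextoffset;
        pool->nextoffset += INDEX2SIZE(pool->szidx);
        *(block **)pool->freeblock = NULL;
    }
    return bp;
}

static void pool_free(struct mini_pool *pool, void *p)
{
    *(block **)p = pool->freeblock;             /* push onto the list */
    pool->freeblock = (block *)p;
    --pool->count;
}
```

A released block is always reused before any virgin block is touched, because it sits at the head of the free list. The free list costs no extra memory either, since the link is stored inside the free block itself.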


If usedpools has no usable pool for the requested szidx, Python obtains a 4k chunk of memory, initializes it as a pool and puts it into usedpools.

So where does the 4k chunk come from? The answer is an arena.

In short, if a pool is a 4k cake, an arena is a 256k cake. When no pool exists in usedpools for some szidx, a 4k piece is cut from an arena, initialized as a pool and put into usedpools.

In more detail, an arena is just a struct with a pointer to 256k of memory. It is defined as follows:

/* Record keeping for arenas. */
struct arena_object {
    /* Unlike a pool, whose header lives inside its 4k of memory, an arena's
     * header is separate from the 256k of memory it manages. The address of
     * that 256k of memory is stored here */
    uptr address;
    /* Start address of the next virgin pool, similar to nextoffset in a pool */
    block* pool_address;
    /* Number of pools still available; initially at most 256k / 4k = 64 */
    uint nfreepools;
    /* Total number of pools in the arena, also at most 256k / 4k = 64 */
    uint ntotalpools;
    /* Singly linked list of all available pools, similar to freeblock in a pool */
    struct pool_header* freepools;
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};

No arena exists when the system starts. To avoid waste, the system requests only one 256k chunk of memory at a time and requests the next one only when the current one is used up. The arena_object headers, however, are initialized several at a time and are managed through arenas, a global arena_object pointer:

/* Array of objects used to track chunks of memory (arenas). */
static struct arena_object* arenas = NULL;

You can see that arenas is a pointer. Given the nextarena and prevarena fields in the definition of arena_object, you might think that arenas points to a doubly linked list, but it does not. arenas is actually an array.

nextarena links all unused arena_objects, that is, those with no 256k of memory attached, into a singly linked list whose head pointer is unused_arena_objects.

In addition, nextarena and prevarena together link the arena_objects that are in use and still have pools available into a doubly linked list whose head pointer is usable_arenas. The two head pointers are defined as follows:

static struct arena_object* unused_arena_objects = NULL;
static struct arena_object* usable_arenas = NULL;

There are also two related definitions:

/* Total number of arena_objects created so far */
static uint maxarenas = 0;
/* Number of arena_objects created the first time */
#define INITIAL_ARENA_OBJECTS 16

Now we can look at how arenas are initialized:

/* No unused arena_object is left: create a new batch */
if (unused_arena_objects == NULL) {
        uint i;
        uint numarenas;
        size_t nbytes;
        /* maxarenas == 0 means this is the first batch, which has 16 entries;
         * otherwise the array doubles in size */
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;
        /* Overflow checks omitted */
        /* Compute the memory needed for numarenas arena_objects and request it */
        nbytes = numarenas * sizeof(*arenas);
        /* Note that realloc is used, so the current array is extended */
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;
        /* Link the newly created arena_objects into unused_arena_objects */
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;              /* No memory allocated yet */
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }
        /* Update global variables */
        unused_arena_objects = &arenas[maxarenas];
        maxarenas = numarenas;
}
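
The growth policy can be sketched on its own with a stand-in type. Everything named slot or slots below is hypothetical; only the doubling rule and the initial count of 16 come from the code above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define INITIAL_SLOTS 16

/* Hypothetical stand-in for arena_object. */
struct slot {
    uintptr_t    address;   /* 0 means: no memory attached */
    struct slot *next;
};

static struct slot *slots = NULL;         /* the growing array        */
static struct slot *unused_slots = NULL;  /* free list of its entries */
static unsigned     maxslots = 0;

/* Double the array (or create it) and chain the new entries together.
 * Must only be called while unused_slots is NULL, because realloc may
 * move the array and would invalidate pointers kept in the list. */
static int grow_slots(void)
{
    unsigned i, n = maxslots ? maxslots << 1 : INITIAL_SLOTS;
    struct slot *p;

    assert(unused_slots == NULL);
    if (n <= maxslots)
        return -1;                        /* the count overflowed */
    p = realloc(slots, n * sizeof(*slots));
    if (p == NULL)
        return -1;
    slots = p;
    for (i = maxslots; i < n; ++i) {
        slots[i].address = 0;
        slots[i].next = (i < n - 1) ? &slots[i + 1] : NULL;
    }
    unused_slots = &slots[maxslots];
    maxslots = n;
    return 0;
}
```

Since realloc may move the whole array, anything that has to survive a growth step should refer to an entry by index rather than by pointer. That is presumably why pool_header stores arenaindex instead of a pointer to its arena_object.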

Suppose this is the first initialization. After the steps above, arenas is an array of 16 arena_objects and unused_arena_objects is a linked list containing all 16 of them. Next comes cutting the cake for an arena:

/* Take the first unused arena_object */
assert(unused_arena_objects != NULL);
arenaobj = unused_arena_objects;
unused_arena_objects = arenaobj->nextarena;
assert(arenaobj->address == 0);
/* Cut a 256k cake for this arena_object */
arenaobj->address = (uptr)malloc(ARENA_SIZE);
if (arenaobj->address == 0) {
    /* Memory allocation failed: put the arena_object back and give up */
    arenaobj->nextarena = unused_arena_objects;
    unused_arena_objects = arenaobj;
    return NULL;
}
/* Updating of global statistics omitted */
/* Set the initial state of the arena_object */
/* freepools holds pools that were used and then fully released,
 * so it is NULL at first */
arenaobj->freepools = NULL;
arenaobj->pool_address = (block*)arenaobj->address;
/* Number of pools that can still be cut out */
arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;
assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
/* If the 256k is not aligned to a pool boundary, skip ahead to the first
 * aligned address, which costs one pool */
excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
if (excess != 0) {
    --arenaobj->nfreepools;
    arenaobj->pool_address += POOL_SIZE - excess;
}
/* Total number of pools that can be cut out */
arenaobj->ntotalpools = arenaobj->nfreepools;
return arenaobj;

The two snippets above make up the logic of this function:

static struct arena_object* new_arena(void);
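
The alignment step at the end of new_arena is worth a closer look, because malloc does not promise an address on a 4k boundary. Here is a small sketch of that arithmetic. The result struct and function name are made up, and the sketch assumes, as obmalloc.c does as far as I can tell, that a misaligned arena gives up one pool.

```c
#include <assert.h>
#include <stdint.h>

#define ARENA_SIZE      (256 << 10)          /* 256k */
#define POOL_SIZE       (4 << 10)            /* 4k   */
#define POOL_SIZE_MASK  (POOL_SIZE - 1)

/* Hypothetical result type for this illustration. */
struct carve_result {
    uintptr_t first_pool;   /* first pool-aligned address in the arena */
    unsigned  npools;       /* whole pools that fit after alignment    */
};

static struct carve_result carve_arena(uintptr_t address)
{
    struct carve_result r;
    unsigned excess = (unsigned)(address & POOL_SIZE_MASK);

    r.first_pool = address;
    r.npools = ARENA_SIZE / POOL_SIZE;
    if (excess != 0) {
        /* the unaligned head and tail together cost one pool */
        --r.npools;
        r.first_pool += POOL_SIZE - excess;
    }
    return r;
}
```

For an arena starting at 0x100010, the first pool begins at 0x101000 and only 63 pools fit. Keeping every pool on a 4k boundary is what allows the allocator to find a block's pool header by masking off the low bits of the block's address.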

When a cake has to be cut for a pool and no arena is available, new_arena is called to initialize one:

/* No arena available */
if (usable_arenas == NULL) {
    /* Create a new arena */
    usable_arenas = new_arena();
    if (usable_arenas == NULL) {
        goto redirect;
    }
    usable_arenas->nextarena =
        usable_arenas->prevarena = NULL;
}
assert(usable_arenas->address != 0);

Cut 4k from the initialized arena and initialize it as a pool:

/* pool_address is the virgin land to turn into a pool */
pool = (poolp)usable_arenas->pool_address;
pool->arenaindex = usable_arenas - arenas;
assert(&arenas[pool->arenaindex] == usable_arenas);
/* The initial szidx of the pool is 0xffff */
pool->szidx = DUMMY_SIZE_IDX;
/* pool_address moves on to the next location */
usable_arenas->pool_address += POOL_SIZE;
/* The arena has one fewer available pool */
--usable_arenas->nfreepools;
goto init_pool;

After a pool has been cut off, if the number of remaining available pools, nfreepools, has dropped to 0, the arena is no longer usable. So after nfreepools is decremented, the following is performed:

if (usable_arenas->nfreepools == 0) {
    assert(usable_arenas->nextarena == NULL ||
           usable_arenas->nextarena->prevarena ==
           usable_arenas);
    /* Remove from the usable_arenas list */
    usable_arenas = usable_arenas->nextarena;
    if (usable_arenas != NULL) {
        usable_arenas->prevarena = NULL;
        assert(usable_arenas->address != 0);
    }
}

When all the blocks in a pool have been released, the pool becomes a free pool. Free pools are linked into a list whose head is the freepools field of the arena, which acts as a cache. This is very similar to freeblock. When a pool is requested from a usable arena, freepools is tried first:

/* Try to get a pool from freepools */
pool = usable_arenas->freepools;
if (pool != NULL) {
    /* freepools moves on to the next pool */
    usable_arenas->freepools = pool->nextpool;
    --usable_arenas->nfreepools;
    if (usable_arenas->nfreepools == 0) {
        /* The last pool has been allocated: the arena is full and must be
         * removed from usable_arenas */
        assert(usable_arenas->freepools == NULL);
        assert(usable_arenas->nextarena == NULL ||
               usable_arenas->nextarena->prevarena ==
               usable_arenas);
        usable_arenas = usable_arenas->nextarena;
        if (usable_arenas != NULL) {
            usable_arenas->prevarena = NULL;
            assert(usable_arenas->address != 0);
        }
    }
    else {
        /* nfreepools > 0: either freepools is not empty, or the arena
         * still has pools that have never been cut */
        assert(usable_arenas->freepools != NULL ||
               usable_arenas->pool_address <=
               (block*)usable_arenas->address +
                   ARENA_SIZE - POOL_SIZE);
    }
    /* ... the pool is then initialized (init_pool) ... */
}

Now let's take a look at the whole picture: the memory pool made up of blocks, pools and arenas in a Python program at some point in time.

Memory recycling

Everything in Python is an object. Each object has a refcnt field that records its reference count. When the reference count drops to zero, the object is reclaimed and its memory released (this is not strictly accurate, but it is not the point of this article, so let's go with it). The memory released here is not returned to the operating system immediately; it goes back to the pool.

Suppose we release the memory pointed to by p. The simplest case is that the pool is still in the used state after the block is returned. In that case, p only needs to be added to the pool's freeblock list:

assert(pool->ref.count > 0);            /* The pool is not empty */
/* Save the current freeblock in lastfree, then link p in front of it */
*(block **)p = lastfree = pool->freeblock;
pool->freeblock = (block *)p;
if (lastfree) {
    struct arena_object* ao;
    uint nf;  /* ao->nfreepools */
    /* The pool was in the used state before p was released. If ref.count
     * (after being decremented) is not 0, it is still in the used state.
     * This is the simplest case; nothing else has to change */
    if (--pool->ref.count != 0) {
        /* pool isn't empty:  leave it in usedpools */
        return;
    }
    /* ... otherwise the pool has become empty ... */
}

If lastfree was NULL before p was released, the pool was in the full state, and releasing p changes it from full to used. In this case the pool has to be put back into the matching slot of usedpools and linked into the list that lives there:

/* lastfree == NULL: the pool was full */
--pool->ref.count;
assert(pool->ref.count > 0);
size = pool->szidx;
next = usedpools[size + size];    /* Get the slot in usedpools */
prev = next->prevpool;
/* Insert pool before next: prev <-> pool <-> next */
pool->nextpool = next;
pool->prevpool = prev;
next->prevpool = pool;
prev->nextpool = pool;

Suppose only one block, p, has been allocated from the pool. Once p is released, the pool changes from used to empty. Python then removes the pool from usedpools and links it into the freepools list of its arena:

if (--pool->ref.count != 0) {
    /* pool isn't empty:  leave it in usedpools */
    return;
}
/* The pool has become empty: remove it from usedpools */
next = pool->nextpool;
prev = pool->prevpool;
next->prevpool = prev;
prev->nextpool = next;
/* Link it into freepools of its arena */
ao = &arenas[pool->arenaindex];
pool->nextpool = ao->freepools;
ao->freepools = pool;
nf = ++ao->nfreepools;

These are the three ways the state of a pool can change when an object's memory is released, and how each is handled.

In general, a block is returned to its pool, and when the pool becomes empty it is returned to its arena. Before Python 2.5, the memory release logic ended here.

But this hides a problem: the arena's memory is never returned to the operating system!

If an application needs a large amount of memory early on and far less later, the released memory goes back to the arenas but the arenas are never returned to the operating system, which looks very much like a memory leak!

Such cases are rare, but low-probability events do happen, and people have run into this one. Python later added arena management to the code above to solve the problem.

Now consider that returning a pool to its arena also changes the state of the arena:

  • From used to empty
  • From full to used
  • Used before the release and used after it
  • Nothing needs to be done
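
The four cases can be told apart from just three numbers. Below is a small sketch of that decision; the enum and function names are made up, and the order of the tests is how I read PyObject_Free, so treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>

enum arena_case {
    ARENA_BECAME_EMPTY = 1,   /* case 1: give the 256k back to the OS     */
    ARENA_WAS_FULL     = 2,   /* case 2: link it into usable_arenas       */
    ARENA_MUST_MOVE    = 3,   /* case 3: move it right in the sorted list */
    ARENA_NOTHING      = 4    /* case 4: list order is still correct      */
};

/* nf: nfreepools after the pool was returned; ntotal: ntotalpools;
 * next_nf: nfreepools of the next arena, or NULL if there is none. */
static enum arena_case classify(unsigned nf, unsigned ntotal,
                                const unsigned *next_nf)
{
    if (nf == ntotal)
        return ARENA_BECAME_EMPTY;
    if (nf == 1)
        return ARENA_WAS_FULL;
    if (next_nf == NULL || nf <= *next_nf)
        return ARENA_NOTHING;
    return ARENA_MUST_MOVE;
}
```

Note that nf == 1 identifies a previously full arena because a full arena has no free pools at all, so the pool just returned is its only one.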

In the first case, Python releases the memory occupied by the entire arena:

/* nfreepools == ntotalpools: the arena has become empty */
if (nf == ao->ntotalpools) {
    /* Case 1.  First unlink ao from usable_arenas.
     */
    assert(ao->prevarena == NULL ||
           ao->prevarena->address != 0);
    assert(ao ->nextarena == NULL ||
           ao->nextarena->address != 0);
    /* Fix the pointer in the prevarena, or the
     * usable_arenas pointer.
     */
    if (ao->prevarena == NULL) {
        usable_arenas = ao->nextarena;
        assert(usable_arenas == NULL ||
               usable_arenas->address != 0);
    } else {
        assert(ao->prevarena->nextarena == ao);
        ao->prevarena->nextarena =
            ao->nextarena;
    }
    /* Fix the pointer in the nextarena. */
    if (ao->nextarena != NULL) {
        assert(ao->nextarena->prevarena == ao);
        ao->nextarena->prevarena =
            ao->prevarena;
    }
    /* Record that this arena_object slot is
     * available to be reused.
     */
    ao->nextarena = unused_arena_objects;
    unused_arena_objects = ao;
    /* Free the entire arena. */
    free((void *)ao->address);
    ao->address = 0;                        /* mark unassociated */
    return;
}

In the second case, the arena is inserted directly at the head of usable_arenas:

/* nfreepools == 1: the arena has changed from full to used */
if (nf == 1) {
    ao->nextarena = usable_arenas;
    ao->prevarena = NULL;
    if (usable_arenas)
        usable_arenas->prevarena = ao;
    usable_arenas = ao;
    assert(usable_arenas->address != 0);
    return;
}

In the third case, the arena stays in usable_arenas because its state does not change. However, usable_arenas is kept sorted by nfreepools in ascending order.

Why do this?

Pools are always allocated from the arena at the head of usable_arenas. So the more free pools an arena has, the further back it sits and the less likely it is to be used; as its remaining pools are released, it is more likely to become completely empty and be returned to the operating system early.

Now we can see the general principle of Python memory management: do not hold on to memory, return it to the system as soon as possible.

When a pool is released, the arena's nfreepools increases. If it now exceeds the nfreepools of the arena to its right, the order has to be adjusted:

/* First, unlink ao from usable_arenas. There are two cases: ao is the
 * head node or it is not */
if (ao->prevarena != NULL) {
    /* ao is not the head node */
    assert(ao->prevarena->nextarena == ao);
    ao->prevarena->nextarena = ao->nextarena;
}
else {
    /* ao is the head node */
    assert(usable_arenas == ao);
    usable_arenas = ao->nextarena;
}
ao->nextarena->prevarena = ao->prevarena;
/* Walk the usable_arenas list to find where ao has to be inserted */
while (ao->nextarena != NULL &&
                nf > ao->nextarena->nfreepools) {
    ao->prevarena = ao->nextarena;
    ao->nextarena = ao->nextarena->nextarena;
}
/* Insert at the new location */
assert(ao->nextarena == NULL ||
    ao->prevarena == ao->nextarena->prevarena);
assert(ao->prevarena->nextarena == ao->nextarena);
ao->prevarena->nextarena = ao;
if (ao->nextarena != NULL)
    ao->nextarena->prevarena = ao;

The fourth case is a special case of the third. nfreepools has increased, but it still does not exceed the nfreepools of the arena on the right, so nothing needs to be done:

if (ao->nextarena == NULL || nf <= ao->nextarena->nfreepools) {
    /* Case 4.  Nothing to do. */
    return;
}
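
Cases 3 and 4 together amount to one step of an insertion sort on a doubly linked list. Here is a self-contained sketch with a stand-in struct; mini_arena and resort are names made up for the illustration, and the traversal is written with a separate prev pointer rather than by reusing ao's own links.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for arena_object: only what sorting needs. */
struct mini_arena {
    unsigned nfreepools;
    struct mini_arena *nextarena;
    struct mini_arena *prevarena;
};

/* Called after ao->nfreepools was incremented: slide ao to the right
 * until the list is ascending in nfreepools again. */
static void resort(struct mini_arena **head, struct mini_arena *ao)
{
    struct mini_arena *prev, *next = ao->nextarena;

    if (next == NULL || ao->nfreepools <= next->nfreepools)
        return;                                   /* case 4 */

    /* unlink ao */
    if (ao->prevarena != NULL)
        ao->prevarena->nextarena = next;
    else
        *head = next;
    next->prevarena = ao->prevarena;

    /* find the last node that has fewer free pools than ao */
    prev = next;
    while (prev->nextarena != NULL &&
           ao->nfreepools > prev->nextarena->nfreepools)
        prev = prev->nextarena;

    /* insert ao right after prev */
    ao->prevarena = prev;
    ao->nextarena = prev->nextarena;
    if (prev->nextarena != NULL)
        prev->nextarena->prevarena = ao;
    prev->nextarena = ao;
}
```

Since nfreepools grows by exactly one per released pool, an arena usually moves only a few positions, so keeping the list sorted this way is cheap in practice.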

Taken together, the code in this section is the logic of this function:

void PyObject_Free(void *p);


This article has analyzed in detail how the Python runtime organizes memory, how memory is allocated to objects and how it is reclaimed. The core concepts are block, pool, usedpools and arena.

The code snippets in this article come from the three most important functions in obmalloc.c:

void PyObject_Free(void *p);
void *PyObject_Malloc(size_t nbytes);
static struct arena_object* new_arena(void);

In Python 3, the maximum block size is increased to 512 bytes, so szidx goes up to 63. Apart from that, the organization, allocation and reclamation logic of the memory is essentially unchanged.


Posted on Wed, 03 Jun 2020 03:51:16 -0400 by Kingskin