JUC learning - in-depth analysis of ThreadPoolExecutor

1, Understanding of ThreadPoolExecutor

ThreadPoolExecutor is the core implementation class of the thread pool in Java. This chapter analyzes the core design and implementation of the Java thread pool based on the JDK 1.8 source code. Let's first look at the UML class diagram of ThreadPoolExecutor to understand its inheritance hierarchy.

The top-level interface implemented by ThreadPoolExecutor is Executor, which embodies one idea: decouple task submission from task execution. Users do not need to care how threads are created or how they are scheduled to run tasks. They only need to provide a Runnable object that carries the task logic and hand it to the Executor; the Executor framework takes care of allocating threads and executing the task. The ExecutorService interface adds two capabilities:
(1) it extends task execution with methods that produce a Future for a single asynchronous task or for a batch of them; (2) it provides methods to control the thread pool, such as shutting it down.
AbstractExecutorService is the abstract class one layer up. It chains the task-execution steps together so that the concrete class below only has to implement a single method that executes a task. The concrete class at the bottom, ThreadPoolExecutor, implements the most complex part. On the one hand it maintains its own life cycle; on the other hand it manages both threads and tasks, and combines the two to execute tasks in parallel.
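
To make the layering concrete, here is a minimal sketch of the three ways of handing work to a pool: execute() from Executor, and submit() and invokeAll() from ExecutorService. The class name ExecutorLayersDemo and the helper sumAll are made up for this illustration; the pool comes from Executors.newFixedThreadPool, which returns a ThreadPoolExecutor behind the ExecutorService interface.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ExecutorLayersDemo {
    /** Hypothetical helper: runs a batch through invokeAll and sums the results. */
    static int sumAll(ExecutorService pool, List<Callable<Integer>> tasks) throws Exception {
        int sum = 0;
        // invokeAll blocks until every task has completed, then returns one Future per task
        for (Future<Integer> f : pool.invokeAll(tasks)) {
            sum += f.get();
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Executor: submit a Runnable and forget about it
            pool.execute(() -> System.out.println("ran on " + Thread.currentThread().getName()));

            // ExecutorService: one asynchronous task that yields a Future
            Future<Integer> single = pool.submit(() -> 21 * 2);
            System.out.println("single = " + single.get());

            // ExecutorService: a batch of asynchronous tasks
            List<Callable<Integer>> batch = Arrays.asList(() -> 1, () -> 2, () -> 3);
            System.out.println("batch sum = " + sumAll(pool, batch));
        } finally {
            // ExecutorService: control over the pool itself
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.SECONDS);
        }
    }
}
```

In none of the three calls does the caller create or schedule a thread, which is the decoupling the Executor interface is meant to provide.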

How does ThreadPoolExecutor maintain threads and execute tasks at the same time? Its operating mechanism is shown in the figure below:

Internally, the thread pool is built as a producer-consumer model. Threads and tasks are decoupled rather than bound to each other directly, which makes it possible to buffer tasks and reuse threads. The operation of the thread pool is divided into two parts: task management and thread management. The task management part acts as the producer. After a task is submitted, the thread pool decides what happens to it next: (1) a thread is requested directly to execute the task; (2) the task is buffered in the queue until a thread picks it up; (3) the task is rejected. The thread management part acts as the consumer. Threads are maintained centrally in the pool and allocated as tasks arrive. When a thread finishes a task, it goes on to fetch the next one. When a thread can no longer obtain a task, it is recycled.
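
The producer-consumer idea can be sketched in a few dozen lines. The class below is not how ThreadPoolExecutor is implemented; TinyPool and all of its members are hypothetical names, and it has a fixed number of threads and no life cycle beyond a crude stop(). It only shows the two halves: submit() produces into a bounded queue (or rejects), and worker threads consume from it in a loop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TinyPool {
    private final BlockingQueue<Runnable> queue;
    private final Thread[] workers;

    TinyPool(int threads, int queueCapacity) {
        queue = new ArrayBlockingQueue<>(queueCapacity);
        workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(this::consume, "tiny-worker-" + i);
            workers[i].start();
        }
    }

    /** Producer side: buffers the task, or returns false (rejects it) when the queue is full. */
    boolean submit(Runnable task) {
        return queue.offer(task);
    }

    /** Consumer side: each worker keeps taking tasks until it is interrupted. */
    private void consume() {
        try {
            while (true) {
                queue.take().run();
            }
        } catch (InterruptedException e) {
            // Interrupted while waiting for work: the worker exits
        }
    }

    /** Interrupts all workers so that they exit once they wait for the next task. */
    void stop() {
        for (Thread t : workers) {
            t.interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TinyPool pool = new TinyPool(2, 8);
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(5);
        for (int i = 0; i < 5; i++) {
            pool.submit(() -> { done.incrementAndGet(); latch.countDown(); });
        }
        latch.await(1, TimeUnit.SECONDS);
        pool.stop();
        System.out.println("tasks run: " + done.get());
    }
}
```

Two threads execute five tasks here, so threads are reused across tasks and the queue absorbs the difference between submission rate and execution rate.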

Let's interpret the thread pool from the source code.

2, Source code analysis of thread pool

1. Life cycle management of thread pool

The running state of the thread pool is not set explicitly by the user; it is maintained internally as the pool runs. The thread pool uses a single variable to hold two values: the run state (runState) and the number of worker threads (workerCount).

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

The code above shows the states of the thread pool and the field ctl, which holds both the pool state and the number of live threads. According to the official comments and the source code, ctl is an AtomicInteger that packs two conceptual fields: workerCount, the effective number of threads, and runState, which indicates whether the pool is running, shutting down, and so on. The upper 3 bits store runState and the lower 29 bits store workerCount. Keeping both values in one variable means that any decision depending on both always sees a consistent pair, and no lock is needed just to keep the two in sync.
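
The packing is easy to try out in isolation. The sketch below copies the constants shown above, together with the three pack/unpack helpers from the JDK source (runStateOf, workerCountOf, ctlOf), into a standalone class; the class name CtlDemo is made up for this example.

```java
public class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29 bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // 2^29 - 1, the largest worker count

    // The run state lives in the top 3 bits
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; } // keep the top 3 bits
    static int workerCountOf(int c)  { return c & CAPACITY; }  // keep the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }       // combine the two

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);
        System.out.println("bits        = " + Integer.toBinaryString(ctl));
        System.out.println("is RUNNING  = " + (runStateOf(ctl) == RUNNING));
        System.out.println("workerCount = " + workerCountOf(ctl));
        // The states are numerically ordered, so comparisons such as rs >= SHUTDOWN work
        System.out.println("ordered     = " + (RUNNING < SHUTDOWN && SHUTDOWN < STOP
                && STOP < TIDYING && TIDYING < TERMINATED));
    }
}
```

Because RUNNING is the only negative value, a test such as `c < SHUTDOWN` is enough to tell whether the pool is still running, and the source code relies on this ordering in several places.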

  • The internal helper methods that extract the run state and the thread count from ctl are shown in the following code:
// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; } //Calculate current running status
private static int workerCountOf(int c)  { return c & CAPACITY; }  //Calculate the current number of threads
private static int ctlOf(int rs, int wc) { return rs | wc; }   //ctl generation by status and number of threads
  • ThreadPoolExecutor has five running states:
Run state    | Description
RUNNING      | Accepts newly submitted tasks and also processes tasks in the blocking queue. A newly created thread pool starts in this state.
SHUTDOWN     | No longer accepts newly submitted tasks, but continues to process tasks already saved in the blocking queue. Entered via shutdown().
STOP         | Accepts no new tasks, does not process queued tasks, and interrupts threads that are processing tasks. Entered via shutdownNow().
TIDYING      | All tasks have terminated and workerCount (the number of live threads) is 0.
TERMINATED   | Entered after terminated() has finished executing.
  • The status switching of thread pool can be understood more clearly through the following figure:

  • terminated() method
    /**
     * Method invoked when the Executor has terminated.  Default
     * implementation does nothing. Note: To properly nest multiple
     * overridings, subclasses should generally invoke
     * {@code super.terminated} within this method.
     */
    // Method called when the Executor has terminated. The default implementation does nothing.
    protected void terminated() { }
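
As a small illustration of the life cycle, the sketch below subclasses ThreadPoolExecutor, overrides the terminated() hook, and walks a pool from RUNNING through shutdown() to TERMINATED. LifecycleDemo and HookedPool are hypothetical names for this example; isShutdown(), awaitTermination() and isTerminated() are the public ThreadPoolExecutor methods for observing the state.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class LifecycleDemo {
    /** Hypothetical pool that records when the terminated() hook fires. */
    static class HookedPool extends ThreadPoolExecutor {
        final AtomicBoolean hookCalled = new AtomicBoolean();

        HookedPool() {
            super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        }

        @Override
        protected void terminated() {
            try {
                hookCalled.set(true);
            } finally {
                super.terminated(); // as the Javadoc recommends for nested overrides
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HookedPool pool = new HookedPool();
        pool.execute(() -> System.out.println("task ran"));
        System.out.println("isShutdown before shutdown(): " + pool.isShutdown());

        pool.shutdown(); // RUNNING -> SHUTDOWN: no new tasks, queued tasks still run
        System.out.println("isShutdown after shutdown():  " + pool.isShutdown());

        // Waits for SHUTDOWN -> TIDYING -> TERMINATED
        boolean finished = pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("terminated: " + finished + ", hook called: " + pool.hookCalled.get());
    }
}
```

The hook runs while the pool is in the TIDYING state, and the pool becomes TERMINATED only after the hook returns, so the flag is already set by the time awaitTermination reports success.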

2. Important parameters in thread pool class

// Task queue holding the tasks that are waiting to be executed
private final BlockingQueue<Runnable> workQueue;
// Main lock of the pool. It is held on access to the workers set and related bookkeeping (largestPoolSize, completedTaskCount)
private final ReentrantLock mainLock = new ReentrantLock();
// Set containing all worker threads in the pool. Accessed only while holding mainLock
private final HashSet<Worker> workers = new HashSet<Worker>();
// Idle timeout, in nanoseconds, for threads waiting for work
private volatile long  keepAliveTime;
// Whether core threads may time out as well. The default is false, that is, core threads stay alive even when idle
private volatile boolean allowCoreThreadTimeOut;
// Size of the core pool (once the number of threads reaches this value, newly submitted tasks are put into the task queue)
private volatile int   corePoolSize;
// Maximum number of threads in the thread pool
private volatile int   maximumPoolSize;
// Note: JDK 1.8 has no poolSize field. The current number of threads is the workerCount part of ctl
// Task rejection policy
private volatile RejectedExecutionHandler handler;
// Thread factory, used to create threads
private volatile ThreadFactory threadFactory;
// Largest number of threads the pool has ever reached. Accessed only under mainLock
private int largestPoolSize;
// Number of completed tasks. Updated only when a worker thread terminates
private long completedTaskCount;

// Default rejection policy (AbortPolicy throws RejectedExecutionException)
private static final RejectedExecutionHandler defaultHandler = new AbortPolicy();

3. Constructors of the thread pool

Constructors of the ThreadPoolExecutor class:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), defaultHandler);
}

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         threadFactory, defaultHandler);
}

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          RejectedExecutionHandler handler) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), handler);
}

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

As the code above shows, every constructor ultimately delegates to the last one, which takes all seven parameters; the shorter constructors simply supply default values for the parameters they omit. Let's focus on that constructor:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)   // Validate the arguments
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    // Initialize the fields
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}
  • corePoolSize: the number of core threads. When a task is submitted and fewer than corePoolSize threads are running, the pool creates a new thread to execute the task, even if other threads are idle and could handle it. Once the number of worker threads reaches corePoolSize, new tasks are queued instead of creating more core threads. If the prestartAllCoreThreads() method is called, the pool creates and starts all core threads in advance.
  • maximumPoolSize: the maximum number of threads the pool is allowed to create. If the queue is full and the number of threads is less than this maximum, the pool creates a new thread to execute the task. With an unbounded queue, tasks can always be queued, so no thread beyond corePoolSize is ever created and this parameter has no effect.
  • keepAliveTime: how long a worker thread stays alive after it becomes idle. When there are no tasks to process, some threads are idle; a thread that stays idle longer than this value is recycled. If there are many tasks and each one runs only briefly, you can increase this value to avoid creating and recycling threads over and over and to improve thread utilization.
  • unit: the time unit of keepAliveTime. The available units are days, hours, minutes, seconds, milliseconds, microseconds and nanoseconds. The type is the enum java.util.concurrent.TimeUnit, which is used widely. If you are interested, take a look at its source code.
  • workQueue: the work queue, a blocking queue that buffers tasks waiting to be executed. There are four common types, which will be introduced later in this article.
  • threadFactory: the factory that creates the threads of the pool. Through the thread factory you can give each created thread a more meaningful name.
  • handler: the saturation (rejection) policy. When the pool cannot take on a new task, because the queue is full and the thread count has reached maximumPoolSize or because the pool has been shut down, it needs a policy for handling the submitted task. The JDK provides four built-in policies (AbortPolicy, CallerRunsPolicy, DiscardPolicy and DiscardOldestPolicy), which will be mentioned later in the article.
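
Putting the seven parameters together, a pool could be configured roughly as follows. This is only a sketch: the sizes, the 30-second timeout, the queue capacity, the "order-worker" name prefix and the helper names namedFactory and newPool are arbitrary choices for illustration, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolConfigDemo {
    /** Hypothetical factory that names threads "<prefix>-1", "<prefix>-2", ... */
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger(1);
        return r -> new Thread(r, prefix + "-" + seq.getAndIncrement());
    }

    /** Builds a pool using the full seven-argument constructor. */
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                          // corePoolSize
                4,                                          // maximumPoolSize
                30L, TimeUnit.SECONDS,                      // keepAliveTime and unit
                new ArrayBlockingQueue<Runnable>(100),      // bounded workQueue
                namedFactory("order-worker"),               // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()); // handler
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newPool();
        // Start the core threads up front instead of lazily on first submission
        System.out.println("core threads started: " + pool.prestartAllCoreThreads());
        System.out.println("keepAliveTime (s):    " + pool.getKeepAliveTime(TimeUnit.SECONDS));
        pool.submit(() -> System.out.println("running on " + Thread.currentThread().getName())).get();
        pool.shutdown();
    }
}
```

A bounded queue is used on purpose: with an unbounded one, maximumPoolSize and the rejection handler would never come into play while the pool is running.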

4. Execution of tasks

In the ThreadPoolExecutor class, the core method for submitting tasks is execute(). Tasks can also be submitted through submit(), but submit() ultimately calls execute(), so we only need to study how execute() is implemented:

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     * 
     * If fewer than corePoolSize threads are running, start a new thread and hand it
     * the submitted task as its first task.
     * [The call to addWorker checks runState and workerCount atomically, and returns
     * false instead of adding a thread when it should not add one.]
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * Even if a task is queued successfully, we still need to double-check two things:
     * first, whether we should have added a thread (existing threads may have died since the last check);
     * second, whether the pool has left the RUNNING state since we entered this method.
     * So we recheck the state. If the pool has stopped, we roll back the enqueue and reject the task;
     * if it is still running but has no threads, we start a new thread.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     *
     * If we cannot queue the task, we try to add a new thread.
     * If that fails, we know the pool is shut down or saturated, so we reject the task.
     *
     */
    int c = ctl.get();
    // Step 1: check whether the number of worker threads is less than corePoolSize
    if (workerCountOf(c) < corePoolSize) {
        // Add a worker thread with this task as its first task. Return on success; otherwise fall through to step 2
        // true means corePoolSize is used as the upper bound; false means maximumPoolSize
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    // Step 2: reached when workerCountOf(c) >= corePoolSize or when addWorker failed
    // Check that the pool is RUNNING and try to put the task into workQueue (the blocking queue)
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        // Recheck. If the pool is no longer RUNNING and the task can be removed from the queue, reject it
        if (! isRunning(recheck) && remove(command))
            reject(command);
        // If no worker threads are left, add one with no first task so the queued task gets executed
        else if (workerCountOf(recheck) == 0)
            // core = false, so the limit is maximumPoolSize; the new worker pulls its tasks from the queue
            addWorker(null, false);
    }
    // Step 3: the pool is not RUNNING or the queue is full.
    // Try to add a worker bounded by maximumPoolSize. If that fails, reject the task
    else if (!addWorker(command, false))
        reject(command);
}

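The official comments spell out the three steps that execute() takes: if fewer than corePoolSize threads are running, start a new core thread for the task; otherwise put the task in the queue and recheck the pool state; if the task cannot be queued, try to start a thread up to maximumPoolSize, and reject the task if that fails too.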
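
The three steps of execute() can be observed with a deliberately small pool. The sketch below assumes a pool with corePoolSize 1, maximumPoolSize 2, a queue of capacity 1 and the default AbortPolicy, and submits four tasks that all block on a latch so that no thread becomes free in between. ExecuteFlowDemo and run() are hypothetical names for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecuteFlowDemo {
    /** Submits four blocking tasks and records what the pool did with each one. */
    static String[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the thread busy until the demo is over
            } catch (InterruptedException ignored) {
                // nothing to clean up in this sketch
            }
        };

        String[] outcome = new String[4];
        for (int i = 0; i < outcome.length; i++) {
            try {
                pool.execute(blocker);
                outcome[i] = "threads=" + pool.getPoolSize()
                        + " queued=" + pool.getQueue().size();
            } catch (RejectedExecutionException e) {
                outcome[i] = "rejected"; // AbortPolicy, the default handler
            }
        }

        release.countDown(); // let the blocked tasks finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return outcome;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] outcome = run();
        for (int i = 0; i < outcome.length; i++) {
            System.out.println("task " + (i + 1) + ": " + outcome[i]);
        }
    }
}
```

Task 1 starts the core thread (threads=1 queued=0), task 2 is queued (threads=1 queued=1), task 3 finds the queue full and starts a second thread (threads=2 queued=1), and task 4 is rejected because both the queue and the pool are at their limits.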

5. addWorker method

  • Method description: addWorker(Runnable firstTask, boolean core) method, as the name suggests, adds a worker thread with a task to the thread pool.
  • Parameter Description:
    1. Runnable firstTask: the task that the newly created thread should run first (null if there is none).
    2. boolean core: this parameter selects the capacity constraint of the thread pool, that is, the limit that the current number of threads is checked against. If the parameter is true, corePoolSize is used as the limit; otherwise maximumPoolSize is used.
  • See the following code analysis for the specific execution process:
private boolean addWorker(Runnable firstTask, boolean core) {
    // Outer loop: judge thread pool status
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        /*
         * Return false when the pool is not RUNNING (rs >= SHUTDOWN), unless all of the following hold:
         * the pool is in the SHUTDOWN state, firstTask is null, and the queue is not empty.
         * The exception exists because a pool in the SHUTDOWN state accepts no new tasks,
         * but a worker without a first task may still be added to process the tasks left in the queue.
         */
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;

        // Inner loop: adds a thread to the thread pool and returns the result of whether the thread was added successfully
        for (;;) {
            int wc = workerCountOf(c);
            // Verify whether the number of existing threads in the thread pool exceeds the limit:
            // 1. Maximum thread pool CAPACITY 
            // 2.corePoolSize or maximumPoolSize (depending on the input core)
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            // Make the number of working threads + 1 through CAS operation to jump out of the outer loop
            if (compareAndIncrementWorkerCount(c))
                break retry;
            // The CAS failed, so re-read ctl
            c = ctl.get();  // Re-read ctl
            // If the run state has changed in the meantime, restart the outer loop
            if (runStateOf(c) != rs)
                continue retry;
            // Otherwise the CAS failed only because workerCount changed, so retry the inner loop
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    /*
     * workerCount was incremented successfully. Next: create the Worker, add it to the workers set, and start its thread
     */
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            // The following code needs to be locked: thread pool master lock
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                // Back out if the ThreadFactory failed to create a thread or if the pool was shut down before the lock was acquired
                int rs = runStateOf(ctl.get());

                // Recheck: proceed only if the pool is RUNNING, or it is SHUTDOWN and there is no first task
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    // A ThreadFactory could hand back a thread that has already been started.
                    // Such a thread cannot be started again, so fail fast here
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    // Add the worker to the workers set
                    workers.add(w);
                    int s = workers.size();
                    // If the current number of workers exceeds the largest size the pool has ever reached, update that record
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                // Release lock
                mainLock.unlock();
            }
            // The worker thread was added successfully. Start the thread
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        // If the thread fails to start, it enters addWorkerFailed
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}
  • addWorkerFailed method
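/**
 * Rolls back the worker thread creation:
 * - removes the worker from workers, if present
 * - decrements the worker count
 * - rechecks for termination, in case this worker was holding up termination
 */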
private void addWorkerFailed(Worker w) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        if (w != null)
            workers.remove(w);
        decrementWorkerCount();  
        tryTerminate();
    } finally {
        mainLock.unlock();
    }
}
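
The inner loop of addWorker and the rollback in addWorkerFailed follow a common lock-free pattern: reserve a slot with a compare-and-set first, do the expensive work afterwards, and give the slot back if that work fails. The sketch below isolates this pattern on a plain AtomicInteger. BoundedCounter and its methods are hypothetical names, and unlike ctl the counter carries no run state, so there is no outer retry loop.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCounter {
    private final AtomicInteger count = new AtomicInteger();
    private final int limit;

    BoundedCounter(int limit) {
        this.limit = limit;
    }

    /** Like addWorker's inner loop: reserve a slot with CAS, or fail if the limit is reached. */
    boolean tryIncrement() {
        for (;;) {
            int c = count.get();
            if (c >= limit)
                return false;
            if (count.compareAndSet(c, c + 1))
                return true;
            // The CAS lost a race with another thread: re-read the value and retry
        }
    }

    /** Like decrementWorkerCount in addWorkerFailed: give a reserved slot back. */
    void release() {
        count.decrementAndGet();
    }

    int get() {
        return count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedCounter counter = new BoundedCounter(100);
        AtomicInteger successes = new AtomicInteger();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    if (counter.tryIncrement()) {
                        successes.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("count = " + counter.get() + ", successes = " + successes.get());
    }
}
```

Eight threads make 8000 attempts in total, yet exactly 100 succeed, because the limit check and the increment are validated together by the CAS. This is why addWorker can enforce corePoolSize or maximumPoolSize without taking mainLock for the counting step.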

Reference article:

  1. https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html [meituan technical team blog]

Posted on Sun, 05 Dec 2021 08:11:02 -0500 by dagee