The thread pool is the most common, and most error-prone, component in business code

In programs, we use various pooling techniques to cache and reuse expensive objects, such as thread pools, connection pools, and memory pools. The general idea is to create some objects in advance and put them into a pool, take one out when needed, and return it afterward for reuse. The number of cached objects in the pool is adjusted by some policy, so the pool can grow and shrink dynamically.

Because creating threads is expensive, creating large numbers of threads arbitrarily and without limit causes performance problems. For short, fast tasks, it is therefore usually better to use a thread pool than to create threads directly.

Today, let's talk about thread pools and, through three production incidents, see what to watch out for when using them.

Thread pools should be declared manually

The Executors class in Java defines some convenience methods to help us create a thread pool quickly. The Alibaba Java Development Manual forbids using these methods and requires creating thread pools manually with new ThreadPoolExecutor. Behind this rule lies a long list of bloody production incidents. The most typical offenders are newFixedThreadPool and newCachedThreadPool, both of which can exhaust resources and lead to OOM.

First, let's look at why newFixedThreadPool can cause an OOM.

We write a piece of test code that initializes a single-threaded FixedThreadPool and submits tasks to it in a loop of 100 million iterations. Each task creates a fairly large string and then sleeps for one hour:

@GetMapping("oom1")
public void oom1() throws InterruptedException {
    ThreadPoolExecutor threadPool = (ThreadPoolExecutor) Executors.newFixedThreadPool(1);
    //Print the thread pool information; I'll explain this code later
    printStats(threadPool);
    for (int i = 0; i < 100000000; i++) {
        threadPool.execute(() -> {
            String payload = IntStream.rangeClosed(1, 1000000)
                    .mapToObj(__ -> "a")
                    .collect(Collectors.joining("")) + UUID.randomUUID().toString();
            try {
                TimeUnit.HOURS.sleep(1);
            } catch (InterruptedException e) {
            }
            log.info(payload);
        });
    }

    threadPool.shutdown();
    threadPool.awaitTermination(1, TimeUnit.HOURS);
}

Soon after the program was executed, the following OOM appeared in the log:

Exception in thread "http-nio-45678-ClientPoller" java.lang.OutOfMemoryError: GC overhead limit exceeded

Looking at the source code of the newFixedThreadPool method, it is not hard to see that the thread pool's work queue is a LinkedBlockingQueue created with its default constructor, which gives it a capacity of Integer.MAX_VALUE, effectively an unbounded queue:

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

public class LinkedBlockingQueue<E> extends AbstractQueue<E>
        implements BlockingQueue<E>, java.io.Serializable {
    ...
    /**
     * Creates a {@code LinkedBlockingQueue} with a capacity of
     * {@link Integer#MAX_VALUE}.
     */
    public LinkedBlockingQueue() {
        this(Integer.MAX_VALUE);
    }
    ...
}

Although newFixedThreadPool caps the number of worker threads at a fixed value, the task queue is unbounded. If tasks arrive quickly and execute slowly, the queue rapidly backs up, blows up the heap, and causes an OOM.

Now let's change the example to obtain the thread pool from newCachedThreadPool instead. Shortly after the program runs, you again see an OOM exception:

[11:30:30.487] [http-nio-45678-exec-1] [ERROR] [.a.c.c.C.[.[.[/].[dispatcherServlet]:175 ] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: unable to create new native thread] with root cause
java.lang.OutOfMemoryError: unable to create new native thread

The log shows that this OOM occurred because a thread could not be created. Looking at the source code of newCachedThreadPool, the maximum number of threads in this pool is Integer.MAX_VALUE, effectively no upper limit, and its work queue, SynchronousQueue, is a blocking queue with no storage space. This means that whenever a task arrives, a worker thread must be found to handle it; if no thread is idle, a new one is created.

Since each of our tasks takes an hour to complete, a flood of tasks will create a flood of threads. Each thread needs some memory allocated for its thread stack, for example 1MB, so creating threads without limit inevitably leads to OOM:

public static ExecutorService newCachedThreadPool() {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>());
}

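The "no storage space" behavior of SynchronousQueue is easy to verify directly: a non-blocking offer only succeeds if another thread is already waiting in take. A minimal sketch (the class name SynchronousQueueDemo is mine, not from the original code):

```java
import java.util.concurrent.SynchronousQueue;

public class SynchronousQueueDemo {
    public static void main(String[] args) {
        SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
        //With no consumer blocked in take(), offer() has nowhere to hand
        //the element off to, so it fails immediately: the queue stores nothing
        boolean accepted = queue.offer(() -> {});
        System.out.println("accepted=" + accepted); //prints accepted=false
    }
}
```

This is exactly why ThreadPoolExecutor's execute method, after a failed offer, falls back to creating a new worker thread.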
In fact, most Java developers know the characteristics of these two thread pools. They simply take their chances, assuming that using a thread pool only for lightweight tasks can't possibly lead to a queue backlog or a thread explosion.

Reality, however, is often cruel. I once ran into exactly such an incident: after a user registered, we called an external service to send an SMS. When the SMS interface was healthy, it responded within 100 milliseconds, and at a registration TPS of 100, the CachedThreadPool held steady at around 10 threads. Then, at some point, the external SMS service became unavailable and calls to it took a very long time to time out. With, say, 6,000 users arriving in one minute, 6,000 send-SMS tasks were generated, requiring 6,000 threads. Soon the whole application crashed because no more threads could be created.

Therefore, I do not recommend using these two convenience thread pools from Executors, for the following reasons:

  • We need to evaluate the thread pool's core parameters according to our own scenario and concurrency: the core thread count, the maximum thread count, the thread recycling policy, the type of work queue, and the rejection policy, so that the pool's behavior meets our requirements. Usually that means a bounded work queue and a controlled number of threads.
  • You should always give a custom thread pool a meaningful name to ease troubleshooting. When thread counts spike, threads deadlock, threads burn CPU, or tasks throw exceptions, we usually grab a thread dump, and meaningful thread names make it much easier to locate the problem.
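To make these two points concrete, here is a minimal sketch of a manually declared pool with a bounded queue, a capped thread count, an explicit rejection policy, and named threads. It uses only the JDK (no Jodd); the helper namedFactory and all the parameter values are my own illustrations, not prescriptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ManualPoolDemo {
    //Thread factory that names threads "sms-pool-0", "sms-pool-1", ...
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return r -> new Thread(r, prefix + "-" + counter.getAndIncrement());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 10,                          //bounded thread count
                60, TimeUnit.SECONDS,           //recycle idle non-core threads
                new ArrayBlockingQueue<>(100),  //bounded work queue
                namedFactory("sms-pool"),
                new ThreadPoolExecutor.AbortPolicy()); //fail fast when saturated
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With these limits, overload surfaces as a visible RejectedExecutionException you can handle, instead of a silent queue backlog or a thread explosion.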

Besides declaring the thread pool manually, I also recommend monitoring its state. The thread pool usually toils away in obscurity: unless the rejection policy kicks in, most of the pressure never surfaces as an exception. If we can spot a growing queue backlog or a rapidly expanding thread count early, we can often find and fix problems before they become outages.

A closer look at how the thread pool manages its threads

In the earlier demo, we used a printStats method as the simplest possible monitoring: once a second, it outputs the pool's basic internal state, including the thread count, the active thread count, the number of completed tasks, and the number of tasks backlogged in the queue:

private void printStats(ThreadPoolExecutor threadPool) {
    Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
        log.info("=========================");
        log.info("Pool Size: {}", threadPool.getPoolSize());
        log.info("Active Threads: {}", threadPool.getActiveCount());
        log.info("Number of Tasks Completed: {}", threadPool.getCompletedTaskCount());
        log.info("Number of Tasks in Queue: {}", threadPool.getQueue().size());
        log.info("=========================");
    }, 0, 1, TimeUnit.SECONDS);
}

Next, let's use this method to observe the basic behavior of a thread pool.

First, customize a thread pool with 2 core threads and 5 maximum threads, using an ArrayBlockingQueue with a capacity of 10 as the work queue and the default AbortPolicy rejection policy, which means that if the pool cannot accept a task, a RejectedExecutionException is thrown. In addition, we use the ThreadFactoryBuilder from the Jodd class library to construct a thread factory that gives the pool's threads custom names.

Then we write some test code to observe how the pool manages its threads. The test submits one task per second, 20 times in total, and each task takes 10 seconds to complete:

@GetMapping("right")
public int right() throws InterruptedException {
    //Use a counter to track the number of tasks completed
    AtomicInteger atomicInteger = new AtomicInteger();
    //Create a thread pool with 2 core threads and 5 maximum threads, use the ArrayBlockingQueue blocking queue with a capacity of 10 as the work queue, and use the default AbortPolicy rejection policy
    ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
            2, 5,
            5, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(10),
            new ThreadFactoryBuilder().setNameFormat("demo-threadpool-%d").get(),
            new ThreadPoolExecutor.AbortPolicy());
    printStats(threadPool);
    //Submit every 1 second, a total of 20 tasks
    IntStream.rangeClosed(1, 20).forEach(i -> {
        try {
            TimeUnit.SECONDS.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        int id = atomicInteger.incrementAndGet();
        try {
            threadPool.submit(() -> {
                log.info("{} started", id);
                //Each task takes 10 seconds
                try {
                    TimeUnit.SECONDS.sleep(10);
                } catch (InterruptedException e) {
                }
                log.info("{} finished", id);
            });
        } catch (Exception ex) {
            //If the submission fails, log the error and decrement the counter
            log.error("error submitting task {}", id, ex);
            atomicInteger.decrementAndGet();
        }
    });

    TimeUnit.SECONDS.sleep(60);
    return atomicInteger.intValue();
}

After 60 seconds, the page outputs 17, meaning three of the 20 submissions failed.

And there are three similar error messages in the log:

[14:24:52.879] [http-nio-45678-exec-1] [ERROR] [.t.c.t.demo1.ThreadPoolOOMController:103 ] - error submitting task 18
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@163a2dec rejected from java.util.concurrent.ThreadPoolExecutor@18061ad2[Running, pool size = 5, active threads = 5, queued tasks = 10, completed tasks = 2]

Plotting the values logged by the printStats method gives the following curve:

From this, we can summarize the thread pool's default behavior:

  • The pool does not create the corePoolSize worker threads up front; workers are created lazily, as tasks arrive;
  • Once the core threads are busy, the pool does not expand immediately; instead, tasks accumulate in the work queue;
  • Only when the work queue is full does the pool expand, up to maximumPoolSize threads;
  • If a task arrives when the queue is full and the maximum thread count has been reached, it is handled according to the rejection policy;
  • When the thread count exceeds the core count, a thread that stays idle for keepAliveTime is recycled, shrinking the pool back toward the core count.

Understanding this strategy will help us set appropriate initialization parameters for the thread pool according to the actual capacity planning requirements. Of course, we can also change these default working behaviors through some means, such as:

  • After declaring the thread pool, immediately call the prestartAllCoreThreads method to start all core threads;
  • Pass in true to the allowCoreThreadTimeOut method to let the thread pool recycle the core thread when it is idle.
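Both overrides are one-line calls on ThreadPoolExecutor. A small sketch (the pool parameters are arbitrary):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadTuning {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 5, 5, TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));
        //By default the pool starts with 0 threads and creates workers lazily
        System.out.println(pool.getPoolSize());   //0
        //prestartAllCoreThreads creates all core threads immediately
        int started = pool.prestartAllCoreThreads();
        System.out.println(started);              //2
        System.out.println(pool.getPoolSize());   //2
        //Allow even core threads to be reclaimed after keepAliveTime of idleness
        pool.allowCoreThreadTimeOut(true);
        pool.shutdown();
    }
}
```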

Have you ever thought about this? The Java thread pool uses the work queue to hold tasks it cannot yet process, and only expands the pool once the queue is full. If the work queue is set very large, the maximum-thread-count parameter becomes nearly meaningless: the queue rarely fills up, and by the time it does, expanding the pool can no longer help.

So, is there a way to make the thread pool more aggressive: start extra threads first, and treat the queue as the backup plan? In our example, tasks run very slowly, taking 10 seconds each. If the pool could expand to its maximum of 5 threads sooner, these tasks would all complete eventually, instead of piling up because the pool expanded too late.

For reasons of space, I'll only sketch the general idea here:

  1. Since the pool only expands when a task cannot be enqueued, can we override the queue's offer method to create the illusion that the queue is full?
  2. Since we have hacked the queue, the rejection policy will fire once the maximum thread count is reached. Can we implement a custom rejection handler that, at that point, actually inserts the task into the queue?

As an exercise, try implementing such an "elastic" thread pool yourself. Tomcat's thread pool achieves a similar effect and can serve as a reference.
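To give you a head start, here is one possible sketch of the two-step idea: a queue whose offer always reports failure, plus a rejection handler that performs the real enqueue. All names, capacities, and timeouts are my own illustration, not Tomcat's actual implementation, and the sketch glosses over edge cases such as enqueueing during shutdown:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ElasticPoolSketch {
    public static ThreadPoolExecutor create(int core, int max, int queueCapacity) {
        //Step 1: offer() always "fails", so the pool prefers adding
        //threads (up to max) over queueing
        LinkedBlockingQueue<Runnable> queue =
                new LinkedBlockingQueue<Runnable>(queueCapacity) {
                    @Override
                    public boolean offer(Runnable task) {
                        return false; //pretend the queue is full
                    }
                };
        //Step 2: once at max threads, the rejection handler does the real
        //enqueue (put() bypasses our overridden offer())
        return new ThreadPoolExecutor(core, max, 5, TimeUnit.SECONDS, queue,
                (task, executor) -> {
                    try {
                        executor.getQueue().put(task);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new RejectedExecutionException("interrupted", e);
                    }
                });
    }
}
```

With this pool, submitting two slow tasks to create(1, 2, 10) grows the pool to 2 threads right away instead of queueing the second task.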

Make sure the thread pool itself is actually reused

Not long ago, I ran into another incident: the production environment of a project raised alarms from time to time that the thread count was too high, exceeding 2,000. Checking the monitoring after the alarms, we found that the thread count spiked momentarily and then dropped back down; the count jittered wildly, even though the application's traffic hadn't changed much.

To locate the problem, we grabbed a thread dump while the thread count was high and found more than 1,000 custom thread pools in memory. Generally speaking, thread pools exist to be reused: fewer than 5 pools can be considered normal, while more than 1,000 is certainly abnormal.

We couldn't find any thread pool declaration in the project code. Searching for the execute keyword, we discovered that the business code obtained its thread pool from a class library, along these lines: call ThreadPoolHelper's getThreadPool method to get a pool, then submit several tasks to it. Nothing looks wrong:

@GetMapping("wrong")
public String wrong() throws InterruptedException {
    ThreadPoolExecutor threadPool = ThreadPoolHelper.getThreadPool();
    IntStream.rangeClosed(1, 10).forEach(i -> {
        threadPool.execute(() -> {
            ...
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
            }
        });
    });
    return "OK";
}

However, the implementation of ThreadPoolHelper is surprising: the getThreadPool method calls Executors.newCachedThreadPool to create a brand-new thread pool on every invocation.

class ThreadPoolHelper {
    public static ThreadPoolExecutor getThreadPool() {
        //The thread pool is not reused
        return (ThreadPoolExecutor) Executors.newCachedThreadPool();
    }
}

From the previous section, we know that newCachedThreadPool creates as many threads as it needs. Each business operation submits several slow tasks to its own freshly created pool, so each operation starts several threads. Under heavy concurrency, thousands of threads can be started at once.

Then why did the monitoring show the thread count dropping back down, instead of the memory blowing up?

Going back to the definition of newCachedThreadPool: its core thread count is 0 and its keepAliveTime is 60 seconds, so all of its threads can be recycled after 60 seconds of idleness. Thanks to that one feature, our application avoided an uglier death.

The fix is simple: store the thread pool reference in a static field and have the getter return that field, remembering the best practice of creating the pool manually. The repaired ThreadPoolHelper class looks like this:

class ThreadPoolHelper {
	private static ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(
		10, 50,
		2, TimeUnit.SECONDS,
		new ArrayBlockingQueue<>(1000),
		new ThreadFactoryBuilder().setNameFormat("demo-threadpool-%d").get());
	public static ThreadPoolExecutor getRightThreadPool() {
		return threadPoolExecutor;
	}
}

Sharing one thread pool across workloads requires careful thought

The point of a thread pool is reuse. Does that mean a program should always use one single thread pool for everything?

Of course not. From the first section, we know that the core parameters of a thread pool, including the thread count, recycling policy, and task queue, should be chosen according to the nature of the tasks:

  • For IO-bound tasks, which run slowly but are not numerous, consider more threads rather than a large queue.
  • For high-throughput computing tasks, the thread count should not be too high: CPU cores, or cores * 2, is typical (a thread must be scheduled onto a CPU to run, so for CPU-bound work, extra threads only add context-switching overhead without improving throughput), but a longer queue may be needed for buffering.
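As a concrete illustration of those two profiles, here is a sketch with two separately configured pools. The multipliers and queue sizes are assumptions for illustration; the right values depend on measured task latency and target throughput:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizing {
    static final int CORES = Runtime.getRuntime().availableProcessors();

    //CPU-bound: about as many threads as cores, with a longer queue for buffering
    static ThreadPoolExecutor cpuBoundPool() {
        return new ThreadPoolExecutor(CORES, CORES,
                0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(2000));
    }

    //IO-bound: many more threads than cores (they mostly wait), smaller queue
    static ThreadPoolExecutor ioBoundPool() {
        return new ThreadPoolExecutor(CORES * 10, CORES * 10,
                60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100));
    }
}
```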

I once hit this problem too. Business code used a thread pool to asynchronously process some in-memory data, but monitoring showed the processing was very slow. The whole job was pure in-memory computation, with no IO, yet it took several seconds, while the application's CPU usage was not particularly high. Something didn't add up.

Investigation revealed that the thread pool used by the business code was shared with a background file batch task.

Perhaps on the principle of "good enough", this pool had only 2 core threads and a maximum of 2 threads, used an ArrayBlockingQueue with a capacity of 100 as its work queue, and used the CallerRunsPolicy rejection policy:

private static ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
        2, 2,
        1, TimeUnit.HOURS,
        new ArrayBlockingQueue<>(100),
        new ThreadFactoryBuilder().setNameFormat("batchfileprocess-threadpool-%d").get(),
        new ThreadPoolExecutor.CallerRunsPolicy());

Here we simulate the file batch-processing code. After the program starts, a thread runs an infinite loop, constantly submitting tasks to the pool; each task writes a large amount of data to a file:

@PostConstruct
public void init() {
    printStats(threadPool);
    new Thread(() -> {
        //Simulate a large amount of data that needs to be written
        String payload = IntStream.rangeClosed(1, 1_000_000)
                .mapToObj(__ -> "a")
                .collect(Collectors.joining(""));
        while (true) {
            threadPool.execute(() -> {
                try {
                    //Each time, the same data is created and written to the same file
                    Files.write(Paths.get("demo.txt"), Collections.singletonList(LocalTime.now().toString() + ":" + payload), UTF_8, CREATE, TRUNCATE_EXISTING);
                } catch (IOException e) {
                    e.printStackTrace();
                }
                log.info("batch file processing done");
            });
        }
    }).start();
}

As you can imagine, these tasks keep the pool's two threads quite busy. Through the logs printed by the printStats method, we can observe just how heavily loaded this thread pool is:

You can see that the pool's two threads are always active and the queue is essentially full. Because the CallerRunsPolicy rejection policy is configured, when both the threads and the queue are full, the task is executed on the thread that submitted it, that is, the thread that called execute. In other words, a task submitted to the pool is no longer guaranteed to run asynchronously: with CallerRunsPolicy, asynchronous tasks can quietly become synchronous. You can see this in the fourth line of the log as well. This is what makes this rejection policy special.
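This "asynchronous becomes synchronous" effect can be demonstrated in a few lines: with the single worker blocked and the queue full, the third task runs right on the submitting thread. A minimal sketch (the parameters are chosen only to saturate the pool quickly):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        //Occupy the only worker thread
        pool.execute(() -> { try { release.await(); } catch (InterruptedException e) {} });
        //Fill the 1-slot queue
        pool.execute(() -> {});
        //Pool and queue are both full: CallerRunsPolicy runs this task
        //synchronously on the current (main) thread
        pool.execute(() -> System.out.println("ran on " + Thread.currentThread().getName()));
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Running this prints "ran on main": the caller itself paid for the rejected task.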

I don't know why the author of this code chose this policy. Perhaps during testing they saw exceptions because tasks couldn't be handled, didn't want the pool to discard them, and settled on this rejection policy. In any case, these logs are enough to show that the pool is saturated.

It's easy to imagine that business code reusing such a pool for in-memory computation is doomed. To test this, we submit a trivial task to the pool that just sleeps for 10 milliseconds with no other logic:

private Callable<Integer> calcTask() {
    return () -> {
        TimeUnit.MILLISECONDS.sleep(10);
        return 1;
    };
}

@GetMapping("wrong")
public int wrong() throws ExecutionException, InterruptedException {
    return threadPool.submit(calcTask()).get();
}

A simple load test of this endpoint with the wrk tool shows a TPS of 75: truly terrible performance.

On reflection, the problem runs deeper. Because the original IO pool uses CallerRunsPolicy, using it directly for asynchronous computation means that, once the pool saturates, the computing tasks execute on the Tomcat threads serving web requests. That in turn affects every other synchronously handled request and can even bring down the whole application.

The fix is simple: use a separate thread pool for these "computing tasks". (The quotes are there because our simulated task just sleeps; it is closer to IO-bound than CPU-bound, so if the thread count is set too low, throughput will suffer.)

private static ThreadPoolExecutor asyncCalcThreadPool = new ThreadPoolExecutor(
	200, 200,
	1, TimeUnit.HOURS,
	new ArrayBlockingQueue<>(1000),
	new ThreadFactoryBuilder().setNameFormat("asynccalc-threadpool-%d").get());

@GetMapping("right")
public int right() throws ExecutionException, InterruptedException {
	return asyncCalcThreadPool.submit(calcTask()).get();
}

After switching the code to a separate thread pool, the load test shows TPS improved to 1,727:

The lesson of blindly reusing and mixing thread pools is that a pool configured by someone else is not necessarily suitable for your tasks, and mixed workloads interfere with each other. This is the same reason we use virtualization to isolate resources instead of letting every application share the physical machine directly.

On the subject of sharing pools, let me point out one more trap: Java 8's parallel streams let us process collection elements in parallel very conveniently, but behind the scenes they all share a single common ForkJoinPool, whose default parallelism is the number of CPU cores minus 1. For CPU-bound tasks this configuration is appropriate, but if the stream operations involve synchronous IO (database calls, external service calls, and so on), it is better to use a custom ForkJoinPool (or an ordinary thread pool). You can refer to the relevant demo in Lecture 1.
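The usual workaround is to submit the whole parallel-stream operation into a dedicated ForkJoinPool, since a parallel stream executed from inside a ForkJoinPool task runs on that pool's workers rather than on the shared common pool. A sketch; the parallelism of 20 is an assumed value for slow IO, not a recommendation:

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CustomForkJoinDemo {
    public static void main(String[] args) throws Exception {
        //Dedicated pool: slow IO here won't starve other parallel streams
        ForkJoinPool ioPool = new ForkJoinPool(20);
        List<String> workers = ioPool.submit(() ->
                IntStream.rangeClosed(1, 8).parallel()
                        .mapToObj(i -> {
                            //Simulate a slow synchronous IO call
                            try { Thread.sleep(10); } catch (InterruptedException e) {}
                            return Thread.currentThread().getName();
                        })
                        .collect(Collectors.toList())
        ).get();
        workers.forEach(System.out::println); //ioPool worker threads, not commonPool
        ioPool.shutdown();
    }
}
```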

Key review

Thread pools manage threads, and threads are precious resources; many application performance problems trace back to misconfigured or misused thread pools. Today, through three thread-pool-related production incidents, I've shared some best practices for using them.

First, the Executors class offers convenient factory methods for declaring thread pools. They are simple, but they hide the pool's parameter details. When using a thread pool, always configure a reasonable thread count, task queue, rejection policy, and thread recycling policy according to your scenario and requirements, and give the threads clear names to ease troubleshooting.

Second, since you are using a thread pool at all, make sure the pool itself is reused. Creating a new pool on every call can be worse than not using a pool. If you obtain a pool from someone else's class library rather than declaring it yourself, be sure to check the source and confirm that the pool is instantiated and configured as you expect.

Third, reusing thread pools does not mean the application should always use the same pool. Choose different pools according to the nature of the tasks, paying special attention to the different preferences of IO-bound and CPU-bound work. To reduce interference between tasks, consider dedicated, isolated pools where needed.

Finally, I want to stress that the thread pool, as a core in-application component, often goes unmonitored (whereas for middleware such as RabbitMQ, the ops team generally monitors it for us). Thread pool problems are often discovered only after the program has crashed, which puts us on the back foot. We will revisit this problem and its solution in the design section.

Tags: Java Back-end

Posted on Sun, 28 Nov 2021 04:02:16 -0500 by dirkadirka