Learning Java concurrency tools -- a brief talk about ThreadLocal

Preface

The previous post went over some common points about thread pools. This post turns to ThreadLocal. Broadly, ThreadLocal has two usage scenarios:

1. Each thread needs its own exclusive copy of an object, so that thread-safety problems are avoided.
2. Within a single thread, data needs to be shared between different business methods.

These two scenarios are the entry points for our summary of ThreadLocal.

ThreadLocal -- an object exclusive to each thread

Start with SimpleDateFormat

Ask whether SimpleDateFormat is thread safe and most programmers will tell you it is not. The class is usually used for date formatting and parsing. Internally it uses a shared, mutable Calendar object for the conversion, so its format and parse methods are not thread safe. For the details, refer to the many write-ups by experts on the subject. This post starts from the use of SimpleDateFormat to summarize the first usage scenario of ThreadLocal.

1. Basic usage example

/**
 * author:liman
 * createtime:2021/11/7
 * comment:A simple multithreaded SimpleDateFormat example
 * Two threads print and there is no problem
 */
@Slf4j
public class SimpleDateFormatDemo {

    public static void main(String[] args) {
        new Thread(()->{
            String date = new SimpleDateFormatDemo().dateFormat(10);
            System.out.println(date);
        }).start();
        new Thread(()->{
            String date = new SimpleDateFormatDemo().dateFormat(1000);
            System.out.println(date);
        }).start();
    }

    public String dateFormat(int seconds){
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        return dateFormat.format(date);
    }
}

Two threads each format and print a date, and there is no problem.

What if 10 threads need to format dates with the same SimpleDateFormat pattern?

2. Ten threads use the same SimpleDateFormat pattern

/**
 * author:liman
 * createtime:2021/11/8
 * comment:Multiple threads print the date format converted by SimpleDateFormat
 */
@Slf4j
public class SimpleDateFormatMultiThread {

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            int finalI = i;
            new Thread(()->{
                String date = new SimpleDateFormatMultiThread().dateFormat(finalI);
                System.out.println(date);
            }).start();
        }
    }

    public String dateFormat(int seconds) {
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        return dateFormat.format(date);
    }

}

This also runs without problems, because each thread creates its own SimpleDateFormat object.

3. What if there are 1000 tasks?

In that case we would use a thread pool.

/**
 * author:liman
 * createtime:2021/11/8
 * comment:The thread pool runs the contents of SimpleDateFormat
 */
@Slf4j
public class SimpleDateFormatThreadPool {
    private static ExecutorService threadPool = Executors.newFixedThreadPool(10);

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            int finalI = i;
            //Each task creates its own SimpleDateFormat, so 1000 SimpleDateFormat objects are created in total
            //A plain lambda is enough here; the pool supplies the threads, so wrapping it in new Thread is unnecessary
            threadPool.submit(() -> {
                String date = new SimpleDateFormatThreadPool().dateFormat(finalI);
                System.out.println(date);
            });
        }
        threadPool.shutdown();
    }

    public String dateFormat(int seconds) {
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        return dateFormat.format(date);
    }
}

It still runs without problems, but 1000 tasks have created 1000 SimpleDateFormat objects on a pool of only 10 threads, which is wasteful.

Shared SimpleDateFormat

In the examples above, every task creates its own SimpleDateFormat object and nothing goes wrong, but creating one per task is unnecessary because the pattern is always the same. Why not share a single instance through a static variable?

/**
 * author:liman
 * createtime:2021/11/8
 * comment:Thread pool share SimpleDateFormat
 */
@Slf4j
public class SimpleDateFormatStaticThreadPool {
    private static ExecutorService threadPool = Executors.newFixedThreadPool(10);
    //Share SimpleDateFormat
    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
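    //Record formatted results to detect duplicates. A plain HashSet is itself not thread safe,
    //so a concurrent set is used (needs java.util.concurrent.ConcurrentHashMap)
    private static Set<String> resultSet = ConcurrentHashMap.newKeySet();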

    public static void main(String[] args) {
        for (int i = 0; i < 10000; i++) {
            int finalI = i;
            threadPool.submit(() -> {
                String date = new SimpleDateFormatStaticThreadPool().dateFormat(finalI);
                System.out.println(date);
            });
        }
        threadPool.shutdown();
    }

    public String dateFormat(int seconds) {
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        String result = dateFormat.format(date);
        //add returns false if the set already contains the result
        if(!resultSet.add(result)){
            log.warn("{},Duplicate data occurred",result);
        }
        return result;
    }
}

As you can see, I added a Set to the program to record whether duplicate results occur. The running results are as follows

A pile of duplicates appears. Because multiple threads share one SimpleDateFormat, there is a thread-safety problem.

Lock?

/**
 * author:liman
 * createtime:2021/11/8
 * comment:Locking solves the thread safety problem of SimpleDateFormat
 */
@Slf4j
public class SimpleDateFormatLockSlove {
    private static ExecutorService threadPool = Executors.newFixedThreadPool(10);
    //Share SimpleDateFormat
    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
    //Record duplicate data
    private static Set<String> resultSet = new HashSet<>();

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            int finalI = i;
            threadPool.submit(() -> {
                String date = new SimpleDateFormatLockSlove().dateFormat(finalI);
                System.out.println(date);
            });
        }
        //Shut the pool down so the JVM can exit once all tasks have finished
        threadPool.shutdown();
    }

    public String dateFormat(int seconds) {
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        String result = "";
        //Lock around the critical section
        synchronized (SimpleDateFormatLockSlove.class) {
            result = dateFormat.format(date);
            resultSet.add(result);
        }
        return result;
    }
}

The results are correct and the program runs smoothly.

Optimize with ThreadLocal

synchronized solves the problem, but isn't it too slow to make every thread queue up for the formatter? Under high concurrency, ThreadLocal handles this better. When sharing an object across threads would cause thread-safety problems, ThreadLocal is a good container that gives each thread its own copy of that object, so the operations of different threads are isolated from one another.

/**
 * author:liman
 * createtime:2021/11/8
 * comment:Using ThreadLocal to solve the thread safety problem of SimpleDateFormat
 */
@Slf4j
public class SimpleDateFormatThreadLocal {
    private static ExecutorService threadPool = Executors.newFixedThreadPool(10);
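    //Record formatted results to detect duplicates. A plain HashSet is itself not thread safe,
    //so a concurrent set is used (needs java.util.concurrent.ConcurrentHashMap)
    private static Set<String> resultSet = ConcurrentHashMap.newKeySet();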

    public static void main(String[] args) {
        for (int i = 0; i < 10000; i++) {
            int finalI = i;
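            threadPool.submit(() -> {
                //dateFormat is an instance method, so it needs an instance when called from static main
                String date = new SimpleDateFormatThreadLocal().dateFormat(finalI);
                System.out.println(date);
            });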
        }
        threadPool.shutdown();
    }

    public String dateFormat(int seconds) {
        //Calculated from 1970-01-01 00:00:00
        Date date = new Date(1000 * seconds);
        //Directly call the get method of ThreadLocal to obtain the SimpleDateFormat object
        String result = ThreadSafeSimpleDateFormat.dateFormatThreadLocal.get().format(date);
        //add returns false if the set already contains the result
        if(!resultSet.add(result)){
            log.warn("{},Duplicate data occurred",result);
        }
        return result;
    }
}

//A helper class that holds the ThreadLocal, built here as an anonymous subclass
class ThreadSafeSimpleDateFormat{
    public static ThreadLocal<SimpleDateFormat> dateFormatThreadLocal = new ThreadLocal<SimpleDateFormat>(){
        //Override ThreadLocal's initialValue method to supply the SimpleDateFormat object
        @Override
        protected SimpleDateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        }
    };
}

Smooth, with no thread-safety issues.

Lambda expressions are also supported (ThreadLocal.withInitial, since Java 8)

public static ThreadLocal<SimpleDateFormat> dateFormatThreadLocal
            = ThreadLocal.withInitial(()->new SimpleDateFormat("yyyy-MM-dd hh:mm:ss"));
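
To make the "one copy per thread" behaviour concrete, here is a minimal sketch. The class name PerThreadFormatterSketch and its helper methods are hypothetical and not part of the original post, and the pattern uses HH (24-hour clock) with a fixed UTC time zone and Locale.ROOT so that the output is reproducible, which differs from the hh pattern used above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class PerThreadFormatterSketch {
    // One SimpleDateFormat per thread, created lazily on that thread's first get().
    static final ThreadLocal<SimpleDateFormat> FORMATTER = ThreadLocal.withInitial(() -> {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone for reproducible output
        return format;
    });

    // Formats seconds since the epoch with the calling thread's own formatter.
    static String format(long epochSeconds) {
        return FORMATTER.get().format(new Date(epochSeconds * 1000L));
    }

    // Returns true if a second thread receives a different formatter instance than the caller.
    static boolean otherThreadGetsOwnInstance() {
        SimpleDateFormat mine = FORMATTER.get();
        SimpleDateFormat[] theirs = new SimpleDateFormat[1];
        Thread worker = new Thread(() -> theirs[0] = FORMATTER.get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return theirs[0] != null && theirs[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(format(10));                   // 1970-01-01 00:00:10
        System.out.println(otherThreadGetsOwnInstance()); // true
    }
}
```

Repeated calls to get() on the same thread return the same instance, so a pool of 10 threads ends up with at most 10 formatters no matter how many tasks are submitted.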

ThreadLocal -- data sharing within the same thread

In real web development there are many scenarios like the one shown in the figure below. A request passes through a long call chain, and methods along that chain occasionally need the current user's data, so they need a common place to share it. ThreadLocal meets this need well.

There is no need for synchronized or a ConcurrentHashMap, because the data is only ever touched by its own thread, so efficiency stays high. ThreadLocal is a good fit for this scenario.

Example code

/**
 * author:liman
 * createtime:2021/11/8
 * comment:Passing user information along a call chain
 */
@Slf4j
public class UserInfoProblem {

    public static void main(String[] args) {
        log.info("Start call");
        ServiceOne serviceOne = new ServiceOne();
        serviceOne.process();
        log.info("End of call");
    }
    
    
}

class ServiceOne{
    public void process(){
        User user = new User("Niu ");
        //Here the value is set manually with set(), so initialValue is not called
        UserContextHolder.holder.set(user);
        System.out.println("service one set user: "+user);
        ServiceTwo serviceTwo = new ServiceTwo();
        serviceTwo.process();
    }
}

class ServiceTwo{
    public void process(){
        User user = UserContextHolder.holder.get();
        System.out.println("service two got user: "+user);
        ServiceThree serviceThree = new ServiceThree();
        serviceThree.process();
    }
}

class ServiceThree{
    public void process(){
        User user = UserContextHolder.holder.get();
        System.out.println("service three got user: "+user);
        //The value is no longer needed, so remember to remove it
        UserContextHolder.holder.remove();
    }
}

//Holder means a holder or container. Classes whose names end in Holder appear often in framework source code
class UserContextHolder{
    public static ThreadLocal<User> holder =
            new ThreadLocal<>();
}

class User{
    String name;

    public User(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "User{" +
                "name='" + name + '\'' +
                '}';
    }
}
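
The example above calls remove() in the last service of the chain, which is skipped if an earlier method throws. A common variation, sketched below under stated assumptions, sets the value at the entry point and clears it in a finally block. The class RequestContextSketch and its methods are hypothetical names for illustration, and a plain String stands in for the User object.

```java
public class RequestContextSketch {
    // Hypothetical holder: one slot per thread for the current user's name.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Entry point of a "request": sets the context, runs the chain, always cleans up.
    static String handleRequest(String userName) {
        CURRENT_USER.set(userName);
        try {
            return serviceTwo();
        } finally {
            CURRENT_USER.remove(); // runs even if a downstream method throws
        }
    }

    // Downstream methods read the user without receiving it as a parameter.
    static String serviceTwo() {
        return serviceThree();
    }

    static String serviceThree() {
        return "handled for " + CURRENT_USER.get();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("Niu")); // handled for Niu
        System.out.println(CURRENT_USER.get());   // null, because the slot was cleared
    }
}
```

Keeping set() and remove() in the same method makes it easier to see that every path through the request clears the slot.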

So far we have summarized the two usage scenarios of ThreadLocal. But this is not the end.

Benefits of ThreadLocal

1. Thread safety can be achieved

2. There is no need to lock, and the execution efficiency is relatively high

3. Memory is used more efficiently and overhead is saved: a SimpleDateFormat object is created once per thread rather than once per task

4. It avoids the tedium of passing shared data through the parameters of every method in a call chain.

Related source code

In the Thread class there is a threadLocals field, which is of type ThreadLocalMap.

ThreadLocalMap is a map-like structure (it does not implement java.util.Map). Its key is the ThreadLocal object and its value is the value stored for that ThreadLocal. In other words, one thread can hold several ThreadLocal variables at once, as shown in the following figure
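
The same relationship can be sketched in code. In this illustrative example (the class MultipleThreadLocalsSketch and its fields are hypothetical), one thread stores values in two different ThreadLocal variables, and a second thread, which has its own map, sees neither of them.

```java
public class MultipleThreadLocalsSketch {
    // Two independent keys; each thread's map can hold an entry for both.
    static final ThreadLocal<String> USER = new ThreadLocal<>();
    static final ThreadLocal<Integer> REQUEST_ID = new ThreadLocal<>();

    // Sets both variables on the calling thread, then reports what each thread sees.
    static String describe() {
        USER.set("Niu");
        REQUEST_ID.set(42);

        String[] seenByOther = new String[1];
        Thread other = new Thread(() -> seenByOther[0] = USER.get() + "/" + REQUEST_ID.get());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        String seenByCaller = USER.get() + "/" + REQUEST_ID.get();
        USER.remove();
        REQUEST_ID.remove();
        return seenByCaller + " vs " + seenByOther[0];
    }

    public static void main(String[] args) {
        System.out.println(describe()); // Niu/42 vs null/null
    }
}
```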

The get and setInitialValue methods of ThreadLocal

public T get() {
    //Get current thread
    Thread t = Thread.currentThread();
    //Gets the ThreadLocalMap in the thread
    ThreadLocalMap map = getMap(t);
    if (map != null) {//If the map exists, look up the entry keyed by this ThreadLocal and return its value
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    //If the map is null or has no entry for this ThreadLocal, call setInitialValue
    return setInitialValue();
}

private T setInitialValue() {
    //This calls initialValue, which returns null by default. Subclasses override it (or use ThreadLocal.withInitial) to supply the initial value.
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        map.set(this, value);
    } else {//If the current thread has no ThreadLocalMap yet, create one
        createMap(t, value);
    }
    if (this instanceof TerminatingThreadLocal) {
        TerminatingThreadLocal.register((TerminatingThreadLocal<?>) this);
    }
    return value;
}

If initialValue is not overridden, get() returns null on any thread that has not called set().
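
The lazy behaviour of get() and setInitialValue() can be observed from the outside. The following is a minimal sketch, with the hypothetical class LazyInitialValueSketch counting how often the initial-value supplier runs on a single thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyInitialValueSketch {
    static final AtomicInteger INIT_CALLS = new AtomicInteger();

    // No initial value supplied: get() falls through to initialValue(), which returns null.
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();

    // The supplier runs on a thread's first get(), and again on the first get() after remove().
    static final ThreadLocal<String> WITH_INITIAL = ThreadLocal.withInitial(() -> {
        INIT_CALLS.incrementAndGet();
        return "default";
    });

    // Walks through get/set/remove on the calling thread and returns the supplier call count.
    static int countInitCalls() {
        INIT_CALLS.set(0);
        WITH_INITIAL.remove();      // start from an empty slot on this thread
        WITH_INITIAL.get();         // first get(): supplier runs (count 1)
        WITH_INITIAL.get();         // entry already present: supplier not called
        WITH_INITIAL.set("custom"); // explicit set(): supplier not called
        WITH_INITIAL.get();         // returns "custom"
        WITH_INITIAL.remove();      // entry removed
        WITH_INITIAL.get();         // supplier runs again (count 2)
        WITH_INITIAL.remove();
        return INIT_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println(PLAIN.get());      // null
        System.out.println(countInitCalls()); // 2
    }
}
```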

set method of ThreadLocal

//This is simple
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        map.set(this, value);
    } else {
        createMap(t, value);
    }
}

remove method of ThreadLocal

public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null) {
        m.remove(this);
    }
}

Finally, let's talk about ThreadLocalMap. ThreadLocalMap is a static nested class of ThreadLocal (the Thread class only holds a reference to it in its threadLocals field). It stores its entries in an array that serves as a hash table. The relevant source code is as follows

static class ThreadLocalMap {

    /**
     * The entries in this hash map extend WeakReference, using
     * its main ref field as the key (which is always a
     * ThreadLocal object).  Note that null keys (i.e. entry.get()
     * == null) mean that the key is no longer referenced, so the
     * entry can be expunged from table.  Such entries are referred to
     * as "stale entries" in the code that follows.
     */
    //The weak reference is used here, and the Key is a weak reference
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
	
    private Entry[] table;
	
}

Problems using ThreadLocal

Unrecoverable value

Java developers will have heard of memory leaks. A memory leak means that an object is no longer used, yet the GC still cannot reclaim the memory it occupies. Over time such objects accumulate until they exceed the memory limit and cause an OutOfMemoryError (OOM). This can happen if ThreadLocal is used improperly.

In the ThreadLocalMap class shown above, the key of each Entry is held through a weak reference: the Entry constructor passes the key to the WeakReference constructor. An object that is reachable only through weak references can be reclaimed by the garbage collector. However, the value of the Entry is held through an ordinary strong reference.

Under normal circumstances, when a thread terminates and its Thread object is reclaimed, its ThreadLocalMap is reclaimed too, along with the keys and values of its entries. However, if the thread does not terminate, as with the long-lived worker threads of a thread pool, the value of an Entry may never be reclaimed, because the following reference chain remains:

Thread -> ThreadLocalMap -> Entry (the key is a weak reference and may be reclaimed) -> value

As a result, the GC reclaims only the key while the value stays reachable. If enough such values accumulate, the result is an OOM.

In fact, the JDK has considered this problem. The set, remove, and rehash methods scan for entries whose key is null (stale entries) and set the value of each such Entry to null, so that the value can be reclaimed normally.

However, if these methods are never triggered, an OOM may still occur. This is why the Alibaba Java development manual makes it mandatory to call remove manually on a ThreadLocal once it is no longer needed.
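
The leak itself is hard to demonstrate in a few lines, but its root cause, a pooled thread that outlives the task which set the value, also shows up as stale data seen by the next task. The sketch below illustrates that effect and the try/finally fix; the class PooledThreadCleanupSketch and its cleanUp flag are hypothetical and exist only for this demonstration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledThreadCleanupSketch {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on a single pooled thread and returns what the second task sees.
    static String valueSeenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Runnable firstTask = () -> {
            CONTEXT.set("left over from task 1");
            try {
                CONTEXT.get(); // stand-in for the real work that reads the value
            } finally {
                if (cleanUp) {
                    CONTEXT.remove(); // clears the entry from the worker thread's map
                }
            }
        };
        Callable<String> secondTask = CONTEXT::get;
        try {
            pool.submit(firstTask).get(); // wait so the tasks run strictly one after another
            return pool.submit(secondTask).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenBySecondTask(false)); // left over from task 1
        System.out.println(valueSeenBySecondTask(true));  // null
    }
}
```

Without remove(), the value stays strongly reachable from the worker thread for as long as the pool keeps that thread alive.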

NPE problem

The source code of the initialValue method in ThreadLocal is as follows

protected T initialValue() {
    return null;
}

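Because get() returns the generic type T, a ThreadLocal of a wrapper type such as ThreadLocal<Long> returns null when no value has been set. Assigning that result to a primitive such as long triggers auto-unboxing and throws a NullPointerException (NPE).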
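
A minimal sketch of the pitfall and one way around it follows. The class UnboxingNpeSketch and its methods are hypothetical names chosen for illustration.

```java
public class UnboxingNpeSketch {
    // No initialValue override, so get() returns null until set() is called.
    static final ThreadLocal<Long> COUNTER = new ThreadLocal<>();

    // Unboxes the result of get() into a primitive and reports whether that threw an NPE.
    static boolean unboxingThrows() {
        COUNTER.remove(); // make sure this thread has no value
        try {
            long value = COUNTER.get(); // auto-unboxing calls longValue() on null
            return value != value;      // not reached while the slot is empty
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Safer: keep the wrapper type and handle null explicitly.
    static long readOrDefault(long fallback) {
        Long value = COUNTER.get();
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        System.out.println(unboxingThrows());  // true
        System.out.println(readOrDefault(7L)); // 7
    }
}
```

Another option is to supply a non-null default with ThreadLocal.withInitial, so that get() never returns null on a fresh thread.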

Summary

ThreadLocal has two usage scenarios. Remember to call remove manually once the value is no longer needed, and watch out for auto-unboxing of a null value. Also note that in Spring, many classes whose names end in ContextHolder are built on ThreadLocal.

Tags: Java Back-end

Posted on Thu, 11 Nov 2021 12:22:38 -0500 by davidohuf