A Detailed Explanation of the AQS Principle in Java Concurrency

Thread blocking primitive

Java thread blocking and wakeup are achieved through the park and unpark methods of the Unsafe class.

public class Unsafe {
  ...
  public native void park(boolean isAbsolute, long time);
  public native void unpark(Thread t);
  ...
}

Both are native methods whose core logic is implemented in the JVM's native code. park means "stop": it puts the currently running thread, Thread.currentThread(), to sleep. unpark lifts that suspension and wakes up the specified thread. Under the hood they are built on synchronization primitives provided by the operating system; the details would require digging into the native code, which we will not do here. The two parameters of park control how long to sleep: isAbsolute says whether the second parameter is an absolute deadline or a relative duration. When it is absolute the unit is milliseconds; when it is relative the unit is nanoseconds, and 0 means sleep indefinitely.

A thread runs continuously from the moment it starts. Apart from being descheduled by the operating system's scheduler, it pauses only when park is called. The secret behind a lock's ability to suspend a thread is simply that, at the bottom, the lock calls park.

parkBlocker

The Thread object has an important field, parkBlocker, which records what the current thread is parked for. It is a bit like cars parked in a lot whose owners are all attending the same auction: each owner drives away once they have won the item they bid on. The parkBlocker is, roughly, that "auction": the manager and coordinator of a group of contending threads, deciding which thread should sleep and which should be woken up.

class Thread {
  ...
  volatile Object parkBlocker;
  ...
}

When a thread is woken by unpark, this field is reset to null. Unsafe.park and unpark do not set parkBlocker for us; the helper class responsible for managing this field is LockSupport, which is a thin wrapper around the two Unsafe methods.

class LockSupport {
  ...
  public static void park(Object blocker) {
     Thread t = Thread.currentThread();
     setBlocker(t, blocker);
     U.park(false, 0L);
     setBlocker(t, null); // Clear the blocker after waking up
  }

  public static void unpark(Thread thread) {
     if (thread != null)
        U.unpark(thread);
  }
  ...
}
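
As a quick illustration of how park and unpark behave, here is a minimal, hypothetical demo (the class name ParkDemo is made up) that uses the LockSupport wrapper shown above. The main thread parks itself and stays asleep until the helper thread unparks it.

import java.util.concurrent.locks.LockSupport;

public class ParkDemo {
  public static void main(String[] args) {
    Thread main = Thread.currentThread();

    new Thread(() -> {
      try {
        Thread.sleep(1000);        // simulate some work
      } catch (InterruptedException ignored) {}
      LockSupport.unpark(main);    // wake the parked main thread
    }).start();

    System.out.println("parking...");
    LockSupport.park();            // main thread sleeps here
    System.out.println("unparked, continuing");
  }
}

Note that unpark may also be called before park; in that case the permit is remembered and the subsequent park returns immediately.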

Java's lock data structures implement sleeping and waking by calling LockSupport. The value stored in a thread's parkBlocker field is exactly the "queue manager" we discuss next.

Queue manager

When multiple threads compete for the same lock, there must be a queuing mechanism to string together the threads that failed to get it. When the lock is released, the manager picks a suitable thread to take over the freshly released lock. Every lock has such a queue manager inside it, maintaining a queue of waiting threads. For ReentrantLock the queue manager is AbstractQueuedSynchronizer, whose internal wait queue is a doubly linked list; each node in the list has the following structure.

class AbstractQueuedSynchronizer {
  volatile Node head;  // The queue head thread will take priority to obtain the lock
  volatile Node tail;  // The thread that failed to grab the lock is appended to the end of the queue
  volatile int state; // Lock count
}

class Node {
  Node prev;
  Node next;
  Thread thread; // One thread per node
  
  // The following two special fields will be explained later
  Node nextWaiter; // Marks whether a shared or an exclusive lock is requested
  int waitStatus; // Fine-grained wait status
}

When locking fails, the current thread appends itself to the tail of the wait queue and then calls LockSupport.park to put itself to sleep. When another thread unlocks, it takes a node from the head of the list and calls LockSupport.unpark to wake it up.

 

AbstractQueuedSynchronizer is an abstract class, the parent of all lock queue managers. The internal queue managers of the various locks in the JDK all inherit from it; it is the cornerstone of the Java concurrency world. For example, the internal queue managers of ReentrantLock, ReadWriteLock, CountDownLatch, Semaphore and ThreadPoolExecutor are all its subclasses. The abstract class exposes a few abstract methods that each lock customizes for its own manager. All of the JDK's built-in concurrent data structures are built under the protection of these locks; it is the foundation of the JDK's multithreaded skyscraper.

 

The lock manager maintains nothing more than an ordinary queue in the form of a doubly linked list. The data structure is very simple, but maintaining it carefully is quite complex, because multithreaded concurrency has to be considered at every step; every line of code is written with great care.

The implementer of the JDK's lock manager is Douglas S. Lea, who wrote almost the entire java.util.concurrent package single-handedly. In the world of algorithms, the more sophisticated something is, the more suitable it is for one person to do.

Douglas S. Lea is a professor of computer science at the State University of New York at Oswego and the current director of the Department of computer science, specializing in concurrent programming and the design of concurrent data structures. He is a member of the Executive Committee of the Java Community Process and chairs JSR 166, which adds concurrency utilities to the Java programming language.

 

From here on we will abbreviate AbstractQueuedSynchronizer as AQS. I must warn readers that AQS is very complex, and running into setbacks while trying to understand it is normal. At present there is no book on the market that explains AQS in a truly accessible way. Very few people understand AQS thoroughly, and I do not count myself among them.

Fair lock and unfair lock

A fair lock guarantees that locks are granted in the order they were requested. If the lock happens to be free at some moment and a thread tries to acquire it, a fair lock must still check whether other threads are already queued, whereas an unfair lock may simply jump the queue. Think of the line at a KFC counter.

You may ask: if a lock is free, how can there be threads queued for it? Suppose the thread holding the lock has just released it and woken up the first node in the wait queue. The awakened thread has just returned from the park method and has not yet tried to lock. Between the release and its retry the lock is free, and although that window is very short, other threads may also be trying to lock during it.

Second, note that after a thread puts itself to sleep with LockSupport.park, it does not necessarily have to wait for another thread to unpark it; it may wake up at any time for other reasons. Looking at the source comments, there are four reasons park can return:

  1. Another thread unparks the current thread
  2. The timeout expires (park takes a time parameter)
  3. Another thread interrupts the current thread
  4. A spurious wakeup for some other, unspecified reason

The documentation does not say what causes spurious wakeups, but it does make clear that park returning does not mean the lock is free. The awakened thread will park itself again if its retry fails to obtain the lock. Therefore the locking logic must be written as a loop; it may take many attempts before the lock is finally acquired.
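
To make the "lock in a loop" point concrete, here is a toy FIFO lock in the spirit of the example in the LockSupport javadoc (simplified, interrupt handling omitted; this is not the real AQS code). A thread that wakes up from park, for whatever reason, simply re-checks whether it is at the head of the queue and whether its CAS succeeds, and parks again if not.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ToyFifoLock {
  private final AtomicBoolean locked = new AtomicBoolean(false);
  private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

  public void lock() {
    Thread current = Thread.currentThread();
    waiters.add(current);
    // Loop: a spurious or racy wakeup just sends the thread back to park
    while (waiters.peek() != current || !locked.compareAndSet(false, true)) {
      LockSupport.park(this);
    }
    waiters.remove();  // we now hold the lock
  }

  public void unlock() {
    locked.set(false);
    LockSupport.unpark(waiters.peek());  // unpark(null) is a harmless no-op
  }
}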

In the computer world, unfair locks serve requests more efficiently than fair locks, which is why Java's locks are unfair by default. In the real world, however, unfairness seems to make things worse: if people could keep cutting in line at KFC, you can imagine the chaos. The difference is probably that in the computer world, one thread cutting in line does not make the other threads complain.

public ReentrantLock() {
    this.sync = new NonfairSync();
}

public ReentrantLock(boolean fair) {
    this.sync = fair ? new FairSync() : new NonfairSync();
}

Shared lock and exclusive lock

ReentrantLock is an exclusive lock: while one thread holds it, all other threads must wait. The read lock in ReadWriteLock is not exclusive; it allows multiple threads to hold it at the same time, making it a shared lock. Shared and exclusive locks are distinguished by the nextWaiter field of the Node class.

class AQS {
  static final Node SHARED = new Node();
  static final Node EXCLUSIVE = null;

  boolean isShared() {
    return this.nextWaiter == SHARED;
  }
}

Why is the field not named mode or type, or simply shared? Because nextWaiter is reused for a different purpose in another scenario. It is as flexible as a field of a C union type, but Java has no union type.
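
To see the shared behavior in practice, here is a small, hypothetical demo (the class name SharedLockDemo is made up): two reader threads hold the read lock at the same time, while the write lock has to wait until both of them have released it.

import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedLockDemo {
  public static void main(String[] args) throws InterruptedException {
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    Runnable reader = () -> {
      rw.readLock().lock();          // several threads may hold this together
      try {
        System.out.println(Thread.currentThread().getName() + " reading");
        Thread.sleep(500);
      } catch (InterruptedException ignored) {
      } finally {
        rw.readLock().unlock();
      }
    };

    new Thread(reader, "reader-1").start();
    new Thread(reader, "reader-2").start();  // overlaps with reader-1

    Thread.sleep(100);               // let the readers get in first
    rw.writeLock().lock();           // exclusive: blocks until both readers release
    try {
      System.out.println("main thread writing exclusively");
    } finally {
      rw.writeLock().unlock();
    }
  }
}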

Condition variables

The first question to ask about condition variables is why they are needed at all. Isn't a lock enough? Consider the following pseudocode, which does something once a condition holds.

 void doSomething() {
   locker.lock();
   while(!condition_is_true()) {  // First check whether we can proceed
     locker.unlock();  // We can't yet, so release the lock and rest for a moment
     sleep(1);
     locker.lock(); // The check, like the work itself, must be done while holding the lock
   }
   justdoit();  // Do the work
   locker.unlock();
 }

When the condition does not hold, the code retries in a loop (other threads modify the condition under the lock), but it has to sleep between attempts, otherwise the CPU would spin at full load. The problem is that the sleep interval is hard to tune. Too long, and overall latency suffers; the opportunity may even be missed entirely (the condition becomes true and is immediately reset). Too short, and the CPU spins again. Condition variables solve exactly this problem.

void doSomethingWithCondition() {
  cond = locker.newCondition();
  locker.lock();
  while(!condition_is_true()) {
    cond.await();
  }
  justdoit();
  locker.unlock();
}

The await() method blocks on the condition variable cond until another thread calls cond.signal() or cond.signalAll(). While blocked in await(), the thread automatically releases the lock it holds. When await() is woken up, it tries to reacquire the lock (possibly queuing again); only after the lock is obtained does await() return.
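
The other half of the picture is the thread that changes the condition and then signals. Here is a small, self-contained sketch (the class and field names are made up) showing both sides; the waiting side mirrors the pseudocode above.

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
  private final ReentrantLock locker = new ReentrantLock();
  private final Condition cond = locker.newCondition();
  private boolean ready = false;          // the "condition"

  void waitForCondition() throws InterruptedException {
    locker.lock();
    try {
      while (!ready) {                    // re-check after every wakeup
        cond.await();                     // releases the lock while sleeping
      }
      System.out.println("condition met, doing the work");
    } finally {
      locker.unlock();
    }
  }

  void makeConditionTrue() {
    locker.lock();
    try {
      ready = true;
      cond.signal();                      // wakes one waiter; signalAll() wakes them all
    } finally {
      locker.unlock();
    }
  }
}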

 

Multiple threads can block on the same condition variable, and the blocked threads are strung together into a condition wait queue. Calling signalAll() wakes all of them and lets them compete for the lock again; calling signal() wakes only the thread at the head of the queue, which avoids the "thundering herd" problem.

The await() method must release the lock immediately; otherwise no other thread could modify the critical-section state, and the result of condition_is_true() would never change. This is why a condition variable must be created from a lock object: it needs a reference to the lock so it can release it and reacquire it after being woken by signal. The lock that creates a condition variable must also be an exclusive lock. If await() released a shared lock, there would be no guarantee that the critical-section state could then be modified, because only an exclusive lock allows that state to be changed. This is why the newCondition method of the ReadWriteLock.ReadLock class is defined as follows:

public Condition newCondition() {
    throw new UnsupportedOperationException();
}

With condition variables, the problem of tuning the sleep interval disappears. When the condition becomes true, calling signal() or signalAll() wakes the blocked thread immediately, with no delay.

Condition wait queue

When multiple threads call await() on the same condition variable, they form a condition wait queue. Since multiple condition variables can be created from the same lock, there can be multiple condition wait queues. This queue looks a lot like the AQS queue, but it is singly linked rather than doubly linked. Its nodes are the same Node class used in the AQS wait queue, except that the links are not prev and next but nextWaiter.

class AQS {
  ...
  class ConditionObject {
    Node firstWaiter;  // Points to the first waiting node
    Node lastWaiter;   // Points to the last waiting node
  }
  
  class Node {
    static final int CONDITION = -2;
    static final int SIGNAL = -1;
    Thread thread;  // Currently waiting threads
    Node nextWaiter;  // Point to the next conditional waiting node
  
    Node prev;
    Node next;
    int waitStatus;  // waitStatus = CONDITION
  }
  ...
}


 

ConditionObject is an inner class of AQS. Each instance holds a hidden pointer this$0 to its enclosing AQS object, so a ConditionObject can directly access all of the AQS object's fields and methods (including lock and unlock). The waitStatus of every node in the condition wait queue is set to CONDITION, indicating that the node is waiting because of a condition variable.

Queue transfer

When the condition variable's signal() method is called, the thread at the head of the condition wait queue is woken up. Its node is removed from the condition wait queue and transferred to the AQS wait queue, where it lines up and tries to obtain the lock again. During the transfer the node's waitStatus is reset from CONDITION to 0, and its predecessor in the AQS queue is marked SIGNAL, meaning that when the predecessor releases the lock it should wake this node.

class AQS {
  ...
  boolean transferForSignal(Node node) {
    // Reset the node's status from CONDITION back to 0
    if (!node.compareAndSetWaitStatus(Node.CONDITION, 0))
      return false;
    Node p = enq(node); // Append to the AQS wait queue; p is the predecessor node
    int ws = p.waitStatus;
    // Mark the predecessor as SIGNAL so it wakes this node on release;
    // if that fails (the predecessor was cancelled), wake the thread directly
    if (ws > 0 || !p.compareAndSetWaitStatus(ws, Node.SIGNAL))
       LockSupport.unpark(node.thread);
    return true;
  }
  ...
}

The meaning of the transferred node's nextWaiter field also changes: in the condition queue it is the pointer to the next node, while in the AQS wait queue it is the shared-versus-exclusive marker.

 

(Figure: dependency structure between AQS and the common Java class libraries)

ReentrantLock locking process

Next we analyze the locking process in detail to understand the lock's control logic in depth. It must be said that Doug Lea writes code in an extremely terse style, which makes it hard to read.

class ReentrantLock {
    ...
    public void lock() {
        sync.acquire(1);
    }
    ...
}

class Sync extends AQS {
  ...
  public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
      acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
         selfInterrupt();
  }
  ...
}

The if statement in acquire has three parts. tryAcquire is the current thread's attempt to lock; if it fails, the thread must queue up, so addWaiter is called to enqueue the current thread. Then acquireQueued starts the cycle of park, wake up, and retry locking; if the retry fails, it parks again. The acquire method does not return until the lock has been acquired.

acquireQueued returns true if the thread was interrupted by another thread during the park/retry loop. In that case the thread needs to call selfInterrupt() to restore its own interrupt flag.

// Interrupting the current thread just means setting its interrupt flag
static void selfInterrupt() {
        Thread.currentThread().interrupt();
}

How does a thread know it was interrupted by another thread? After park wakes up, it calls Thread.interrupted() to find out. But this method can only be called once, because calling it immediately clears the interrupt flag. That is why acquire calls selfInterrupt() to set the flag again, so that upper-level logic can still learn about the interruption via Thread.interrupted().
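
The clearing behavior is easy to observe in a tiny, hypothetical test:

public class InterruptFlagDemo {
  public static void main(String[] args) {
    Thread.currentThread().interrupt();          // set the interrupt flag
    System.out.println(Thread.interrupted());    // true, and the flag is cleared
    System.out.println(Thread.interrupted());    // false: the flag is gone
    // Thread.currentThread().isInterrupted() would read the flag without clearing it
  }
}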

The acquireQueued and addWaiter methods are provided by the AQS class, while tryAcquire must be implemented by each subclass, and different locks implement it differently. Let's look at the fair-lock implementation of tryAcquire in ReentrantLock.
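
(The original article shows this method as a screenshot. The code below is the corresponding method from ReentrantLock.FairSync as it appears in JDK 8, reproduced from memory with added comments, so treat it as a sketch rather than an authoritative listing.)

protected final boolean tryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    int c = getState();
    if (c == 0) {
        // The lock is free: a fair lock first checks whether anyone is already queued
        if (!hasQueuedPredecessors() &&
            compareAndSetState(0, acquires)) {
            setExclusiveOwnerThread(current);
            return true;
        }
    }
    else if (current == getExclusiveOwnerThread()) {
        // Reentry: the same thread is locking again, just bump the count
        int nextc = c + acquires;
        if (nextc < 0)
            throw new Error("Maximum lock count exceeded");
        setState(nextc);
        return true;
    }
    return false;
}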

 

The method is an if/else branch. The else-if part handles lock reentry: the thread trying to lock is the thread that already holds the lock, i.e. the same thread locking again. In that case only the count needs to be incremented; the lock's state field records the count, and each reentry adds 1. The AQS object also has an exclusiveOwnerThread field that records the thread currently holding the exclusive lock.

if (c == 0) means the lock is free and the count is zero, so the lock must be competed for, because multiple threads may call tryAcquire at the same time. The competition is a CAS operation, compareAndSetState: the thread that successfully changes the count from 0 to 1 obtains the lock and is recorded in exclusiveOwnerThread.

The code also contains a hasQueuedPredecessors() check, which is crucial: it looks at whether other threads are already waiting in the AQS queue. A fair lock must perform this check before locking and must not jump the queue if anyone is waiting; an unfair lock skips the check. This single check is the difference between the fair and unfair implementations.

Next, let's look at the implementation of the addWaiter method. The parameter mode indicates whether it is a shared lock or an exclusive lock, which corresponds to the Node.nextWaiter attribute.
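
(Again the original shows a screenshot; the following is roughly the JDK 8 version of addWaiter together with its helper enq, annotated as a sketch.)

private Node addWaiter(Node mode) {
    Node node = new Node(Thread.currentThread(), mode);
    // Fast path: try to CAS ourselves onto the existing tail
    Node pred = tail;
    if (pred != null) {
        node.prev = pred;
        if (compareAndSetTail(pred, node)) {
            pred.next = node;
            return node;
        }
    }
    // Slow path: the queue is not initialized yet, or the CAS lost a race
    enq(node);
    return node;
}

private Node enq(final Node node) {
    for (;;) {
        Node t = tail;
        if (t == null) {
            // Lazily create the dummy head node on first use
            if (compareAndSetHead(new Node()))
                tail = head;
        } else {
            node.prev = t;
            if (compareAndSetTail(t, node)) {
                t.next = node;
                return t;
            }
        }
    }
}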

 

 

 

addWaiter appends a new node to the tail of the AQS wait queue. If tail is null the queue has not been initialized yet, so it must be initialized first; during initialization AQS inserts a redundant dummy head node whose thread field is empty.

Appending a node to the tail must also account for concurrent threads, so the code again uses a CAS operation, compareAndSetTail, to compete for the tail pointer. Threads that lose the race go around the for(;;) loop and keep using CAS to try to append their node to the tail.

Next, let's look at the implementation of the acquireQueued method. It repeats the cycle of parking, retrying the lock, and parking again if the retry fails.
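
(The screenshot in the original corresponds roughly to the JDK 8 implementation below, shown here as an annotated sketch.)

final boolean acquireQueued(final Node node, int arg) {
    boolean failed = true;
    try {
        boolean interrupted = false;
        for (;;) {
            final Node p = node.predecessor();
            // Only the node directly behind the head may try to take the lock
            if (p == head && tryAcquire(arg)) {
                setHead(node);
                p.next = null; // help GC
                failed = false;
                return interrupted;     // report whether we were interrupted while waiting
            }
            if (shouldParkAfterFailedAcquire(p, node) &&
                parkAndCheckInterrupt())
                interrupted = true;     // remember the interrupt but keep trying
        }
    } finally {
        if (failed)
            cancelAcquire(node);        // tryAcquire threw: remove this node from the queue
    }
}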

 

Before attempting to lock, acquireQueued checks whether its node is the first in the AQS wait queue; if not, it keeps parking. This means that once queued, both fair and unfair locks follow a fair scheme and wait their turn. In other words, "once in line, always in line".

private final boolean parkAndCheckInterrupt() {
    LockSupport.park(this);
    return Thread.interrupted();
}

After park returns and the thread wakes up, it immediately checks whether it was interrupted by another thread. But even if an interrupt occurred, it keeps trying to acquire the lock, and goes back to sleep if it cannot, until the lock is obtained. In other words, interrupting a thread does not make it give up and return without the lock.

We can also see that an acquisition can be cancelled: cancelAcquire() is called, to be precise, while the thread is still in the AQS wait queue waiting for the lock. Under what circumstances would an exception cause the acquisition to be cancelled? The only possibility is the tryAcquire method, which is implemented by subclasses, and subclass behavior is not under AQS's control. When a subclass's tryAcquire throws, the best AQS can do is cancel the acquisition: cancelAcquire removes the current node from the wait queue.

ReentrantLock unlock process

The unlocking process is simpler: once the lock count drops to zero, the first valid node in the wait queue is woken up.

public final boolean release(int arg) {
    if (tryRelease(arg)) {
        Node h = head;
        if (h != null && h.waitStatus != 0)
            unparkSuccessor(h);
        return true;
    }
    return false;
}

protected final boolean tryRelease(int releases) {
    int c = getState() - releases;
    // The one who tied the bell must untie it: only the owner thread may release
    if (Thread.currentThread() != getExclusiveOwnerThread())
        throw new IllegalMonitorStateException();
    boolean free = false;
    if (c == 0) {
        free = true;
        setExclusiveOwnerThread(null);
    }
    setState(c);
    return free;
}

Because the lock is reentrant, it must check whether the count has dropped to zero to decide whether the lock has been fully released; only then can a waiting node be woken. unparkSuccessor skips invalid (cancelled) nodes, finds the first valid node, and calls unpark() to wake its thread.
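
For reference, unparkSuccessor in JDK 8 looks roughly like this (a sketch from memory, with comments added):

private void unparkSuccessor(Node node) {
    int ws = node.waitStatus;
    if (ws < 0)
        compareAndSetWaitStatus(node, ws, 0);
    // Normally the successor is simply node.next, but that node may be
    // null or cancelled; in that case scan backwards from the tail for
    // the closest node that is still waiting.
    Node s = node.next;
    if (s == null || s.waitStatus > 0) {
        s = null;
        for (Node t = tail; t != null && t != node; t = t.prev)
            if (t.waitStatus <= 0)
                s = t;
    }
    if (s != null)
        LockSupport.unpark(s.thread);
}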

Read write lock

A read/write lock is split into two lock objects, ReadLock and WriteLock, which share a single AQS. The AQS lock count state is divided into two halves: the high 16 bits hold the shared (read) lock count and the low 16 bits hold the exclusive (write) lock count. The exclusive half records the reentry count of the current write lock, while the shared half records the total hold count of all threads currently holding the read lock.
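
The split is implemented with a few shift-and-mask helpers in ReentrantReadWriteLock.Sync, roughly as follows (comments added):

static final int SHARED_SHIFT   = 16;
static final int SHARED_UNIT    = (1 << SHARED_SHIFT);      // adding one read hold
static final int MAX_COUNT      = (1 << SHARED_SHIFT) - 1;  // 65535
static final int EXCLUSIVE_MASK = (1 << SHARED_SHIFT) - 1;

// Total read-lock hold count (the high 16 bits)
static int sharedCount(int c)    { return c >>> SHARED_SHIFT; }
// Reentrant hold count of the write lock (the low 16 bits)
static int exclusiveCount(int c) { return c & EXCLUSIVE_MASK; }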

Read/write locks also come in fair and unfair flavors. The fair strategy is the same as ReentrantLock's for both the shared and exclusive lock: if other threads are queued, go to the back of the queue. The unfair strategy is different and favors the write lock. A write lock may compete immediately regardless of who is queued, but if the head of the queue is a write-lock request, a read lock must give way and queue at the tail. After all, read/write locks suit workloads with many reads and few writes, so the occasional write request deserves higher priority.
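
That policy boils down to two tiny methods per synchronizer; in the JDK they look roughly like this:

static final class NonfairSync extends Sync {
    final boolean writerShouldBlock() {
        return false;  // a writer may always barge in
    }
    final boolean readerShouldBlock() {
        // Give way if the head of the queue is an exclusive (write) request
        return apparentlyFirstQueuedIsExclusive();
    }
}

static final class FairSync extends Sync {
    final boolean writerShouldBlock() {
        return hasQueuedPredecessors();
    }
    final boolean readerShouldBlock() {
        return hasQueuedPredecessors();
    }
}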

Write lock locking process

Locking the write lock of a read/write lock follows the same logic as ReentrantLock, except for the tryAcquire() method.

public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
      acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
         selfInterrupt();
}

protected final boolean tryAcquire(int acquires) {
    Thread current = Thread.currentThread();
    int c = getState();
    int w = exclusiveCount(c);
    if (c != 0) {
         if (w == 0 || current != getExclusiveOwnerThread())
              return false;
         if (w + exclusiveCount(acquires) > MAX_COUNT)
              throw new Error("Maximum lock count exceeded");
         setState(c + acquires);
         return true;
     }
     if (writerShouldBlock() ||
           !compareAndSetState(c, c + acquires))
         return false;
     setExclusiveOwnerThread(current);
     return true;
}

Reentry must also be considered for the write lock: if the thread holding the AQS exclusive lock is exactly the thread trying to lock, this is a write-lock reentry, and reentry only needs to increase the count. c != 0 means the lock count is non-zero, which may be because a read lock or a write lock is currently held; checking w == 0 determines whether the non-zero count comes purely from read locks.

If the count is zero, the thread starts competing for the lock. Depending on whether the lock is fair, writerShouldBlock() is called before the race to see whether queuing is required. If not, CAS is used to compete; the thread that successfully moves the count from 0 to 1 obtains the exclusive write lock.

Read lock locking process

Locking the read lock is considerably more complex than locking the write lock. The overall flow is the same, but the details differ a lot; in particular, a per-thread read hold count has to be maintained, which takes up a lot of code.

public final void acquireShared(int arg) {
    // If the attempt to lock is unsuccessful, go to queue to sleep, and then try again
    if (tryAcquireShared(arg) < 0)
        // Queuing, cyclic retry
        doAcquireShared(arg);
}

If the current thread already holds the write lock, it may still acquire the read lock. This must be supported in order to implement lock downgrading: while holding the write lock, acquire the read lock and then release the write lock. Compared with releasing the write lock first and then acquiring the read lock, this saves a second round of locking and queuing. Because of downgrading, the read and write halves of the lock count can both be non-zero at the same time.

wlock.lock();
if(whatever) {
  // Demotion
  rlock.lock();
  wlock.unlock();
  doRead();
  rlock.unlock();
} else {
  // No degradation
  doWrite();
  wlock.unlock();
}

To track the read-lock hold count for each reading thread, a ThreadLocal variable is used.

private transient ThreadLocalHoldCounter readHolds;

static final class HoldCounter {
    int count;
    final long tid = LockSupport.getThreadId(Thread.currentThread());
}

static final class ThreadLocalHoldCounter
            extends ThreadLocal<HoldCounter> {
   public HoldCounter initialValue() {
        return new HoldCounter();
   }
}

However, accessing a ThreadLocal is not particularly fast, so a cache is added. It stores the hold count of the last thread that acquired the read lock; when contention is not heavy, reading the cache directly is faster.

private transient HoldCounter cachedHoldCounter;

Doug Lea apparently felt that cachedHoldCounter was still not fast enough, so he added yet another cache, firstReader, recording the first thread that moved the read count from 0 to 1, together with its hold count. When there is no contention at all, reading these two fields directly is the fastest path.

private transient Thread firstReader;
private transient int firstReaderHoldCount;

final int getReadHoldCount() {
    // First check the read half of the global lock count
    if (getReadLockCount() == 0)
        return 0;

    // Then visit firstReader
    Thread current = Thread.currentThread();
    if (firstReader == current)
         return firstReaderHoldCount;

    // Then check the cached count of the most recent reading thread
    HoldCounter rh = cachedHoldCounter;
    if (rh != null && rh.tid == LockSupport.getThreadId(current))
        return rh.count;

    // Finally fall back to the ThreadLocal
    int count = readHolds.get().count;
    if (count == 0) readHolds.remove();
    return count;
}

So we can see that the author went to great lengths just to record the read hold count. What is this count for? It lets a thread find out whether it itself currently holds the read or write lock.

Read locking also involves a spin. Spinning means that when the first locking attempt fails, the thread retries immediately in a loop instead of going to sleep, rather like a busy-wait retry.

final static int SHARED_UNIT = 65536;
// The read count lives in the high 16 bits

final int fullTryAcquireShared(Thread current) {
  for(;;) {
    int c = getState();
    // If another thread holds the write lock, go back to sleep
    if (exclusiveCount(c) != 0) {
        if (getExclusiveOwnerThread() != current)
            return -1;
    }
    ...
    // Upper count limit exceeded
    if (sharedCount(c) == MAX_COUNT)
       throw new Error("Maximum lock count exceeded");
    if (compareAndSetState(c, c + SHARED_UNIT)) {
       // Got the read lock
       ...
       return 1;
    }
    ...
    // Loop and retry
  }
}

Because acquiring the read lock ultimately means modifying the lock's total read count with a CAS, the thread whose CAS succeeds gets the read lock. A failed CAS only means the read lockers raced against each other on the CAS operation, not that the lock is held by someone else and unavailable; a few more tries will usually succeed. That is the reason for spinning. Likewise, releasing the read lock retries its CAS operation in a loop.

protected final boolean tryReleaseShared(int unused) {
   ...
   for (;;) {
       int c = getState();
       int nextc = c - SHARED_UNIT;
       if (compareAndSetState(c, nextc)) {
         return nextc == 0;
       }
   }
   ...
}
