Phaser: a tiered, multi-phase barrier

Overview

Phaser is a synchronization utility class introduced in JDK 1.7 for tasks that must proceed in phases. Its functionality overlaps with CyclicBarrier and CountDownLatch: it behaves like a multi-stage barrier, but is more flexible than either.

Synchronizer and effect:

  • CountDownLatch: a countdown counter. It is initialized with a count, threads can wait on it, and when the count reaches zero all waiting threads continue to execute.
  • CyclicBarrier: a reusable barrier. The number of participating threads is set at construction. A thread that reaches the barrier waits for the other threads to arrive. When the number of arrivals meets the specified number, all waiting threads continue to execute, and the barrier can be reset to count from the beginning.
  • Phaser: a multi-stage barrier. The number of participating threads can be set at construction, and participants can also be registered or deregistered along the way. The phaser advances when the number of arrivals meets the number of registered participants.

In CyclicBarrier, there is only one fence where threads wait for other threads to arrive.

Phaser also has a barrier, which it calls a phase. At any point in time a Phaser is in exactly one phase. The phase number starts at 0, increases up to Integer.MAX_VALUE, and then wraps around to 0. Each time all registered parties (participants) have arrived, the phase number increases by one.
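To make the phase counter concrete, here is a minimal single-threaded sketch. The class name PhaseAdvanceDemo is made up for illustration; only java.util.concurrent.Phaser methods are used.

```java
import java.util.concurrent.Phaser;

class PhaseAdvanceDemo {
    // Returns {phase before arrival, arrival phase number, phase after arrival}.
    static int[] run() {
        Phaser phaser = new Phaser(1);       // one registered party
        int before = phaser.getPhase();      // the initial phase
        int arrivalPhase = phaser.arrive();  // the only party arrives, so the phaser advances
        int after = phaser.getPhase();       // the phase after the advance
        return new int[] {before, arrivalPhase, after};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("phase " + r[0] + ", arrival phase " + r[1] + ", now phase " + r[2]);
    }
}
```

Because the single registered party is also the last one to arrive, one call to arrive() is enough to move the phaser from phase 0 to phase 1.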

Parties correspond to the participating threads of a CyclicBarrier.

The participants of a CyclicBarrier cannot be changed once they are specified at construction. A Phaser can take an initial number of participants at construction, and can also register or deregister participants later through register, bulkRegister, arriveAndDeregister, and so on.
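A short sketch of changing the number of parties after construction. DynamicRegistrationDemo is a hypothetical name, and the example runs on a single thread so that no call blocks.

```java
import java.util.concurrent.Phaser;

class DynamicRegistrationDemo {
    // Returns the registered-party count after each step.
    static int[] run() {
        Phaser phaser = new Phaser(2);                        // two parties set at construction
        int initial = phaser.getRegisteredParties();
        phaser.register();                                    // add one party
        int afterRegister = phaser.getRegisteredParties();
        phaser.bulkRegister(2);                               // add two more parties at once
        int afterBulk = phaser.getRegisteredParties();
        phaser.arriveAndDeregister();                         // one party arrives and leaves
        int afterDeregister = phaser.getRegisteredParties();
        return new int[] {initial, afterRegister, afterBulk, afterDeregister};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```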

After parties are registered on a Phaser, each participant starts out as unarrived; when a participant reaches the current phase, its state becomes arrived. When the number of arrivals equals the number of registered participants, the Phaser advances, that is, the phase value increases by 1.
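The arrived and unarrived counts can be observed through getArrivedParties and getUnarrivedParties. The sketch below (class name ArrivalCountsDemo is illustrative) uses the non-blocking arrive() so that it can run on one thread.

```java
import java.util.concurrent.Phaser;

class ArrivalCountsDemo {
    // Returns {arrived, unarrived, phase} after two of three parties arrive,
    // followed by {arrived, unarrived, phase} after the third arrives.
    static int[] run() {
        Phaser phaser = new Phaser(3);
        phaser.arrive();
        phaser.arrive();
        int arrived = phaser.getArrivedParties();
        int unarrived = phaser.getUnarrivedParties();
        int phase = phaser.getPhase();
        phaser.arrive(); // last arrival: the phaser advances and the counts reset
        return new int[] {
            arrived, unarrived, phase,
            phaser.getArrivedParties(), phaser.getUnarrivedParties(), phaser.getPhase()
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```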



Termination means that the current Phaser object has reached its terminated state, which is somewhat similar to the broken-barrier concept in CyclicBarrier.
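As a small illustration of termination, the sketch below deregisters the only party. With the default onAdvance, a phaser terminates when an advance leaves it with zero registered parties. TerminationDemo is a hypothetical class name.

```java
import java.util.concurrent.Phaser;

class TerminationDemo {
    // Returns {arrival phase, 1 if terminated else 0, phase after termination, result of register()}.
    static int[] run() {
        Phaser phaser = new Phaser(1);
        int arrivalPhase = phaser.arriveAndDeregister(); // last party leaves: the phaser terminates
        int terminated = phaser.isTerminated() ? 1 : 0;
        int phase = phaser.getPhase();                   // negative once terminated
        int registerResult = phaser.register();          // has no effect, returns a negative value
        return new int[] {arrivalPhase, terminated, phase, registerResult};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```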

import java.util.concurrent.Phaser;

public class PhaserTest {
    public static void main(String[] args) throws InterruptedException {
        Phaser phaser = new Phaser();            // root phaser
        Phaser phaser1 = new Phaser(phaser, 4);  // child phaser with 4 parties
        Phaser phaser2 = new Phaser(phaser, 6);  // child phaser with 6 parties
        // Read the phase before starting the threads; reading it afterwards
        // races with the advance and could make awaitAdvance wait forever
        int startPhase = phaser.getPhase();
        for (int i = 0; i < 10; i++) {
            new Thread(new Task(i < 4 ? phaser1 : phaser2), "Thread-" + i).start();
        }
        // Block until all threads have arrived and the phase advances
        int phase = phaser.awaitAdvance(startPhase);
        System.out.println("Now in phase " + phase);
        System.out.println("Finished transition");
        // The root has two parties, one per child phaser; arrive for and deregister both
        phaser.arriveAndDeregister();
        phaser.arriveAndDeregister();
        System.out.println(phaser.isTerminated()); // true: no parties remain, so the phaser has terminated
    }
}

class Task implements Runnable {
    private final Phaser phaser;

    Task(Phaser phaser) {
        this.phaser = phaser;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName()
                + ": executing task, belongs to child phaser: " + phaser);
        // Arrive at the current phase and wait for the other participants
        phaser.arriveAndAwaitAdvance();
        // phaser.arrive(); // arrive without blocking
        // phaser.awaitAdvance(phaser.arriveAndDeregister()); // deregister, then block until the
        //     next phase; once every party has deregistered, the phaser terminates
        System.out.println(Thread.currentThread().getName()
                + ": completed the task, current phase: " + phaser.getPhase());
    }
}
Layered

Phaser supports tiering: phasers can form a tree, and you can specify the parent of the Phaser that is being constructed. Tiering was introduced because a Phaser with a large number of **parties** suffers heavy internal synchronization contention, which can dramatically degrade performance; a tiered structure reduces that contention and the overhead that comes with it.

In a tree of tiered Phasers, registration and deregistration of child Phasers with their parent are managed automatically. When the number of **parties** of a Phaser becomes zero and the Phaser has a parent node, it is deregistered from the parent node.
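The automatic management can be observed from the parent's party count. In this sketch (TieringRegistrationDemo is an illustrative name) the root keeps one party of its own so that it does not terminate when the child leaves.

```java
import java.util.concurrent.Phaser;

class TieringRegistrationDemo {
    // Returns the root's registered-party count at three points in time.
    static int[] run() {
        Phaser root = new Phaser(1);          // one party registered directly on the root
        Phaser child = new Phaser(root);      // child with zero parties: not yet registered with the root
        int afterChildCreated = root.getRegisteredParties();
        child.register();                     // first party on the child registers the child with the root
        int afterChildRegister = root.getRegisteredParties();
        child.arriveAndDeregister();          // child drops to zero parties and is deregistered from the root
        int afterChildDeregister = root.getRegisteredParties();
        return new int[] {afterChildCreated, afterChildRegister, afterChildDeregister};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```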

  1. The root node of the tree links two lock-free stacks (Treiber stacks) that hold waiting threads. For example, when a thread waits for the Phaser to move to the next phase, it pushes itself onto one of the stacks, chosen by the parity of the current phase. All Phaser objects in the tree share both stacks.
  2. When a Phaser node is first linked into the tree, a participant is registered with its parent node at the same time. Registration and deregistration only affect internal counts; no further bookkeeping is kept, so tasks cannot query whether they have been registered.

(1) An isolated Phaser node (a tree with only a root node)
The participants it waits for are the explicitly registered participants, which is also the most common case.
For example, in the following figure, if 10 Task threads share this Phaser, it waits for 10 participants. When all 10 threads have arrived, the Phaser moves to the next phase.


(2) A non-isolated Phaser leaf node, such as the green leaf nodes shown in the image below
This is the same as case (1): sub-Phaser1 and sub-Phaser2 each wait for 4 participants, and sub-Phaser3 waits for 2.

(3) A non-isolated, non-leaf Phaser node, such as the blue node shown above
This is the most special case, and it is also the core design idea behind tiering in Phaser.
In this case, the number of participants a node waits for consists of two parts:

  1. Participants that are explicitly registered directly on it (via the constructor or the register method): 0 in the figure.
  2. Its child nodes: 3 in the figure.

That is, in the figure above, when all four participants of the left-hand child Phaser1 have arrived, it informs the parent node Phaser that it is done. The parent then treats child Phaser1 as arrived and increases its own arrival count by one. Similarly, child Phaser2 and child Phaser3 inform the parent in turn once all of their participants have arrived. Only when the arrival count of the parent (the root node) reaches 3 are the threads waiting in the lock-free stacks released and the phase number increased by 1.

This is a recursive, layer-by-layer design: waiting threads are released only once all participants of the root node have arrived (here, the number of arrivals equals the number of its child nodes), at which point the barrier moves on to the next phase.
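The tree described above (a root with three children holding 4, 4, and 2 parties) can be reproduced in a few lines. This is only a sketch: TieredArrivalDemo and arriveAll are made-up names, and arrivals are simulated on one thread with the non-blocking arrive().

```java
import java.util.concurrent.Phaser;

class TieredArrivalDemo {
    // Lets every party of a child arrive without blocking.
    static void arriveAll(Phaser child, int parties) {
        for (int i = 0; i < parties; i++) {
            child.arrive();
        }
    }

    // Returns {root parties, root arrivals after child1 is done, root phase after child1,
    //          root phase after all children are done, child1 phase after all are done}.
    static int[] run() {
        Phaser root = new Phaser();            // no directly registered parties
        Phaser child1 = new Phaser(root, 4);
        Phaser child2 = new Phaser(root, 4);
        Phaser child3 = new Phaser(root, 2);
        int rootParties = root.getRegisteredParties();  // one party per child

        arriveAll(child1, 4);                  // child1 is complete and arrives at the root
        int arrivedAfterFirst = root.getArrivedParties();
        int phaseAfterFirst = root.getPhase(); // the root has not advanced yet

        arriveAll(child2, 4);
        arriveAll(child3, 2);                  // the last child arrives: the root advances
        return new int[] {rootParties, arrivedAfterFirst, phaseAfterFirst,
                          root.getPhase(), child1.getPhase()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```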

Synchronization mechanism

A Phaser can also be awaited repeatedly. The method arriveAndAwaitAdvance works like CyclicBarrier.await. Each generation of a phaser has an associated phase number that starts at 0. When all registered parties have arrived, the phase number increases by 1, wrapping around to 0 after it reaches the maximum value (Integer.MAX_VALUE). The phase number makes it possible to control arriving at a phaser and waiting for others independently, through two kinds of methods:

  • Arrival (arrival mechanism): the methods arrive and arriveAndDeregister record an arrival. They do not block, but return the associated arrival phase number, that is, the phase number to which the arrival applied. When the final party for a given phase arrives, an optional action is performed and the phase advances. The action is supplied by overriding the onAdvance method, which is also how termination is controlled. Overriding this method is similar to providing a barrierAction for CyclicBarrier, but more flexible.
  • Waiting (waiting mechanism): the awaitAdvance method takes an arrival phase number as its argument and returns when the phaser advances to (or is already at) a different phase. Unlike similar constructions with CyclicBarrier, awaitAdvance keeps waiting even if the waiting thread is interrupted. Interruptible and timed versions are also available, but exceptions raised while a task waits interruptibly or with a timeout do not change the state of the phaser. If necessary, recovery can be performed in the handlers of those exceptions, often after calling forceTermination. Phasers may also be used by tasks running in a ForkJoinPool; progress is ensured as long as the pool's parallelism level can accommodate the maximum number of simultaneously blocked parties.
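As an illustration of controlling termination through onAdvance, the following sketch ends the phaser after three phases. OnAdvanceDemo and the limit of three phases are assumptions chosen for the example.

```java
import java.util.concurrent.Phaser;

class OnAdvanceDemo {
    // Counts how many arrivals it takes before the phaser terminates.
    static int run() {
        Phaser phaser = new Phaser(1) {
            @Override
            protected boolean onAdvance(int phase, int registeredParties) {
                // Returning true terminates the phaser; here after phase 2 completes
                return phase >= 2 || registeredParties == 0;
            }
        };
        int arrivals = 0;
        while (!phaser.isTerminated()) {
            phaser.arrive(); // single party, so every arrival triggers onAdvance
            arrivals++;
        }
        return arrivals;
    }

    public static void main(String[] args) {
        System.out.println("arrivals before termination: " + run());
    }
}
```

The override keeps the default condition (registeredParties == 0) and adds a phase limit, which is a common way to run a fixed number of iterations.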

Phaser holds its synchronization state in a single long field, state, and divides it into bit fields that are read and updated with masks and bitwise operations.

The phase of a child Phaser is allowed to lag behind that of its root node until the child is actually used.

  • Bits 0-15 (the low bits) hold the number of parties that have not yet arrived (unarrived);
  • Bits 16-31 hold the number of registered parties;
  • Bits 32-62 hold the current phase;
  • Bit 63 (the highest bit) is the termination flag of the phaser.
// Synchronization state:
//   unarrived  - parties that have not yet arrived (bits 0-15)
//   parties    - registered parties                (bits 16-31)
//   phase      - current phase                     (bits 32-62)
//   terminated - termination flag                  (bit 63)
// The low 32 bits equal EMPTY (1) when no party is registered; unarrived is 0 when all parties have arrived
private volatile long state;

private static final int  MAX_PARTIES     = 0xffff;            // maximum number of parties
private static final int  MAX_PHASE       = Integer.MAX_VALUE; // maximum phase number
private static final int  PARTIES_SHIFT   = 16;                // shift of the parties field
private static final int  PHASE_SHIFT     = 32;                // shift of the phase field
private static final int  UNARRIVED_MASK  = 0xffff;            // mask for unarrived, low 16 bits
private static final long PARTIES_MASK    = 0xffff0000L;       // mask for parties, middle 16 bits
private static final long COUNTS_MASK     = 0xffffffffL;       // mask for the low 32 bits
private static final long TERMINATION_BIT = 1L << 63;          // highest bit set: phaser is terminated

private static final int  ONE_ARRIVAL     = 1;                         // one party arrives
private static final int  ONE_PARTY       = 1 << PARTIES_SHIFT;        // one registered party
private static final int  ONE_DEREGISTER  = ONE_ARRIVAL | ONE_PARTY;   // one party arrives and deregisters
private static final int  EMPTY           = 1;                         // initial value, used when there are no parties

In short, state stores the current phase in its high 32 bits (with the top bit as the termination flag), the number of parties in the middle 16 bits, and the number of unarrived parties in the low 16 bits.
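The bit layout can be exercised in isolation. The sketch below reuses the shift and mask values quoted above, but StateLayout and its method names are illustrative, and it ignores the special EMPTY value that the real class checks for when no party is registered.

```java
class StateLayout {
    static final int PARTIES_SHIFT = 16;
    static final int PHASE_SHIFT = 32;
    static final int UNARRIVED_MASK = 0xffff;
    static final long TERMINATION_BIT = 1L << 63;

    // Packs the three fields into one long, mirroring the layout described above.
    static long pack(int phase, int parties, int unarrived) {
        return ((long) phase << PHASE_SHIFT)
             | ((long) parties << PARTIES_SHIFT)
             | unarrived;
    }

    // High 32 bits; negative when the termination bit is set.
    static int phaseOf(long s) { return (int) (s >>> PHASE_SHIFT); }

    // Middle 16 bits: truncate to the low 32 bits, then shift.
    static int partiesOf(long s) { return (int) s >>> PARTIES_SHIFT; }

    // Low 16 bits.
    static int unarrivedOf(long s) { return (int) s & UNARRIVED_MASK; }

    public static void main(String[] args) {
        long s = pack(5, 10, 3);
        System.out.println(phaseOf(s) + " " + partiesOf(s) + " " + unarrivedOf(s));
    }
}
```

Keeping all three fields in one long is what allows a single CAS to update the phase and both counts atomically.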

Stack Node

The Treiber stacks, which are lock-free stacks, are held by the root node of the Phaser tree and shared by all other Phaser nodes:

// evenQ and oddQ are the head pointers of two lock-free Treiber stacks.
// Phaser parks waiting threads on them until the other participants of the
// phase arrive and the phaser advances to the next phase.
// evenQ serves even phases and oddQ serves odd phases, so the two alternate;
// child phasers share the stacks of the root phaser.
private final AtomicReference<QNode> evenQ;
private final AtomicReference<QNode> oddQ;
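For readers unfamiliar with the data structure, here is a minimal generic Treiber stack built on AtomicReference. It is a sketch of the general technique, not Phaser's internal code: Phaser pushes QNode objects in a similar way but releases waiters with its own logic. The class TreiberStack is a made-up name.

```java
import java.util.concurrent.atomic.AtomicReference;

class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // Push: link the new node to the current head, then CAS the head; retry on contention.
    void push(T value) {
        Node<T> node = new Node<>(value);
        Node<T> current;
        do {
            current = head.get();
            node.next = current;
        } while (!head.compareAndSet(current, node));
    }

    // Pop: CAS the head to its successor; returns null when the stack is empty.
    T pop() {
        Node<T> current;
        do {
            current = head.get();
            if (current == null) {
                return null;
            }
        } while (!head.compareAndSet(current, current.next));
        return current.value;
    }

    public static void main(String[] args) {
        TreiberStack<String> stack = new TreiberStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop() + stack.pop());
    }
}
```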

Each stack node (QNode) stores the waiting thread together with information about the Phaser it is waiting on:

static final class QNode implements ForkJoinPool.ManagedBlocker {
    final Phaser phaser;
    final int phase;
    final boolean interruptible;
    final boolean timed;
    boolean wasInterrupted;
    long nanos;
    final long deadline;
    volatile Thread thread; // nulled to cancel wait
    QNode next;

    QNode(Phaser phaser, int phase, boolean interruptible,
          boolean timed, long nanos) {
        this.phaser = phaser;
        this.phase = phase;
        this.interruptible = interruptible;
        this.nanos = nanos;
        this.timed = timed;
        this.deadline = timed ? System.nanoTime() + nanos : 0L;
        thread = Thread.currentThread();
    }

    // True when the waiter no longer needs to block: cancelled, phase advanced,
    // interrupted (in interruptible mode), or timed out
    public boolean isReleasable() {
        if (thread == null)
            return true;
        if (phaser.getPhase() != phase) {
            thread = null;
            return true;
        }
        if (Thread.interrupted())
            wasInterrupted = true;
        if (wasInterrupted && interruptible) {
            thread = null;
            return true;
        }
        if (timed &&
            (nanos <= 0L || (nanos = deadline - System.nanoTime()) <= 0L)) {
            thread = null;
            return true;
        }
        return false;
    }

    // Parks the current thread until it becomes releasable
    public boolean block() {
        while (!isReleasable()) {
            if (timed)
                LockSupport.parkNanos(this, nanos);
            else
                LockSupport.park(this);
        }
        return true;
    }
}

QNode implements ForkJoinPool.ManagedBlocker so that, when the blocked thread is a ForkJoinWorkerThread, ForkJoinPool can add a spare worker thread to preserve parallelism while that thread is blocked.
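The ManagedBlocker contract is small enough to show on its own. The sketch below wraps a CountDownLatch rather than a phaser; LatchBlocker and demo are hypothetical names, and the pool size of 1 is chosen only to show that a blocked worker does not have to stall the pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

class LatchBlocker implements ForkJoinPool.ManagedBlocker {
    private final CountDownLatch latch;

    LatchBlocker(CountDownLatch latch) { this.latch = latch; }

    // Checked before blocking: no need to block if the latch is already open.
    @Override
    public boolean isReleasable() { return latch.getCount() == 0; }

    // Performs the actual blocking; returning true means no further blocking is needed.
    @Override
    public boolean block() throws InterruptedException {
        latch.await();
        return true;
    }

    // Submits a task that blocks through managedBlock and reports whether it completed.
    static boolean demo() {
        CountDownLatch latch = new CountDownLatch(1);
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            ForkJoinTask<Boolean> waiter = pool.submit(() -> {
                ForkJoinPool.managedBlock(new LatchBlocker(latch)); // pool may compensate while blocked
                return true;
            });
            latch.countDown();
            return waiter.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + demo());
    }
}
```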

Constructors and core methods

public Phaser() { this(null, 0); }
public Phaser(int parties) { this(null, parties); }
public Phaser(Phaser parent) { this(parent, 0); }
public Phaser(Phaser parent, int parties)

// Register one new party
public int register()

// Register several parties at once
public int bulkRegister(int parties)

// Arrive at the phaser without waiting for other parties; returns the arrival phase number
public int arrive()

// Arrive at the phaser and deregister from it; returns the arrival phase number
public int arriveAndDeregister()

/*
 * Arrive at the phaser and wait for the other parties; equivalent to awaitAdvance(arrive()).
 * If you need to wait with interruption or a timeout, build a similar construction
 * with one of the other awaitAdvance forms.
 * If you need to deregister on arrival and then wait, use awaitAdvance(arriveAndDeregister()).
 */
public int arriveAndAwaitAdvance()

// Block until the phaser advances past the given phase; returns the next phase number
public int awaitAdvance(int phase)

// Interruptible and timed versions of awaitAdvance
public int awaitAdvanceInterruptibly(int phase) throws InterruptedException
public int awaitAdvanceInterruptibly(int phase, long timeout, TimeUnit unit)
    throws InterruptedException, TimeoutException

// Force the phaser into the terminated state; counts of registered parties are unaffected.
// In a tiered set of phasers, all phasers in the set are terminated.
public void forceTermination()

As you can see, every constructor delegates to Phaser(Phaser parent, int parties).

The internal implementation of Phaser(Phaser parent, int parties) is as follows. The key step is assigning a parent node to the current Phaser object. If the current Phaser has a nonzero number of participants, it also registers one participant with the parent Phaser (representing the current node itself).

public Phaser(Phaser parent, int parties) {
    // A nonzero result of the unsigned 16-bit right shift means parties exceeds the maximum
    if (parties >>> PARTIES_SHIFT != 0)
        throw new IllegalArgumentException("Illegal number of parties");
    int phase = 0; // initially 0
    this.parent = parent;
    if (parent != null) { // a parent node exists
        final Phaser root = parent.root;
        this.root = root;         // point at the root node of the tree
        this.evenQ = root.evenQ;  // share the even lock-free stack of the root
        this.oddQ = root.oddQ;    // share the odd lock-free stack of the root
        if (parties != 0)         // register one participant with the parent if this phaser has parties
            phase = parent.doRegister(1);
    }
    else {
        this.root = this;         // without a parent, the root is this phaser itself
        this.evenQ = new AtomicReference<QNode>(); // create the even lock-free stack
        this.oddQ = new AtomicReference<QNode>();  // create the odd lock-free stack
    }
    // Initialize the synchronization state
    this.state = (parties == 0) ? (long)EMPTY :
        ((long)phase << PHASE_SHIFT) |
        ((long)parties << PARTIES_SHIFT) |
        ((long)parties);
}
Registered Participants

The set of parties registered on a phaser may change over time. Tasks can be registered at any time (through register or bulkRegister, or through the constructor, which sets the initial parties), and can be deregistered upon any arrival (through arriveAndDeregister). As with most basic synchronization constructs, registration and deregistration only affect internal counts; no further bookkeeping is kept, so tasks cannot query whether they have been registered.

Phaser provides two ways to register participants:

  • register: registers a single participant
  • bulkRegister: registers participants in bulk

Both call the doRegister method internally:

/**
 * Registers the specified number of participants.
 */
private int doRegister(int registrations) {
    // Amount to add to state; this increases both parties and unarrived
    long adjust = ((long)registrations << PARTIES_SHIFT) | registrations;
    final Phaser parent = this.parent;
    int phase;
    for (;;) {
        // reconcileState() brings the state of this phaser in line with the root
        long s = (parent == null) ? state : reconcileState();
        int counts = (int)s;                     // low 32 bits: parties and unarrived
        int parties = counts >>> PARTIES_SHIFT;  // number of registered parties
        int unarrived = counts & UNARRIVED_MASK; // number of unarrived parties
        if (registrations > MAX_PARTIES - parties)
            throw new IllegalStateException(badRegister(s));
        phase = (int)(s >>> PHASE_SHIFT);        // current phase
        if (phase < 0)                           // terminated
            break;
        if (counts != EMPTY) {
            // CASE 1: not the first registration on this phaser
            if (parent == null || reconcileState() == s) {
                // unarrived == 0 means all parties have arrived and an advance is in
                // progress; wait it out in internalAwaitAdvance, then retry the loop
                if (unarrived == 0)
                    root.internalAwaitAdvance(phase, null);
                else if (STATE.compareAndSet(this, s, s + adjust)) // otherwise CAS the state directly
                    break;
            }
        }
        else if (parent == null) {
            // CASE 2: first registration, and this phaser has no parent
            long next = ((long)phase << PHASE_SHIFT) | adjust;
            if (STATE.compareAndSet(this, s, next))
                break;
        }
        else {
            // CASE 3: first registration on a phaser that has a parent (tiered registration)
            synchronized (this) {
                if (state == s) {                 // recheck under the lock
                    phase = parent.doRegister(1); // register this node with its parent
                    if (phase < 0)
                        break;
                    // Finish the registration, re-reading the root phase after each failed CAS
                    while (!STATE.compareAndSet
                           (this, s, ((long)phase << PHASE_SHIFT) | adjust)) {
                        s = state;
                        phase = (int)(root.state >>> PHASE_SHIFT);
                    }
                    break;
                }
            }
        }
    }
    return phase;
}

The doRegister method registers participants for the current Phaser object in three main branches:

(1) The current Phaser has registered participants
If all the participants have already arrived at the barrier, the current thread must wait, because the phase is in the middle of advancing to the next value. Otherwise, the state is updated directly with a CAS.

(2) Current Phaser has no participants (first registration) and no parent node
Update the State value of the current Phaser directly.

(3) Current Phaser has no participants (first registration) and has a parent node
This indicates that the current Phaser is a newly added leaf node, which needs to register itself with the parent node and update its State value.
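The arithmetic behind branch (1) can be tried outside of Phaser. The sketch below applies the same adjust value to an AtomicLong in a CAS retry loop; RegisterSketch is a made-up name, and the sketch deliberately leaves out the phase checks, the EMPTY value, and the parent handling of the real method.

```java
import java.util.concurrent.atomic.AtomicLong;

class RegisterSketch {
    static final int PARTIES_SHIFT = 16;
    static final int UNARRIVED_MASK = 0xffff;

    // Adds `registrations` to both the parties field and the unarrived field
    // with a single CAS, retrying if another thread changed the state in between.
    static long register(AtomicLong state, int registrations) {
        long adjust = ((long) registrations << PARTIES_SHIFT) | registrations;
        for (;;) {
            long s = state.get();
            if (state.compareAndSet(s, s + adjust)) {
                return s + adjust;
            }
        }
    }

    static int partiesOf(long s) { return (int) s >>> PARTIES_SHIFT; }

    static int unarrivedOf(long s) { return (int) s & UNARRIVED_MASK; }

    public static void main(String[] args) {
        AtomicLong state = new AtomicLong((2L << PARTIES_SHIFT) | 1); // 2 parties, 1 unarrived
        long s = register(state, 3);
        System.out.println(partiesOf(s) + " parties, " + unarrivedOf(s) + " unarrived");
    }
}
```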

Note: the reconcileState method matters only when phasers form a tree. The root node always advances its phase first, so a child node must explicitly reconcile its own state to stay consistent with the root node.

/*
 * Brings the state of this phaser in line with the root. In a tree, the root
 * is the first to advance, so children must explicitly reconcile with it.
 */
private long reconcileState() {
    final Phaser root = this.root;
    long s = state;
    if (root != this) {
        int phase, p;
        // Spin with CAS until this phaser carries the phase of the root,
        // keeping its parties and resetting unarrived
        while ((phase = (int)(root.state >>> PHASE_SHIFT)) !=
               (int)(s >>> PHASE_SHIFT) &&
               !STATE.weakCompareAndSet
               (this, s,
                s = (((long)phase << PHASE_SHIFT) |
                     ((phase < 0) ? (s & COUNTS_MASK) :
                      (((p = (int)s >>> PARTIES_SHIFT) == 0) ? EMPTY :
                       ((s & PARTIES_MASK) | p))))))
            s = state;
    }
    return s;
}

The blocking wait itself is done by the internalAwaitAdvance method, which wraps the thread in a node and pushes it onto one of the lock-free stacks of the root node, chosen by the current phase:

/**
 * The current participant thread waits for the phaser to move to the next
 * phase, that is, for the phase value to change.
 * @return the new phase
 */
private int internalAwaitAdvance(int phase, QNode node) {
    // assert root == this;
    releaseWaiters(phase - 1);        // clean out the stack of the previous phase (odd and even alternate)
    boolean queued = false;           // true once the node is on the stack
    int lastUnarrived = 0;            // used to extend spinning when arrivals are observed
    int spins = SPINS_PER_ARRIVAL;
    long s;
    int p;
    // Loop while the phase is unchanged; once it changes there is nothing left to wait for
    while ((p = (int)((s = state) >>> PHASE_SHIFT)) == phase) {
        if (node == null) {           // spinning in noninterruptible mode
            int unarrived = (int)s & UNARRIVED_MASK; // number of unarrived parties
            // unarrived changed, so spin a little longer
            if (unarrived != lastUnarrived &&
                (lastUnarrived = unarrived) < NCPU)
                spins += SPINS_PER_ARRIVAL;
            boolean interrupted = Thread.interrupted();
            if (interrupted || --spins < 0) { // out of spins: create a node
                node = new QNode(this, phase, false, false, 0L);
                node.wasInterrupted = interrupted;
            }
        }
        else if (node.isReleasable()) // done or aborted
            break;
        else if (!queued) {           // push the node onto the stack
            AtomicReference<QNode> head = (phase & 1) == 0 ? evenQ : oddQ;
            QNode q = node.next = head.get();
            if ((q == null || q.phase == phase) &&
                (int)(state >>> PHASE_SHIFT) == phase) // avoid stale enq
                queued = head.compareAndSet(q, node);
        }
        else {
            try {
                ForkJoinPool.managedBlock(node); // block and wait
            } catch (InterruptedException ie) {
                node.wasInterrupted = true;
            }
        }
    }

    if (node != null) {               // the thread has been woken up
        if (node.thread != null)
            node.thread = null;       // avoid need for unpark()
        if (node.wasInterrupted && !node.interruptible)
            Thread.currentThread().interrupt();
        if (p == phase && (p = (int)(state >>> PHASE_SHIFT)) == phase)
            return abortWait(phase);  // remove nodes that stopped waiting due to timeout or interruption
    }
    releaseWaiters(phase);            // wake up the threads blocked on this phase
    return p;
}
Participants arrive and wait

arriveAndAwaitAdvance makes the current thread arrive at the phaser and wait for the other parties to arrive; it is equivalent to awaitAdvance(arrive()).

public int arriveAndAwaitAdvance() {
    // Specialization of doArrive+awaitAdvance eliminating some reads/paths
    final Phaser root = this.root;
    for (;;) {
        long s = (root == this) ? state : reconcileState();
        int phase = (int)(s >>> PHASE_SHIFT);
        if (phase < 0)
            return phase;
        int counts = (int)s;
        int unarrived = (counts == EMPTY) ? 0 : (counts & UNARRIVED_MASK); // number of unarrived parties
        if (unarrived <= 0)
            throw new IllegalStateException(badArrive(s));
        if (STATE.compareAndSet(this, s, s -= ONE_ARRIVAL)) { // record the arrival
            if (unarrived > 1) // not the last to arrive: wait for the other parties
                return root.internalAwaitAdvance(phase, null);
            if (root != this)  // last to arrive at a child phaser: hand over to the parent
                return parent.arriveAndAwaitAdvance();
            long n = s & PARTIES_MASK;  // base of next state
            int nextUnarrived = (int)n >>> PARTIES_SHIFT; // parties expected in the next phase
            // All parties have arrived; onAdvance() returning true terminates the phaser.
            // By default it returns true when no registered parties remain.
            if (onAdvance(phase, nextUnarrived))
                n |= TERMINATION_BIT;
            else if (nextUnarrived == 0)
                n |= EMPTY;
            else
                n |= nextUnarrived;
            int nextPhase = (phase + 1) & MAX_PHASE; // compute the next phase
            n |= (long)nextPhase << PHASE_SHIFT;
            if (!STATE.compareAndSet(this, s, n))
                return (int)(state >>> PHASE_SHIFT); // terminated
            releaseWaiters(phase); // release the threads that waited on the previous phase
            return nextPhase;
        }
    }
}

The method first decrements the unarrived count in state by 1, and then checks whether the current thread was the last party to arrive.

If it was not, the current thread waits (spinning briefly, then blocking) until the remaining participants arrive.

If it was, and the Phaser has a parent node, the method is called recursively on the parent node, because the phase advances only when the unarrived count of the root node reaches 0. If there is no parent, the current thread advances the phase itself and releases the waiting threads.
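From the caller's side, this behaves like a reusable barrier. The following sketch checks that no worker gets past arriveAndAwaitAdvance until every worker has finished its phase 0 work. BarrierDemo and the counters are illustrative names.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierDemo {
    // Returns true if every worker observed all phase-0 work as done after the barrier.
    static boolean run(int workers) {
        Phaser phaser = new Phaser(workers);
        AtomicInteger done = new AtomicInteger();    // workers that finished phase-0 work
        AtomicInteger sawAll = new AtomicInteger();  // workers that saw everyone done afterwards
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                done.incrementAndGet();              // phase-0 work
                phaser.arriveAndAwaitAdvance();      // wait for the other workers
                if (done.get() == workers) {
                    sawAll.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sawAll.get() == workers && phaser.getPhase() == 1;
    }

    public static void main(String[] args) {
        System.out.println("barrier held: " + run(4));
    }
}
```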

Non-blocking participant arrival
// Arrive at the phaser without waiting for other parties; returns the arrival phase number
public int arrive() {
    return doArrive(ONE_ARRIVAL);
}

private int doArrive(int adjust) {
    final Phaser root = this.root;
    for (;;) {
        long s = (root == this) ? state : reconcileState();
        int phase = (int)(s >>> PHASE_SHIFT);
        if (phase < 0)
            return phase;
        int counts = (int)s;
        // Number of unarrived parties
        int unarrived = (counts == EMPTY) ? 0 : (counts & UNARRIVED_MASK);
        if (unarrived <= 0)
            throw new IllegalStateException(badArrive(s));
        if (STATE.compareAndSet(this, s, s -= adjust)) { // record the arrival
            if (unarrived == 1) { // this was the last unarrived party
                long n = s & PARTIES_MASK;  // base of next state
                int nextUnarrived = (int)n >>> PARTIES_SHIFT; // parties expected in the next phase
                if (root == this) { // no parent: this phaser is the root
                    if (onAdvance(phase, nextUnarrived)) // check whether the phaser should terminate
                        n |= TERMINATION_BIT;
                    else if (nextUnarrived == 0)
                        n |= EMPTY;
                    else
                        n |= nextUnarrived;
                    int nextPhase = (phase + 1) & MAX_PHASE;
                    n |= (long)nextPhase << PHASE_SHIFT;
                    STATE.compareAndSet(this, s, n);
                    releaseWaiters(phase); // release the threads waiting on this phase
                }
                // Tiered structure: the arrival is propagated to the parent
                else if (nextUnarrived == 0) {
                    // No parties remain, so deregister this phaser from its parent
                    phase = parent.doArrive(ONE_DEREGISTER);
                    STATE.compareAndSet(this, s, s | EMPTY);
                }
                else
                    phase = parent.doArrive(ONE_ARRIVAL); // report one arrival to the parent
            }
            return phase;
        }
    }
}

awaitAdvance blocks the calling thread until the phaser advances past the given phase, and returns the next phase number. If the phaser is already at a different phase, it returns immediately; a negative argument is returned as-is.

public int awaitAdvance(int phase) {
    final Phaser root = this.root;
    long s = (root == this) ? state : reconcileState();
    int p = (int)(s >>> PHASE_SHIFT);
    if (phase < 0)
        return phase;
    if (p == phase)
        return root.internalAwaitAdvance(phase, null);
    return p;
}
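The two early-return paths are easy to observe without any blocking. In this sketch (AwaitAdvanceDemo is an illustrative name), the single party has already arrived, so the phaser is past the phase being waited for.

```java
import java.util.concurrent.Phaser;

class AwaitAdvanceDemo {
    // Returns {arrival phase, result of awaitAdvance(arrival phase), result of awaitAdvance(-1)}.
    static int[] run() {
        Phaser phaser = new Phaser(1);
        int arrivalPhase = phaser.arrive();              // the only party arrives: the phase advances
        int next = phaser.awaitAdvance(arrivalPhase);    // already past that phase: returns at once
        int negative = phaser.awaitAdvance(-1);          // a negative argument is returned unchanged
        return new int[] {arrivalPhase, next, negative};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```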
Summary
  1. Phaser is suitable for multi-phase, multi-task scenarios, where each phase of the tasks can be controlled very finely.
  2. Phaser implements its entire logic internally with a state variable and lock-free stacks.
  3. The state stores the current phase in its high 32 bits (bits 32-62 hold the phase and bit 63 is the termination flag), the number of participants (tasks) in the middle 16 bits, and the number of unarrived participants in the low 16 bits.
  4. One of two lock-free stacks is selected based on the parity of the current phase.
  5. A participant that is not the last to arrive spins and then queues, waiting for all participants to complete the phase.
  6. The last participant to arrive wakes up the threads in the queue and moves the phaser to the next phase.

3 October 2021, 12:25 | Views: 7215
