Some observations and takeaways from studying Disruptor

1, Foreword

  A couple of days ago, while digging through log4j2, I ran into the Disruptor. I wondered whether I could pick up some performance-optimization ideas from this wildly popular concurrency framework, and reading even a little of its source code turned out to be genuinely rewarding.

2, RingBuffer

  After a general look through the code, almost every interesting point traces back to RingBuffer, so that is the title of this section.

  1. During initialization, the Disruptor creates a RingBuffer of a specified size. Once created, the size is fixed and is never expanded automatically, avoiding the performance cost of repeated expansion. [Contrast this with List, Map, StringBuilder, StringBuffer and other auto-expanding containers.]
     
  2. The core storage in RingBuffer is the entries array. When the RingBuffer is initialized, it pre-allocates event objects [much like an object pool] to fill the array; each element is a user-defined event object. You can see this logic in RingBufferFields, the parent class of RingBuffer:
RingBufferFields(
    EventFactory<E> eventFactory,
    Sequencer sequencer)
{
    this.sequencer = sequencer;
    this.bufferSize = sequencer.getBufferSize();
    if (bufferSize < 1)
    {
        throw new IllegalArgumentException("bufferSize must not be less than 1");
    }
    if (Integer.bitCount(bufferSize) != 1)
    {
        throw new IllegalArgumentException("bufferSize must be a power of 2");
    }
    this.indexMask = bufferSize - 1;
    this.entries = new Object[sequencer.getBufferSize() + 2 * BUFFER_PAD];
    // Pre-allocate one event object per slot
    fill(eventFactory);
}
private void fill(EventFactory<E> eventFactory)
{
    for (int i = 0; i < bufferSize; i++)
    {
        entries[BUFFER_PAD + i] = eventFactory.newInstance();
    }
}
  3. Unlike ring structures built on linked nodes, RingBuffer is implemented on top of an array [which makes full use of the CPU cache line, giving high performance]. It has no tail pointer, only a sequence pointing to the next available slot.
     
  4. The core idea behind wrapping writes around the RingBuffer is to take the sequence modulo bufferSize; what is actually used is a bitwise AND [compare the array-slot calculation in HashMap]. Because bufferSize is a power of two, the bitwise operation is much cheaper than a mod instruction.
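The bitmask trick above can be sketched in a few lines. This is a minimal illustration, not Disruptor's actual source; the class and method names are mine:

```java
// Sketch: mapping an ever-increasing sequence onto a fixed-size ring.
public class RingIndex {
    // bufferSize must be a power of 2, so (bufferSize - 1) is an all-ones mask
    static int index(long sequence, int bufferSize) {
        // equivalent to sequence % bufferSize, but a single AND instruction
        return (int) (sequence & (bufferSize - 1));
    }

    public static void main(String[] args) {
        int bufferSize = 8;
        // sequences 0..9 wrap around the 8-slot ring: 0..7, then 0, 1 again
        for (long seq = 0; seq < 10; seq++) {
            System.out.println(seq + " -> " + index(seq, bufferSize));
        }
    }
}
```

This is also exactly why the constructor rejects any bufferSize where `Integer.bitCount(bufferSize) != 1`: the mask only works for powers of two.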
     
  5. RingBuffer never actively clears entries that have been consumed; it simply overwrites the previous data. The overwrite just resets the fields of the user-defined event object, so objects are reused and GC pressure is reduced.
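The object-reuse point can be made concrete with a tiny sketch (names are illustrative, not Disruptor's API): "overwriting" a slot mutates the fields of the preallocated event in place, and the object itself is never replaced.

```java
// Sketch: publishing to a slot reuses the slot's existing event object.
public class ReuseSketch {
    static class Event {
        long value;
    }

    // overwrite the fields of the preallocated event; allocate nothing
    static Event publish(Event[] ring, int index, long value) {
        Event slot = ring[index];
        slot.value = value;
        return slot;
    }

    public static void main(String[] args) {
        Event[] ring = { new Event(), new Event() };
        Event first = publish(ring, 0, 42);
        Event second = publish(ring, 0, 99); // a later "lap" around the ring
        System.out.println(first == second); // true: same object, no garbage created
    }
}
```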
     
  6. Disruptor makes heavy use of the CPU cache line. To avoid false sharing, you can see many long fields used purely for padding in the hot Sequence and RingBuffer classes; Sequence places seven padding longs on each side of the value to guarantee it sits on its own cache line:
class LhsPadding
{
    protected long p1, p2, p3, p4, p5, p6, p7;
}

class Value extends LhsPadding
{
    protected volatile long value;
}

class RhsPadding extends Value
{
    protected long p9, p10, p11, p12, p13, p14, p15;
}

The same operation is performed in RingBuffer:

abstract class RingBufferPad
{
    protected long p1, p2, p3, p4, p5, p6, p7;
}

abstract class RingBufferFields<E> extends RingBufferPad
{
    private static final int BUFFER_PAD;
    private static final long REF_ARRAY_BASE;
    private static final int REF_ELEMENT_SHIFT;
    private static final Unsafe UNSAFE = Util.getUnsafe();
    ....
}

public final class RingBuffer<E> extends RingBufferFields<E> implements Cursored, EventSequencer<E>, EventSink<E>
{
    public static final long INITIAL_CURSOR_VALUE = Sequence.INITIAL_VALUE;
    protected long p1, p2, p3, p4, p5, p6, p7;
    ....
}
  7. Then there is Disruptor's other big idea: going lock-free. Apart from the ReentrantLock inside some WaitStrategy implementations, there are essentially no lock operations in Disruptor, only CAS. In a multithreaded environment, more threads means fiercer contention and higher locking overhead. This brings to mind other applications of lock-free or single-threaded serialization, such as Redis and Netty.
    There are other lock-free tools in the same spirit: the java.util.concurrent.atomic package and the Amino framework.

3, Ending

  That's about it. If this gave you some inspiration, please follow me ➕.

Tags: Java Optimize

Posted on Sun, 28 Nov 2021 14:06:04 -0500 by bivaughn