Implementing a ByteBuffer with incremental capacity allocation

Preface

Java NIO's ByteBuffer is commonly used to buffer data. If, purely for convenience, we allocate a large ByteBuffer for every read or write operation, JVM heap space is wasted whenever the actual data is small. If we instead request many small ByteBuffers on demand, the caller has to manage and coordinate them. So how can we get the benefits of both approaches, so that heap space is not wasted and the business code does not have to carry complicated ByteBuffer management logic? This article describes the incremental ByteBuffer implemented inside Ozone. For external use it exposes operations with the same semantics as the native ByteBuffer (position, limit, remaining, put and so on), while the incremental allocation of capacity is completely transparent to the caller.
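To make the trade-off concrete, here is a minimal sketch that compares how many heap bytes each strategy allocates when the caller writes only a small amount of data. The class `AllocationTradeoff`, its two methods, and the sizes in `main` are hypothetical and chosen only for illustration; they are not taken from Ozone.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Ozone. */
final class AllocationTradeoff {

  /** Strategy 1: one buffer of the full capacity, allocated up front. */
  static int eagerAllocatedBytes(int capacity, int bytesWritten) {
    final ByteBuffer buffer = ByteBuffer.allocate(capacity);
    buffer.put(new byte[bytesWritten]);
    return buffer.capacity();
  }

  /** Strategy 2: allocate fixed-size pieces only as data arrives. */
  static int lazyAllocatedBytes(int capacity, int increment, int bytesWritten) {
    if (increment <= 0 || bytesWritten < 0 || bytesWritten > capacity) {
      throw new IllegalArgumentException("invalid sizes");
    }
    final List<ByteBuffer> pieces = new ArrayList<>();
    int written = 0;
    while (written < bytesWritten) {
      // The last piece may be smaller so the total never exceeds capacity.
      final int size = Math.min(increment, capacity - pieces.size() * increment);
      final ByteBuffer piece = ByteBuffer.allocate(size);
      final int n = Math.min(size, bytesWritten - written);
      piece.put(new byte[n]);
      pieces.add(piece);
      written += n;
    }
    return pieces.stream().mapToInt(ByteBuffer::capacity).sum();
  }

  public static void main(String[] args) {
    final int capacity = 4 * 1024 * 1024;  // 4 MB target capacity (arbitrary)
    final int increment = 64 * 1024;       // 64 KB pieces (arbitrary)
    final int bytesWritten = 100 * 1024;   // the caller writes only 100 KB
    // prints "eager: 4194304"
    System.out.println("eager: " + eagerAllocatedBytes(capacity, bytesWritten));
    // prints "lazy:  131072"
    System.out.println("lazy:  "
        + lazyAllocatedBytes(capacity, increment, bytesWritten));
  }
}
```

With these numbers the eager strategy holds 4 MB for a 100 KB write, while the lazy strategy holds two 64 KB pieces. The cost of the lazy strategy is the bookkeeping over the list of pieces, which is exactly what an incremental buffer hides from the caller.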

Incremental ByteBuffer implementation inside Ozone

First, some brief background on Ozone. Ozone is an object storage system. When storing objects, it writes a large number of small data objects and stores them physically as chunk files. While reading and writing chunk data, Ozone uses ByteBuffer to hold intermediate data. In the initial implementation, Ozone allocated a large ByteBuffer up front regardless of how much data the user actually wrote. To address this, the community later implemented a buffer named IncrementalChunkBuffer, which takes a target capacity (the limit) together with an increment size and allocates its underlying ByteBuffers incrementally.
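The target capacity and the increment together determine how many underlying ByteBuffers can exist and how large each one is. The sketch below works through that arithmetic. The class `ChunkLayout` and its method `capacities` are hypothetical names introduced for illustration only.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Ozone. */
final class ChunkLayout {

  /** @return the capacity of every piece needed to cover {@code limit} bytes. */
  static int[] capacities(int limit, int increment) {
    if (limit < 0 || increment <= 0) {
      throw new IllegalArgumentException(
          "limit = " + limit + ", increment = " + increment);
    }
    final int full = limit / increment;  // number of full-size pieces
    final int rest = limit % increment;  // size of the trailing partial piece
    final int[] result = new int[full + (rest == 0 ? 0 : 1)];
    Arrays.fill(result, 0, full, increment);
    if (rest != 0) {
      result[full] = rest;
    }
    return result;
  }

  public static void main(String[] args) {
    // prints "[4, 4, 2]": two full pieces plus a 2-byte tail
    System.out.println(Arrays.toString(capacities(10, 4)));
    // prints "[4, 4]": the limit is an exact multiple of the increment
    System.out.println(Arrays.toString(capacities(8, 4)));
  }
}
```

Note that when the limit is an exact multiple of the increment there is no trailing partial piece, so the number of pieces is limit / increment rather than limit / increment + 1.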

The following is the implementation of IncrementalChunkBuffer, extracted from the Ozone project:

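// Imports needed by this class. The JDK and Guava imports are standard.
// The ByteString package below (the protobuf shaded by Ratis) is an assumption
// about the Ozone build; ChunkBuffer is assumed to live in the same package.
import com.google.common.base.Preconditions;
import org.apache.ratis.thirdparty.com.google.protobuf.ByteString;

import java.io.IOException;
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Function;

/**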
 * Use a list of {@link ByteBuffer} to implement a single {@link ChunkBuffer}
 * so that the buffer can be allocated incrementally.
 */
final class IncrementalChunkBuffer implements ChunkBuffer {
  /**
   * Limit (total capacity) of the logical buffer as a whole.
   */
  private final int limit;
  /** Capacity of each incrementally allocated ByteBuffer. */
  private final int increment;
  /** Index boundary of the buffer list: limit / increment. */
  private final int limitIndex;
  /** The list of incrementally allocated ByteBuffers. */
  private final List<ByteBuffer> buffers;
  /** Is this a duplicated buffer? (for debug only) */
  private final boolean isDuplicated;
  
  // limit: total capacity of the incremental buffer;
  // increment: size of each incremental allocation
  IncrementalChunkBuffer(int limit, int increment, boolean isDuplicated) {
    Preconditions.checkArgument(limit >= 0);
    Preconditions.checkArgument(increment > 0);
    // Store the limit and the increment
    this.limit = limit;
    this.increment = increment;
    // Compute the index boundary of the ByteBuffer list
    this.limitIndex = limit/increment;
    // Start with an empty list, sized for the maximum number of ByteBuffers
    this.buffers = new ArrayList<>(limitIndex + (limit%increment == 0? 0: 1));
    this.isDuplicated = isDuplicated;
  }

  /** @return the capacity for the buffer at the given index. */
  private int getBufferCapacityAtIndex(int i) {
    Preconditions.checkArgument(i >= 0);
    Preconditions.checkArgument(i <= limitIndex);
    return i < limitIndex? increment: limit%increment;
  }

  private void assertInt(int expected, int computed, String name, int i) {
    ChunkBuffer.assertInt(expected, computed,
        () -> this + ": Unexpected " + name + " at index " + i);
  }

  /** @return the i-th buffer if it exists; otherwise, return null. */
  private ByteBuffer getAtIndex(int i) {
    Preconditions.checkArgument(i >= 0);
    Preconditions.checkArgument(i <= limitIndex);
    final ByteBuffer ith = i < buffers.size() ? buffers.get(i) : null;
    if (ith != null) {
      // assert limit/capacity
      if (!isDuplicated) {
        assertInt(getBufferCapacityAtIndex(i), ith.capacity(), "capacity", i);
      } else {
        if (i < limitIndex) {
          assertInt(increment, ith.capacity(), "capacity", i);
        } else if (i == limitIndex) {
          assertInt(getBufferCapacityAtIndex(i), ith.limit(), "capacity", i);
        } else {
          assertInt(0, ith.limit(), "capacity", i);
        }
      }
    }
    return ith;
  }

  /** @return the i-th buffer. It may allocate buffers. */
  private ByteBuffer getAndAllocateAtIndex(int index) {
    Preconditions.checkArgument(index >= 0);
    // never allocate over limit
    if (limit % increment == 0) {
      Preconditions.checkArgument(index < limitIndex);
    } else {
      Preconditions.checkArgument(index <= limitIndex);
    }

    int i = buffers.size();
    if (index < i) {
      return getAtIndex(index);
    }

    // allocate upto the given index
    ByteBuffer b = null;
    for (; i <= index; i++) {
      b = ByteBuffer.allocate(getBufferCapacityAtIndex(i));
      buffers.add(b);
    }
    return b;
  }

  /** @return the buffer containing the position. It may allocate buffers. */
  private ByteBuffer getAndAllocateAtPosition(int position) {
    Preconditions.checkArgument(position >= 0);
    Preconditions.checkArgument(position < limit);
    // Compute the index of the ByteBuffer that contains this position
    final int i = position / increment;
    // Get the ByteBuffer at this index, allocating it if it does not exist yet
    final ByteBuffer ith = getAndAllocateAtIndex(i);
    assertInt(position%increment, ith.position(), "position", i);
    return ith;
  }

  /** @return the index of the first non-full buffer. */
  private int firstNonFullIndex() {
    for (int i = 0; i < buffers.size(); i++) {
      if (getAtIndex(i).position() != increment) {
        return i;
      }
    }
    return buffers.size();
  }

  @Override
  public int position() {
    // The buffers list must be in the following orders:
    // full buffers, buffer containing the position, empty buffers, null buffers
    // 1) Find the index of the first ByteBuffer that is not full
    final int i = firstNonFullIndex();
    // 2) Get the ByteBuffer at that index (null if not allocated yet)
    final ByteBuffer ith = getAtIndex(i);
    // 3) Global position = total size of the full ByteBuffers before it
    //    + the position inside the first non-full ByteBuffer
    final int position = i * increment + Optional.ofNullable(ith)
        .map(ByteBuffer::position).orElse(0);
    // remaining buffers must be empty
    assert assertRemainingList(ith, i);
    return position;
  }

  private boolean assertRemainingList(ByteBuffer ith, int i) {
    if (ith != null) {
      // buffers must be empty
      for (i++; i < buffers.size(); i++) {
        ith = getAtIndex(i);
        if (ith == null) {
          break; // found the first null
        }
        assertInt(0, ith.position(), "position", i);
      }
    }
    final int j = i;
    ChunkBuffer.assertInt(buffers.size(), i,
        () -> "i = " + j + " != buffers.size() = " + buffers.size());
    return true;
  }

  @Override
  public int remaining() {
    // Same semantics as ByteBuffer.remaining()
    return limit - position();
  }

  @Override
  public int limit() {
    return limit;
  }

  @Override
  public ChunkBuffer rewind() {
    buffers.forEach(ByteBuffer::rewind);
    return this;
  }

  @Override
  public ChunkBuffer clear() {
    buffers.forEach(ByteBuffer::clear);
    return this;
  }

  @Override
  public ChunkBuffer put(ByteBuffer that) {
    // 1) If the data remaining in the source buffer exceeds the space remaining
    //    in this incremental buffer, throw a BufferOverflowException
    if (that.remaining() > this.remaining()) {
      final BufferOverflowException boe = new BufferOverflowException();
      boe.initCause(new IllegalArgumentException(
          "Failed to put since that.remaining() = " + that.remaining()
              + " > this.remaining() = " + this.remaining()));
      throw boe;
    }

    // 2) Remember the original limit of the source buffer
    final int thatLimit = that.limit();
    for (int p = position(); that.position() < thatLimit;) {
      // 3) Get (allocating if necessary) the internal ByteBuffer containing p
      final ByteBuffer b = getAndAllocateAtPosition(p);
      // 4) Copy at most the smaller of: the space left in b and
      //    the data left in the source buffer
      final int min = Math.min(b.remaining(), thatLimit - that.position());
      that.limit(that.position() + min);
      // Write the data into the internal ByteBuffer
      b.put(that);
      // Advance the global position of the incremental buffer
      p += min;
    }
    return this;
  }

  @Override
  public ChunkBuffer duplicate(int newPosition, int newLimit) {
    Preconditions.checkArgument(newPosition >= 0);
    Preconditions.checkArgument(newPosition <= newLimit);
    Preconditions.checkArgument(newLimit <= limit);
    final IncrementalChunkBuffer duplicated = new IncrementalChunkBuffer(
        newLimit, increment, true);

    final int pi = newPosition / increment;
    final int pr = newPosition % increment;
    final int li = newLimit / increment;
    final int lr = newLimit % increment;
    final int newSize = lr == 0? li: li + 1;

    for (int i = 0; i < newSize; i++) {
      final int pos = i < pi ? increment : i == pi ? pr : 0;
      final int lim = i < li ? increment : i == li ? lr : 0;
      duplicated.buffers.add(duplicate(i, pos, lim));
    }
    return duplicated;
  }

  private ByteBuffer duplicate(int i, int pos, int lim) {
    final ByteBuffer ith = getAtIndex(i);
    Objects.requireNonNull(ith, () -> "buffers[" + i + "] == null");
    final ByteBuffer b = ith.duplicate();
    b.position(pos).limit(lim);
    return b;
  }

  /** Support only when bufferSize == increment. */
  @Override
  public Iterable<ByteBuffer> iterate(int bufferSize) {
    if (bufferSize != increment) {
      throw new UnsupportedOperationException(
          "Buffer size and increment mismatched: bufferSize = " + bufferSize
          + " but increment = " + increment);
    }
    return asByteBufferList();
  }

  @Override
  public List<ByteBuffer> asByteBufferList() {
    return Collections.unmodifiableList(buffers);
  }

  @Override
  public long writeTo(GatheringByteChannel channel) throws IOException {
    return channel.write(buffers.toArray(new ByteBuffer[0]));
  }

  @Override
  public ByteString toByteStringImpl(Function<ByteBuffer, ByteString> f) {
    return buffers.stream().map(f).reduce(ByteString.EMPTY, ByteString::concat);
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    } else if (!(obj instanceof IncrementalChunkBuffer)) {
      return false;
    }
    final IncrementalChunkBuffer that = (IncrementalChunkBuffer)obj;
    return this.limit == that.limit && this.buffers.equals(that.buffers);
  }

  @Override
  public int hashCode() {
    return buffers.hashCode();
  }

  @Override
  public String toString() {
    return getClass().getSimpleName()
        + ":limit=" + limit + ",increment=" + increment;
  }
}
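The core idea of the class above, allocating pieces on demand inside put while reporting a single global position, can be condensed into a much smaller sketch. The class `MiniIncrementalBuffer` below is a hypothetical, simplified variant written for illustration. It is not the Ozone class: it omits duplicate, iterate, writeTo and the internal assertions, and it computes the position by summing the positions of the pieces rather than searching for the first non-full one.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified variant for illustration; not the Ozone class. */
final class MiniIncrementalBuffer {
  private final int limit;
  private final int increment;
  private final List<ByteBuffer> buffers = new ArrayList<>();

  MiniIncrementalBuffer(int limit, int increment) {
    if (limit < 0 || increment <= 0) {
      throw new IllegalArgumentException(
          "limit = " + limit + ", increment = " + increment);
    }
    this.limit = limit;
    this.increment = increment;
  }

  /** Global position: pieces fill in order, so their positions add up. */
  int position() {
    int p = 0;
    for (ByteBuffer b : buffers) {
      p += b.position();
    }
    return p;
  }

  /** @return the number of pieces allocated so far. */
  int allocatedBuffers() {
    return buffers.size();
  }

  /** Copies all remaining bytes of src, allocating pieces on demand. */
  MiniIncrementalBuffer put(ByteBuffer src) {
    if (src.remaining() > limit - position()) {
      throw new BufferOverflowException();
    }
    final int srcLimit = src.limit();
    int p = position();
    while (src.position() < srcLimit) {
      final int index = p / increment;
      if (index == buffers.size()) {
        // Allocate the next piece; cap the last one so the total
        // capacity never exceeds the limit.
        buffers.add(ByteBuffer.allocate(
            Math.min(increment, limit - index * increment)));
      }
      final ByteBuffer target = buffers.get(index);
      final int n = Math.min(target.remaining(), srcLimit - src.position());
      src.limit(src.position() + n);  // expose only n bytes to the bulk put
      target.put(src);
      p += n;
    }
    return this;
  }

  public static void main(String[] args) {
    final MiniIncrementalBuffer buf = new MiniIncrementalBuffer(10, 4);
    buf.put(ByteBuffer.wrap(new byte[3]));
    // prints "position=3, pieces=1"
    System.out.println("position=" + buf.position()
        + ", pieces=" + buf.allocatedBuffers());
    buf.put(ByteBuffer.wrap(new byte[6]));
    // prints "position=9, pieces=3"
    System.out.println("position=" + buf.position()
        + ", pieces=" + buf.allocatedBuffers());
  }
}
```

The second put in `main` spans three pieces: it fills the last byte of the first piece, fills the second piece completely, and writes one byte into the third piece, which is allocated with a capacity of only 2 because the limit is 10. The caller sees none of this and only observes the global position moving from 3 to 9.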

References

[1] https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/IncrementalChunkBuffer.java

Tags: Java Hadoop jvm Apache

Posted on Wed, 11 Mar 2020 20:53:03 -0400 by Biocide