A detailed look at ArrayList for technical interviews

ArrayList overview

ArrayList implements the List interface and is backed by an array, hence the name. Like every class in the Java collections framework, it can only store object references, so when the data is of a primitive type such as int or float, the values must be stored as the corresponding wrapper classes (Integer, Float). Autoboxing normally performs this conversion for us.
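
As a small illustration of the wrapper-class requirement, the sketch below stores int values in a List<Integer> and reads them back. The class name BoxingDemo and the helper sum are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Sums a list of boxed Integers; each element is unboxed to int when read.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) { // auto-unboxing: Integer -> int
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // List<int> does not compile: the element type must be a reference type.
        List<Integer> numbers = new ArrayList<>();
        numbers.add(1); // autoboxing: the int 1 is stored as an Integer
        numbers.add(2);
        numbers.add(3);
        System.out.println(sum(numbers)); // prints 6
    }
}
```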

The underlying implementation of ArrayList is an Object array:

transient Object[] elementData;

Because it is based on an array, and an array occupies a contiguous block of memory, access by index is very fast (constant time). The price is that inserting or deleting anywhere except the end is slow, because the elements after that position have to be shifted.

Another commonly used List implementation is LinkedList. LinkedList is special: it implements not only the List interface but also the Deque interface (which extends Queue), so you will often see LinkedList used as a queue:

Queue<Integer> queue = new LinkedList<>();

As its name suggests, LinkedList is based on a linked list, specifically a doubly linked list. Its characteristics are roughly the opposite of an array's: there is no direct indexing, so reaching the element at a given position means walking the list and is slow, but once the node has been located, inserting or deleting it only rewires a few references and is fast.
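
A minimal sketch of the queue usage mentioned above: elements go in at the tail with offer and come out at the head with poll. The class QueueDemo and the method drain are hypothetical names used only for this illustration.

```java
import java.util.LinkedList;
import java.util.Queue;

class QueueDemo {
    // Enqueues the given values, then dequeues them all and
    // returns them as a comma-separated string in removal order.
    static String drain(int... values) {
        Queue<Integer> queue = new LinkedList<>();
        for (int v : values) {
            queue.offer(v); // append at the tail
        }
        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty()) {
            order.append(queue.poll()); // remove from the head
            if (!queue.isEmpty()) {
                order.append(',');
            }
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(1, 2, 3)); // prints 1,2,3 (first in, first out)
    }
}
```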

How does ArrayList specify the size of the underlying array

Since the data is really stored in an array, that array has to be given a size and allocated at some point. Let's first look at the parameterless constructor of ArrayList:

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

As you can see, it assigns the shared default empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA to the underlying Object array elementData. In other words, right after an ArrayList is created with the parameterless constructor, the capacity of its array is 0.

What is the use of an array with a capacity of 0, if it cannot hold anything? Don't worry. When the parameterless constructor is used, the array is only allocated when we actually add data, and at that point it gets the default initial capacity DEFAULT_CAPACITY = 10. The source code is as follows:

public void add(int index, E element) {
    rangeCheckForAdd(index);
    // First step
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    elementData[index] = element;
    size++;
}

private void ensureCapacityInternal(int minCapacity) {
    // Step 2
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}
    
private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // Step 3
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}
// Initial capacity
private static final int DEFAULT_CAPACITY = 10;

With the parameterless constructor covered, the constructor that takes a parameter is straightforward. It allocates the array according to the size passed in by the user:

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity);
    }
}
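
To make the behavior of this constructor concrete, here is a short sketch. It presizes a list when the element count is known in advance, shows that capacity is not the same thing as size, and shows that a negative capacity is rejected. The class InitialCapacityDemo and its helper methods are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class InitialCapacityDemo {
    // Builds a list of n elements, presizing the backing array so it never has to grow.
    static List<Integer> build(int n) {
        List<Integer> list = new ArrayList<>(n); // capacity n, size still 0
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Returns true if the constructor rejects the given capacity.
    static boolean rejects(int capacity) {
        try {
            new ArrayList<Integer>(capacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(build(1000).size());                // prints 1000
        System.out.println(new ArrayList<Integer>(16).size()); // prints 0: capacity is not size
        System.out.println(rejects(-1));                       // prints true
    }
}
```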

Capacity expansion mechanism

The underlying storage of ArrayList is an Object array, and we know that once an array has been created its length cannot change. So how does ArrayList grow if we keep adding data to it? In other words, how can an ArrayList store any number of objects?

When does the expansion take place? It must be when we add a new element and find that the array is already full. So let's go to the add method to see how ArrayList expands:

public void add(int index, E element) {
    rangeCheckForAdd(index);
    // First step
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    elementData[index] = element;
    size++;
}

private void ensureCapacityInternal(int minCapacity) {
    // Step 2
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

ensureExplicitCapacity decides whether capacity expansion is necessary. The grow method is where the expansion actually happens:

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // Key steps
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

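The key step in the code above shows how the capacity grows: new capacity = old capacity + old capacity / 2 (the right shift by one is an integer division by 2), which is roughly 1.5 times the old capacity. If that is still less than the required minCapacity, minCapacity is used instead. Finally, Arrays.copyOf copies the elements of the old array. Note that Arrays.copyOf creates a new array of the new length and copies into it, so elementData ends up referring to a different array.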
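
The arithmetic can be replayed without touching ArrayList's private fields. The sketch below is a simplified model, not the JDK code: it tracks only the capacity as an int, uses capacity == 0 as a stand-in for the DEFAULTCAPACITY_EMPTY_ELEMENTDATA check, and leaves out the MAX_ARRAY_SIZE branch. The names GrowthDemo, nextCapacity and capacitiesAfterAdds are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Mirrors the arithmetic in grow(): 1.5x the old capacity, but at least minCapacity.
    // The MAX_ARRAY_SIZE / hugeCapacity branch is deliberately omitted from this sketch.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    // Returns each capacity the model reaches while appending `adds` elements
    // to a list that started out with the shared empty array (capacity 0).
    static List<Integer> capacitiesAfterAdds(int adds) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0;
        for (int size = 0; size < adds; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // Stand-in for calculateCapacity on the default empty array.
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity > capacity) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesAfterAdds(50)); // prints [10, 15, 22, 33, 49, 73]
    }
}
```

Under this model, appending 50 elements reallocates six times, which is one reason to pass an initial capacity when the final size is known.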

Add data

We have just looked at the add method: before adding data, it first checks whether the array needs to grow. The actual insertion happens in the second half:

public void add(int index, E element) {
    rangeCheckForAdd(index);
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
    size - index);
    elementData[index] = element;
    size++;
}

First, add(int index, E element) inserts the element at the specified index. For example, list.add(0, 3) inserts the element 3 at the head of the list.

Now let's look at System.arraycopy, the core of this add method. It takes five parameters (src, srcPos, dest, destPos, length), and here they are:

  • elementData: the source array
  • index: the position in the source array where copying starts
  • elementData: the destination array (the same array as the source)
  • index + 1: the position in the destination array where the copied elements are written
  • size - index: the number of elements to copy
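
The five arguments listed above can be tried on a plain int array. This is a minimal sketch that assumes the array already has a spare slot, so no growth is involved; ArrayCopyInsertDemo and insert are hypothetical names.

```java
import java.util.Arrays;

class ArrayCopyInsertDemo {
    // Inserts value at index among the first `size` slots of data, shifting the tail right.
    // Assumes data.length > size, i.e. there is room for one more element.
    static void insert(int[] data, int size, int index, int value) {
        // Source and destination are the same array; arraycopy handles the overlap correctly.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0}; // 4 elements in use, 1 spare slot
        insert(data, 4, 1, 15);
        System.out.println(Arrays.toString(data)); // prints [10, 15, 20, 30, 40]
    }
}
```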

Needless to say, inserting at a specified position in an ArrayList is slow, because the elements after that position have to be copied, and it is slower still if the insertion triggers capacity expansion, which allocates a new array and copies everything. ArrayList also has an add method that appends directly at the end. It does not need to shift any elements: it just writes to the slot at size and then increments size. This is the method used most often:

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Delete element

Elements are deleted with the remove method. The source code is as follows:

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}

This is also simple, and no new array is created. The elements from index + 1 to the end are copied one position to the left within the same array, which overwrites the element at index. The last slot, which now holds a duplicate, is set to null so the garbage collector can reclaim the object. Since every later element has to move, removing from the front or the middle is also inefficient.
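
The same shift can be reproduced on a plain String array. This sketch follows the steps of remove(int index) but leaves out the range check and modCount; the names ArrayCopyRemoveDemo and removeAt are made up for the example.

```java
import java.util.Arrays;

class ArrayCopyRemoveDemo {
    // Removes the element at index from the first `size` slots and returns the new size.
    static int removeAt(String[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift the tail one position to the left, overwriting data[index].
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the stale last slot
        return size;
    }

    public static void main(String[] args) {
        String[] data = {"a", "b", "c", "d"};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data)); // prints 3 [a, c, d, null]
    }
}
```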

Thread safety issues

Neither ArrayList nor LinkedList is thread safe. Let's take the add method that appends an element at the end as an example to see why ArrayList is not thread safe:

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Key code
    elementData[size++] = e;
    return true;
}

The key line above is not an atomic operation. It consists of two steps:

elementData[size] = e;
size = size + 1;

There is no problem when these two statements run in a single thread, but in a multithreaded environment the value added by one thread may overwrite the value added by another. For example:

  1. Assume size = 0, and two threads are about to add an element to the end of the array.
  2. Thread A starts adding an element with value A. It performs the first step, placing A at index 0 of elementData.
  3. Thread B then starts adding an element with value B and also reaches the first step. The size it reads is still 0, so it also places B at index 0 of elementData.
  4. Thread A increments size: size = 1.
  5. Thread B increments size: size = 2.

After both threads finish, the ideal result would be size = 2, elementData[0] = A, elementData[1] = B. The actual result is size = 2, elementData[0] = B (thread B overwrote thread A's write), and nothing at index 1. Unless we later call set to change the value at index 1, that slot stays null, because the next element appended at the end goes to index 2, the current size.
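
A real race is timing dependent and hard to reproduce on demand, so the sketch below replays steps 2 to 5 on a single thread in exactly that unlucky order. It is a deterministic illustration of the interleaving, not a concurrency test; the class LostUpdateReplay and the local variable names are assumptions made for the example.

```java
import java.util.Arrays;

class LostUpdateReplay {
    // Replays the interleaving of two appends, with each thread's two steps split apart.
    static String replay() {
        String[] elementData = new String[10];
        int size = 0;

        int sizeSeenByA = size;         // thread A reads size = 0
        elementData[sizeSeenByA] = "A"; // thread A writes slot 0
        int sizeSeenByB = size;         // thread B also reads size = 0
        elementData[sizeSeenByB] = "B"; // thread B overwrites slot 0
        size = size + 1;                // thread A increments: size = 1
        size = size + 1;                // thread B increments: size = 2

        // Show only the first `size` slots, as a list would.
        return "size=" + size + " " + Arrays.toString(Arrays.copyOf(elementData, size));
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints size=2 [B, null]
    }
}
```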

The thread-safe counterpart of ArrayList is Vector. Its approach is very simple: its methods are declared synchronized:

public synchronized void addElement(E obj) {
    modCount++;
    ensureCapacityHelper(elementCount + 1);
    elementData[elementCount++] = obj;
}

Because Vector pays extra overhead for acquiring and releasing locks, it is in theory slower than ArrayList. So why would we use a class that is not thread safe?

Because in most scenarios a list is mostly read, and frequent insertions and deletions are not involved. If they are, you can consider LinkedList, whose linked-list implementation makes insertion and deletion cheap. If you must have thread safety, use Vector. In actual development ArrayList is still the most widely used: it is not thread safe and inserting or deleting in the middle is inefficient, but access by index is fast.
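
As a rough check of the Vector claim, the sketch below lets several threads append to the same Vector and then reads the final size. Because add is synchronized, no append is lost. The class VectorDemo, the method concurrentAdds, and the thread and element counts are all choices made for this illustration.

```java
import java.util.List;
import java.util.Vector;

class VectorDemo {
    // Starts `threads` threads that each append `perThread` elements, waits for them, returns the size.
    static int concurrentAdds(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Vector.add is synchronized, so all 4 * 10,000 appends are kept.
        System.out.println(concurrentAdds(new Vector<>(), 4, 10_000)); // prints 40000
    }
}
```

Passing an ArrayList to the same method is not safe: depending on timing, the final size may be smaller than expected or a worker thread may fail with an exception.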

Reference:

https://mp.weixin.qq.com/s?__biz=MzI0NDc3ODE5OQ==&mid=2247486005&idx=1&sn=b46cc073886c334f21e5b78c021b9da3&chksm=e959df8dde2e569bacb9f22f90a1cc852aee4755b43be322535cf6186aec66ee9ee2192e3834&scene=178&cur_album_id=1683346627601743872#rd

Tags: Java set arraylist

Posted on Sun, 07 Nov 2021 23:43:55 -0500 by malcolmboston