ArrayList source code analysis

The following code is based on Java 8

Introduction to ArrayList

Source code:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    //......
}

The relationship between ArrayList and Collection is shown in the figure below. Solid lines represent inheritance, and dotted lines represent interface implementation:

  1. ArrayList is a list backed by an array, in effect a dynamic array. Unlike a plain Java array, its capacity grows as needed. It extends AbstractList and implements the List, RandomAccess, Cloneable and java.io.Serializable interfaces.
  2. By extending AbstractList and implementing List, ArrayList provides the usual list operations: adding, deleting, modifying and traversing elements.
  3. ArrayList implements the RandomAccess interface. RandomAccess is a marker interface that List implementations use to indicate that they support fast random access. In ArrayList, an element can be fetched directly by its index; this is fast random access.
  4. ArrayList implements the Cloneable interface and overrides clone(), so it can be cloned.
  5. ArrayList implements the java.io.Serializable interface, which means that ArrayList supports serialization and can be transmitted in serialized form.

Note: the operations of ArrayList are not thread-safe! It is therefore best used from a single thread. With multiple threads, you can choose CopyOnWriteArrayList, or wrap the list in a thread-safe List using the Collections.synchronizedList method.
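The two thread-safe alternatives named in the note can be compared with a minimal sketch. The class ThreadSafeListDemo and the helper fillConcurrently are hypothetical names used only for this illustration; the thread and element counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
final class ThreadSafeListDemo {

    // Appends perThread elements from each of `threads` threads, then returns the final size.
    static int fillConcurrently(List<Integer> target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return target.size();
    }

    public static void main(String[] args) {
        // Every method of the wrapper synchronizes on a single lock.
        List<Integer> synced = Collections.synchronizedList(new ArrayList<>());
        // Every write copies the backing array; reads take no lock.
        List<Integer> copyOnWrite = new CopyOnWriteArrayList<>();

        System.out.println(fillConcurrently(synced, 4, 1000));      // 4000
        System.out.println(fillConcurrently(copyOnWrite, 4, 1000)); // 4000
    }
}
```

Passing a plain ArrayList to the same helper may lose elements or throw, which is the behavior the note warns about. As a rule of thumb, CopyOnWriteArrayList suits lists that are read far more often than they are written, because each write copies the whole array.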

API for ArrayList

// API defined in Collection
boolean             add(E object)
boolean             addAll(Collection<? extends E> collection)
void                clear()
boolean             contains(Object object)
boolean             containsAll(Collection<?> collection)
boolean             equals(Object object)
int                 hashCode()
boolean             isEmpty()
Iterator<E>         iterator()
boolean             remove(Object object)
boolean             removeAll(Collection<?> collection)
boolean             retainAll(Collection<?> collection)
int                 size()
<T> T[]             toArray(T[] array)
Object[]            toArray()
    
// API defined in List (skeletal implementations are in AbstractList)
void                add(int location, E object)
boolean             addAll(int location, Collection<? extends E> collection)
E                   get(int location)
int                 indexOf(Object object)
int                 lastIndexOf(Object object)
ListIterator<E>     listIterator(int location)
ListIterator<E>     listIterator()
E                   remove(int location)
E                   set(int location, E object)
List<E>             subList(int start, int end)
    
// API specific to ArrayList
Object               clone()
void                 ensureCapacity(int minimumCapacity)
void                 trimToSize()
// Protected; declared in AbstractList and overridden by ArrayList
void                 removeRange(int fromIndex, int toIndex)

Properties of ArrayList

The main properties of ArrayList are as follows:

//Serialization id
private static final long serialVersionUID = 8683452581122892189L;

//Default initial capacity
private static final int DEFAULT_CAPACITY = 10;

//Shared empty array used for empty instances (for example, new ArrayList(0))
private static final Object[] EMPTY_ELEMENTDATA = {};

//Shared empty array used by the default constructor. It is kept separate from EMPTY_ELEMENTDATA so that the first add knows to expand to DEFAULT_CAPACITY
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

//The array in which the elements are stored. Adding, deleting and all other operations work on this field
transient Object[] elementData;

//Number of elements the list currently contains
private int size;

//Maximum array length (2147483639).
//It is Integer.MAX_VALUE - 8 because some virtual machines reserve header words in an array; requesting a larger array may cause an OutOfMemoryError
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

//Inherited from AbstractList (shown here for reference): the number of structural modifications made to the list
protected transient int modCount = 0;
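
The modCount field is what lets ArrayList's iterators fail fast: an iterator remembers the value it saw when it was created and throws ConcurrentModificationException if the list is structurally modified behind its back. A minimal sketch follows; FailFastDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class FailFastDemo {

    // Returns true if removing through the list inside a for-each loop trips the modCount check.
    static boolean removeDuringForEachFails() {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        try {
            for (String name : names) {
                if (name.equals("b")) {
                    names.remove(name); // structural change: modCount is incremented
                }
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Removing through the iterator keeps the iterator's expected count in sync with modCount.
    static List<String> removeViaIterator() {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        for (Iterator<String> it = names.iterator(); it.hasNext(); ) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEachFails()); // true
        System.out.println(removeViaIterator());        // [a, c, d]
    }
}
```

The check is best-effort: it is meant to expose bugs, not to serve as a synchronization mechanism.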

Constructor

No-argument constructor

If no arguments are passed in, the ArrayList object is created with the default no-argument constructor, as follows:

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

Note: at this point, elementData in the new ArrayList has length 0, and size is 0. On the first add, elementData is expanded to the default length of 10. We'll come back to this later.

Constructor with an int parameter

The argument specifies the initial length of the backing array. If it is greater than 0, an array of that length is allocated; if it is equal to 0, the shared empty array EMPTY_ELEMENTDATA is assigned to elementData; otherwise an exception is thrown, as follows:

/**
 * Constructs an empty list with the specified initial capacity.
 * Parameter: initialCapacity – the initial capacity of the list
 * Throws: IllegalArgumentException – if the specified initial capacity is negative
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

Constructor with a Collection parameter

  1. Convert the Collection to an array with toArray() and assign that array to elementData.
  2. Set size to the length of the array. If size is 0, assign the shared empty array EMPTY_ELEMENTDATA to elementData.
  3. If size is greater than 0 and toArray() did not return an Object[], call Arrays.copyOf to copy the contents into a new Object[]. This is a shallow copy: the element references are copied, not the objects they refer to. The elements are arranged in the order returned by the collection's iterator.

/**
 * Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
 * Parameter: c – the collection whose elements are to be placed into this list
 * Throws: NullPointerException – if the specified collection is null
 */
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
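
What the Collection constructor copies can be checked with a small sketch: the new list has its own backing array, but it holds the same element objects as the source. CopyConstructorDemo and copyOf are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class CopyConstructorDemo {

    // Copies `source` using the ArrayList(Collection) constructor.
    static <T> List<T> copyOf(List<T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        StringBuilder shared = new StringBuilder("x");
        List<StringBuilder> source = new ArrayList<>();
        source.add(shared);

        List<StringBuilder> copy = copyOf(source);
        copy.add(new StringBuilder("y")); // structural change affects the copy only
        shared.append("!");               // mutating the element is visible through both lists

        System.out.println(source.size());                // 1
        System.out.println(copy.size());                  // 2
        System.out.println(copy.get(0) == source.get(0)); // true
        System.out.println(copy.get(0));                  // x!
    }
}
```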

System.arraycopy and Arrays.copyOf

The System.arraycopy and Arrays.copyOf methods are introduced here because they appear often in the source code analyzed below.

System.arraycopy method: it copies length elements from the source array, starting at position srcPos, into the destination array, starting at position destPos. It is a native method.

    // src: the source array
    // srcPos: the starting position in the source array
    // dest: the destination array
    // destPos: the starting position in the destination array
    // length: the number of elements to copy
    // The native keyword indicates that the implementation is written in another language
    public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

Arrays.copyOf method: it creates a new array of length newLength, copies the contents of the original array into it (truncating, or padding with nulls, as needed), and returns the new array.

    // original: the array to copy
    // newLength: the length of the copy to return
    // newType: the class of the copy to return
    // Calls System.arraycopy internally
    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

Differences:

  1. System.arraycopy needs an existing destination array. It copies from the source array into the destination array, and you choose the starting position in the source, the number of elements to copy, and the starting position in the destination.
  2. Arrays.copyOf creates the destination array itself. It calls System.arraycopy to copy the contents of the original array into a new array of length newLength and returns that new array.
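
The two differences can be seen side by side in a short sketch on int arrays. ArrayCopyDemo and its method names are hypothetical; the last method shows the overlapping copy within one array that ArrayList relies on when it shifts elements.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class ArrayCopyDemo {

    // System.arraycopy: the caller supplies the destination and chooses both offsets and the length.
    static int[] copyIntoExisting() {
        int[] src = {1, 2, 3, 4, 5};
        int[] dest = new int[7];
        System.arraycopy(src, 1, dest, 2, 3); // copies 2, 3, 4 into dest[2..4]
        return dest;                          // [0, 0, 2, 3, 4, 0, 0]
    }

    // Arrays.copyOf: allocates the result and pads with zeros when the new length is larger.
    static int[] grow() {
        return Arrays.copyOf(new int[] {1, 2, 3}, 5); // [1, 2, 3, 0, 0]
    }

    // Arrays.copyOf: truncates when the new length is smaller.
    static int[] shrink() {
        return Arrays.copyOf(new int[] {1, 2, 3}, 2); // [1, 2]
    }

    // Source and destination may be the same array; the copy behaves as if it went through a temporary array.
    static int[] shiftRight() {
        int[] a = {1, 2, 3, 4, 0};
        System.arraycopy(a, 1, a, 2, 3); // moves 2, 3, 4 one slot to the right
        return a;                        // [1, 2, 2, 3, 4]
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(copyIntoExisting()));
        System.out.println(Arrays.toString(grow()));
        System.out.println(Arrays.toString(shrink()));
        System.out.println(Arrays.toString(shiftRight()));
    }
}
```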

Add element

ArrayList provides four methods for adding elements: add(E e), add(int index, E element), addAll(Collection<? extends E> c) and addAll(int index, Collection<? extends E> c). A fifth method, set(int index, E element), replaces an existing element and is covered in this section as well.

add(E e)

/**
 * Appends the specified element to the end of this list.
 * Parameter: e – the element to be appended to this list
 * Returns: true (as specified by Collection.add)
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Next, let's look at the ensureCapacityInternal method and the methods it calls internally.

/**
 * Checks whether the backing array is still the shared default empty array.
 * If so, the requested minimum capacity is raised to at least the default of 10; otherwise it is left unchanged.
 * So for a list created with the no-argument constructor, the capacity becomes 10 on the first call to add.
 */
private void ensureCapacityInternal(int minCapacity) {
    // 1. Is the backing array still the shared default empty array?
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // 2. If so, request at least the default capacity of 10.
        // That is, after new ArrayList<>(), the capacity becomes 10 on the first call to add
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

Next, let's look at the ensureExplicitCapacity method and the methods it calls internally.

// Records the modification and checks whether the array needs to grow
private void ensureExplicitCapacity(int minCapacity) {
    // First increment the list's modification count; the modCount field is declared in AbstractList
    modCount++;

    // overflow-conscious code
    // Then check whether the array is long enough to hold minCapacity elements
    if (minCapacity - elementData.length > 0)
        // If it is not, call the grow method to expand it
        grow(minCapacity);
}

Next, let's look at the grow method and the methods it calls internally.

//grow first computes the new array length as 1.5 times the old length.
//If that is still less than the required minimum, the new length is set to the required minimum.
//If the new length is greater than MAX_ARRAY_SIZE (2147483639), the hugeCapacity method decides the final length.
//Finally, Arrays.copyOf copies the elements into a new array of that length, which is assigned to elementData.
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        // New length: 1.5 times the old length (integer arithmetic, rounded down)
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        // If that is less than the required minimum, use the required minimum
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        // If the new length is greater than MAX_ARRAY_SIZE (2147483639)
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            // let the hugeCapacity method choose the new length
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        // Finally, copy into a new array of the new length and assign it to elementData
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

Next, let's look at the hugeCapacity method and the methods it calls internally.

// Choose the length when the computed capacity exceeds MAX_ARRAY_SIZE
private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    // If the required minimum exceeds MAX_ARRAY_SIZE, use Integer.MAX_VALUE; otherwise use MAX_ARRAY_SIZE.
    // Allocating Integer.MAX_VALUE elements is not guaranteed to succeed: it depends on the virtual machine,
    // and some virtual machines throw OutOfMemoryError for arrays that large.
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

Finally, to summarize the logic of the add method:

  1. Make sure the array can hold size + 1 elements, that is, that there is room for the next element.
  2. Increment the modification count modCount. If size + 1 is greater than the length of the current array, call the grow method to expand it; grow normally increases the capacity to 1.5 times the original.
  3. With room guaranteed, store the new element at index size and then increment size (elementData[size++] = e).
  4. Return true to indicate that the element was added.
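
The growth arithmetic can be traced without touching ArrayList's private fields. The sketch below re-implements only the capacity calculation from grow and hugeCapacity; GrowthSketch and its methods are hypothetical, and treating a capacity of 0 as "created by the no-argument constructor" is a simplifying assumption of this model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the capacity arithmetic; it allocates no backing arrays.
final class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Mirrors the arithmetic of grow(minCapacity), with hugeCapacity inlined.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5x, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) {
                throw new OutOfMemoryError();
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    // Capacities reached while appending n elements one at a time to a default-constructed list.
    static List<Integer> capacitiesWhileAppending(int n) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // assumption: 0 stands for DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        for (int size = 0; size < n; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = newCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesWhileAppending(34)); // [10, 15, 22, 33, 49]
    }
}
```

The comparisons are written as subtractions (newCapacity - minCapacity < 0) rather than as newCapacity < minCapacity so that they still give the intended answer after oldCapacity + (oldCapacity >> 1) overflows int.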

add(int index, E element)

This method is similar to add(E e) above, but it inserts the new element at a specified index instead of appending it.

    public void add(int index, E element) {
        //1. Check that the index is valid
        rangeCheckForAdd(index);
        //2. Make sure there is room for one more element
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        //3. Copy the size - index elements starting at index to the position starting at index + 1,
        //which shifts the element at index and everything after it back by one position
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        //4. Assign a value at the specified position
        elementData[index] = element;
        size++;
    }   

Next, let's look at the rangeCheckForAdd method.

// Checks that the index is between 0 and size, inclusive (index == size means append); otherwise an exception is thrown.
private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

The ensureCapacityInternal call in the second step works exactly as in add(E e) above.

The third step calls System.arraycopy to shift the element at the specified index, and every element after it, back by one position.

Finally, the new element is stored at the specified index and size is incremented by 1.
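
The index rules of add(int index, E element) are easy to confirm from the outside. In this sketch InsertAtIndexDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class InsertAtIndexDemo {

    static List<String> insertInMiddleAndAtEnd() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "d"));
        list.add(2, "c");           // "d" is shifted one position to the right
        list.add(list.size(), "e"); // index == size is allowed and appends
        return list;                // [a, b, c, d, e]
    }

    // Returns true if inserting past the end is rejected by the range check.
    static boolean insertPastEndFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        try {
            list.add(2, "x"); // size is 1, so the valid indices for add are 0 and 1
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertInMiddleAndAtEnd());
        System.out.println(insertPastEndFails()); // true
    }
}
```

Because every insertion shifts all the elements after the index, inserting near the front of a large list costs time proportional to the list's size.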

addAll(Collection<? extends E> c)

    //Inserts at the end of the list in the order returned by the specified Collection iterator.
    public boolean addAll(Collection<? extends E> c) {
        // Convert c to an array
        Object[] a = c.toArray();
        int numNew = a.length;
        //Capacity expansion, size + numNew
        ensureCapacityInternal(size + numNew);  // Increments modCount
        System.arraycopy(a, 0, elementData, size, numNew);
        size += numNew;
        return numNew != 0;
    }

This method first converts the collection passed in to an array, then makes sure the capacity is sufficient, and finally uses System.arraycopy to copy the array to the end of the list.

addAll(int index, Collection<? extends E> c)

    public boolean addAll(int index, Collection<? extends E> c) {
        //Determine whether the index position is correct
        rangeCheckForAdd(index);
        // Convert c to an array
        Object[] a = c.toArray();
        int numNew = a.length;
        //Capacity expansion, size + numNew
        ensureCapacityInternal(size + numNew);  // Increments modCount
        //If index is less than size, shift the elements at index and after it back by numNew positions; numMoved is how many elements are shifted
        int numMoved = size - index;
        if (numMoved > 0)
            System.arraycopy(elementData, index, elementData, index + numNew,
                             numMoved);
        //Copy the new elements into the gap starting at index
        System.arraycopy(a, 0, elementData, index, numNew);
        //Update list length
        size += numNew;
        return numNew != 0;
    }
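
A brief usage sketch of addAll(int index, Collection) shows the order of the inserted elements and the return value for an empty argument. AddAllDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class AddAllDemo {

    static List<Integer> insertBlock() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 5, 6));
        list.addAll(1, Arrays.asList(2, 3, 4)); // 5 and 6 are shifted back by three positions
        return list;                            // [1, 2, 3, 4, 5, 6]
    }

    // addAll reports whether the list changed, so an empty argument yields false.
    static boolean addingEmptyCollection() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2));
        return list.addAll(Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        System.out.println(insertBlock());
        System.out.println(addingEmptyCollection()); // false
    }
}
```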

set(int index, E element)

    public E set(int index, E element) {
        //Check the index. If it is greater than or equal to size, an exception is thrown
        rangeCheck(index);
        //Get the element currently at that index
        E oldValue = elementData(index);
        //Store the new element at that index, replacing the old one
        elementData[index] = element;
        //Return the old value
        return oldValue;
    }

Next, let's look at the rangeCheck method.

// Checks the upper bound only; a negative index fails at the array access that follows
private void rangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

Next, let's look at the elementData method.

@SuppressWarnings("unchecked")
E elementData(int index) {
    return (E) elementData[index];
}
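
Seen from the caller's side, set returns the element it replaced and accepts only indices below size. A minimal sketch, with SetDemo and its methods as hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class SetDemo {

    // Replaces the first element and returns the value that was there before.
    static String replaceFirst(List<String> list, String value) {
        return list.set(0, value);
    }

    // Unlike add(int, E), set rejects index == size: there is no element to replace.
    static boolean setAtSizeFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        try {
            list.set(1, "b");
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    // A negative index also ends in an IndexOutOfBoundsException (or a subclass of it).
    static boolean negativeIndexFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        try {
            list.get(-1);
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(replaceFirst(list, "z")); // a
        System.out.println(list);                    // [z, b]
        System.out.println(setAtSizeFails());        // true
        System.out.println(negativeIndexFails());    // true
    }
}
```

Note that set does not change modCount, because replacing an element is not a structural modification.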

Delete element

ArrayList provides four methods for deleting elements: remove(int index), remove(Object o), removeAll(Collection<?> c) and clear().

remove(int index)

/**
 * Removes the element at the specified position in this list. Shifts any subsequent elements to the left (subtracts one from their indices).
 * Parameter: index – the index of the element to be removed
 * Returns: the element that was removed from the list
 * Throws: IndexOutOfBoundsException – if the index is out of range
 */
public E remove(int index) {
    //Check the index. If it is greater than or equal to size, an exception is thrown
    rangeCheck(index);

    //Increment the list's modification count
    modCount++;
    //Get the element currently at the index to be removed
    E oldValue = elementData(index);
    
    //Number of elements after index that must be shifted (0 if the last element is being removed)
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Shift the numMoved elements after index one position to the left
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    
    //Set the now-unused last slot to null so the garbage collector can reclaim the element
    elementData[--size] = null; // clear to let GC do its work
    
    //Returns the old value of the delete location
    return oldValue;
}

Main deletion process: check the index, get the element at that index, shift the following elements left by one position with an array copy, set the now-unused last slot to null, and return the removed element. The backing array itself is not shrunk.

remove(Object o)

/**
 * Removes the first occurrence of the specified element from this list, if it is present. If the list does not contain the element, it is unchanged. More formally, removes the element with the lowest index i such that (o == null ? get(i) == null : o.equals(get(i))) (if such an element exists). Returns true if this list contained the specified element (or, equivalently, if this list changed as a result of the call).
 * Parameter: o – the element to be removed from this list, if present
 * Returns: true if this list contained the specified element
 */
public boolean remove(Object o) {
    //ArrayList allows null elements, so null needs separate handling
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

Next, let's look at the fastRemove method.

/**
 * Basically the same as remove(int index), except that this private method skips the bounds check and does not return the removed value
 */
private void fastRemove(int index) {
    //Increment the list's modification count
    modCount++;
    //Number of elements after index that must be shifted (0 if the last element is being removed)
    int numMoved = size - index - 1;
    if (numMoved > 0)
        //Shift the numMoved elements after index one position to the left
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    //Set the now-unused last slot to null so the garbage collector can reclaim the element
    elementData[--size] = null; // clear to let GC do its work
}
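
The two remove overloads are easy to confuse on a List<Integer>, because an int argument selects remove(int index) while an Integer argument selects remove(Object o). A short sketch, with RemoveOverloadDemo and its methods as hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class RemoveOverloadDemo {

    static List<Integer> removeAtIndexOne() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        list.remove(1);                  // int argument: remove(int index), removes the value 2
        return list;                     // [1, 3]
    }

    static List<Integer> removeValueOne() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        list.remove(Integer.valueOf(1)); // Integer argument: remove(Object o), removes the value 1
        return list;                     // [2, 3]
    }

    // remove(Object) handles null and reports whether anything was removed.
    static boolean nullFoundAndMissingValueNotFound() {
        List<String> list = new ArrayList<>(Arrays.asList("a", null));
        boolean removedNull = list.remove(null); // true: a null element was present
        boolean removedZ = list.remove("z");     // false: list unchanged
        return removedNull && !removedZ;
    }

    public static void main(String[] args) {
        System.out.println(removeAtIndexOne());
        System.out.println(removeValueOne());
        System.out.println(nullFoundAndMissingValueNotFound()); // true
    }
}
```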

removeAll(Collection<?> c)

/**
 * Removes from this list all of its elements that are contained in the specified collection.
 * Parameter: c – the collection containing the elements to be removed from this list
 * Returns: true if this list changed as a result of the call
 * Throws: ClassCastException – if the class of an element of this list is incompatible with the specified collection (optional)
 * Throws: NullPointerException – if this list contains a null element and the specified collection does not permit null elements (optional), or if the specified collection is null
 * See also: Collection.contains(Object)
 */
public boolean removeAll(Collection<?> c) {
    //Throw a NullPointerException if c is null
    Objects.requireNonNull(c);
    return batchRemove(c, false);
}

// Next, let's look at the batchRemove method.
private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    int r = 0, w = 0;
    boolean modified = false;
    try {
        //Traverse the array with a read index r and a write index w, checking whether c contains each element.
        //Elements to be kept are moved to the front of the array; when the loop ends, w is the number of elements kept
        for (; r < size; r++)
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        //If c.contains() threw, r is not equal to size. Keep the work done so far
        //and move the elements not yet examined up behind the elements already kept
        if (r != size) {
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }

        //If w equals size, every element was kept, nothing was removed, and false is returned; otherwise the array is updated and true is returned.
        //This holds even if the try block threw, because w is always the length of the kept prefix, so the array stays consistent
        if (w != size) {
            // clear to let GC do its work
            // Slots with an index greater than or equal to w are cleared, because the elements to keep were moved to the front, that is, to indices less than w
            for (int i = w; i < size; i++)
                elementData[i] = null;
            // Add the number of removed elements to the modification count
            modCount += size - w;
            // Set the new element count
            size = w;
            // Report that the list changed
            modified = true;
        }
    }
    return modified;
}
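
The core of batchRemove is a single pass with a read index and a write index. The sketch below isolates that technique on a plain Object[]; CompactSketch and compact are hypothetical names, and the exception handling of the real method is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of the read/write index technique; error handling is omitted.
final class CompactSketch {

    // Keeps the elements for which c.contains(element) == keep, compacting them to the front.
    // keep == false behaves like removeAll, keep == true like retainAll.
    // Returns the new logical size; slots from that index up to size are set to null.
    static int compact(Object[] data, int size, Collection<?> c, boolean keep) {
        int w = 0;
        for (int r = 0; r < size; r++) {
            if (c.contains(data[r]) == keep) {
                data[w++] = data[r]; // w never overtakes r, so nothing unread is overwritten
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null; // drop stale references so they can be garbage collected
        }
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5};
        int newSize = compact(data, data.length, Arrays.asList(2, 4), false);
        System.out.println(newSize);               // 3
        System.out.println(Arrays.toString(data)); // [1, 3, 5, null, null]
    }
}
```

The pass itself visits each element once, but the total cost also depends on c.contains: with a HashSet argument each lookup is cheap, while with a List argument each lookup is itself a linear scan.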

clear()

    public void clear() {
        //Increment the list's modification count
        modCount++;
        //Set every slot in use to null so the garbage collector can reclaim the elements
        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;
        //Set the list length to 0; the capacity of the backing array is unchanged
        size = 0;
    }

Find element

ArrayList provides get(int index) to read an element. Because ArrayList is backed by an array, the element is fetched directly by its index, which takes constant time.

    public E get(int index) {
        //Check the index. If it is greater than or equal to size, an exception is thrown
        rangeCheck(index);
        //Directly return the elements in the list whose subscript is equal to index
        return elementData(index);
    }

Determine whether the element exists in the list

ArrayList provides contains(Object o) to determine whether an element exists in the list.

Note: the contains method traverses the ArrayList, so its cost grows with the size of the list.

    public boolean contains(Object o) {
        //Call the indexOf method to determine whether the subscript of the element to be searched in the list is greater than or equal to 0. If it is less than 0, it does not exist
        return indexOf(o) >= 0;
    }

// Find element subscript, similar to remove(Object o)
    public int indexOf(Object o) {
        //Because ArrayList allows nulls, null judgment is required
        if (o == null) {
            //Traverse the list. If there is an element with null value in the list, directly return its subscript position
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            //Traverse the list, use equals to judge whether there are equal elements, and if so, directly return its subscript position
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        //The element is not in the list. Return -1
        return -1;
    }
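
A few lookups illustrate the rules above: the first match wins, null is a legal search key, and comparison uses equals rather than identity. IndexOfDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class IndexOfDemo {

    static int[] lookups() {
        List<String> list = new ArrayList<>(Arrays.asList("a", null, "b", "a"));
        return new int[] {
            list.indexOf("a"),     // 0: first occurrence
            list.indexOf(null),    // 1: null elements can be searched for
            list.lastIndexOf("a"), // 3: scans from the end
            list.indexOf("z")      // -1: not present
        };
    }

    // Matching uses equals, so a distinct but equal String instance is found.
    static boolean findsEqualButDistinctInstance() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        return list.contains(new String("a"));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lookups()));      // [0, 1, 3, -1]
        System.out.println(findsEqualButDistinctInstance()); // true
    }
}
```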

Minimize the actual storage of ArrayList

ArrayList provides a trimToSize() method that trims the capacity of the underlying array to the number of elements actually stored in the list.

    public void trimToSize() {
        //Increment the list's modification count
        modCount++;
        //If size is less than the length of the backing array, drop the unused slots after size by calling Arrays.copyOf to copy elementData to a new array of length size
        if (size < elementData.length) {
            elementData = (size == 0)
              ? EMPTY_ELEMENTDATA
              : Arrays.copyOf(elementData, size);
        }
    }

Intercept part of ArrayList

ArrayList provides a subList(int fromIndex, int toIndex) method to obtain part of the list.

You can see from the source code that it creates an instance of the inner class SubList, which is a view of a portion of the current ArrayList rather than a copy. Both refer to the same stored data, so if the list returned by subList is modified, the original list is modified as well.

This is because the get and set methods of SubList read and write the parent's elementData directly.

    public List<E> subList(int fromIndex, int toIndex) {
        //Check that the requested range is valid
        subListRangeCheck(fromIndex, toIndex, size);
        return new SubList(this, 0, fromIndex, toIndex);
    }

    static void subListRangeCheck(int fromIndex, int toIndex, int size) {
        if (fromIndex < 0)
            throw new IndexOutOfBoundsException("fromIndex = " + fromIndex);
        if (toIndex > size)
            throw new IndexOutOfBoundsException("toIndex = " + toIndex);
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                                               ") > toIndex(" + toIndex + ")");
    }

// Constructor of SubList, an inner class of ArrayList
        SubList(AbstractList<E> parent,
                int offset, int fromIndex, int toIndex) {
            this.parent = parent;
            this.parentOffset = fromIndex;
            this.offset = offset + fromIndex;
            this.size = toIndex - fromIndex;
            this.modCount = ArrayList.this.modCount;
        }
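
The view semantics of subList can be demonstrated in a few lines. SubListDemo and its methods are hypothetical names; the third method shows that a view becomes unusable once the parent list is structurally modified directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class SubListDemo {

    // Writes through the view are visible in the parent list.
    static List<Integer> writeThrough() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        List<Integer> view = list.subList(1, 4); // view of [1, 2, 3]
        view.set(0, 100);
        return list;                             // [0, 100, 2, 3, 4]
    }

    // Clearing the view removes that range from the parent list.
    static List<Integer> clearRange() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        list.subList(1, 4).clear();
        return list;                             // [0, 4]
    }

    // Returns true if using the view after a direct structural change to the parent fails.
    static boolean staleViewFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        List<Integer> view = list.subList(1, 4);
        list.add(5); // the parent's modCount no longer matches the one saved in the view
        try {
            view.get(0);
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThrough());
        System.out.println(clearRange());
        System.out.println(staleViewFails()); // true
    }
}
```

To get an independent copy instead of a view, wrap the result: new ArrayList<>(list.subList(from, to)).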

Summary

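ArrayList implements its own serialization and deserialization through the writeObject and readObject methods. Because elementData is declared transient, these methods write and read only the size elements actually in use rather than the whole backing array.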
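
A round trip through Java serialization can be sketched as follows. SerializationDemo and roundTrip are hypothetical names, and the in-memory byte stream merely stands in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

// Hypothetical demo class, not part of the JDK.
final class SerializationDemo {

    // Serializes the list to bytes and reads it back as a new, independent list.
    @SuppressWarnings("unchecked")
    static ArrayList<String> roundTrip(ArrayList<String> list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list); // invokes ArrayList's private writeObject
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ArrayList<String>) in.readObject(); // invokes ArrayList's private readObject
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(1000); // large capacity, few elements
        original.add("a");
        original.add("b");

        ArrayList<String> restored = roundTrip(original);
        System.out.println(restored);                  // [a, b]
        System.out.println(restored.equals(original)); // true
        System.out.println(restored == original);      // false
    }
}
```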

ArrayList is implemented based on array and will be expanded automatically.

Every add checks whether the array needs to be expanded. If you know roughly how many elements the list will hold, specify an initial capacity to avoid the cost of repeated expansion and copying. Deleting elements never reduces the capacity: the vacated slot is set to null so that the garbage collector can reclaim the removed element, but the backing array keeps its length unless trimToSize() is called.
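
Pre-sizing takes one extra argument or one extra call. In this sketch PresizeDemo and its methods are hypothetical names; the element counts are arbitrary.

```java
import java.util.ArrayList;

// Hypothetical demo class, not part of the JDK.
final class PresizeDemo {

    // The expected element count is known up front, so the backing array is allocated once.
    static ArrayList<Integer> squares(int n) {
        ArrayList<Integer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(i * i);
        }
        return out;
    }

    // For an existing list, ensureCapacity reserves room before a batch of appends.
    static void appendRange(ArrayList<Integer> target, int count) {
        target.ensureCapacity(target.size() + count);
        for (int i = 0; i < count; i++) {
            target.add(i);
        }
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = squares(5);
        System.out.println(list);        // [0, 1, 4, 9, 16]
        appendRange(list, 3);
        System.out.println(list.size()); // 8
    }
}
```

The capacity is only a hint for the initial allocation; a list created with new ArrayList<>(n) still grows normally if more than n elements are added.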

ArrayList is not thread-safe.

Tags: Java

Posted on Wed, 24 Nov 2021 07:19:02 -0500 by cmzone