Common Java Collections Interview Questions in Campus Recruitment

Interview summary

This section focuses on ArrayList and HashMap. The HashMap source code in particular must be understood: in the eyes of many interviewers it is essential foundational knowledge.

Question summary and answers (for reference only)

1. The difference between ArrayList and LinkedList (both are thread-unsafe)

  1. ArrayList is backed by an Object array, while LinkedList is backed by a doubly linked list
  2. ArrayList supports fast random access, so indexed lookups are faster than in LinkedList; conversely, LinkedList does not have to shift elements on insertion and deletion the way ArrayList does (see the sketch after this list)
  3. The memory overhead differs: ArrayList wastes space on the unused capacity it reserves at the end of its array, while LinkedList spends extra space per element on pointers to the predecessor and successor nodes
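
A minimal sketch of this trade-off (the class name and sizes are illustrative choices, not from the original post):

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class ListAccessDemo {
        public static void main(String[] args) {
            List<Integer> arrayList = new ArrayList<>();
            List<Integer> linkedList = new LinkedList<>();
            for (int i = 0; i < 100_000; i++) {
                arrayList.add(i);   // amortized O(1); occasional 1.5x array growth
                linkedList.add(i);  // O(1) append to the tail
            }
            // Random access: O(1) array lookup vs O(n) node walk
            int x = arrayList.get(50_000);
            int y = linkedList.get(50_000); // walks ~50,000 nodes, even starting from the nearer end
            // Insert at the head: ArrayList shifts every element, LinkedList just relinks
            arrayList.add(0, -1);   // O(n)
            linkedList.add(0, -1);  // O(1)
            System.out.println(x + " " + y + " " + arrayList.size() + " " + linkedList.size());
        }
    }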

2. ArrayList internals

  1. ArrayList inherits from AbstractList, and the initial default capacity is 10 (private static final int DEFAULT_CAPACITY = 10)

  2. ArrayList has three constructors: the no-argument constructor starts from a shared empty elementData array and allocates the default capacity of 10 only on the first add; the second takes an initial capacity and throws an exception if it is negative; the third takes a specified collection whose elements are copied in

    The constructor that takes an initial capacity looks like this:

    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
    }
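
    A quick usage sketch of the three constructors (variable names are illustrative):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ArrayListCtorDemo {
        public static void main(String[] args) {
            List<String> a = new ArrayList<>();          // shared empty array; capacity 10 allocated on first add
            List<String> b = new ArrayList<>(100);       // pre-sized; avoids early grow() calls
            List<String> c = new ArrayList<>(Arrays.asList("x", "y")); // copies the given collection
            // new ArrayList<>(-1) would throw IllegalArgumentException
            System.out.println(a.size() + " " + b.size() + " " + c.size()); // 0 0 2
        }
    }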
    
  3. When add is called and the backing array is already full, ArrayList calls grow(size + 1) to enlarge the capacity (to roughly 1.5 times the old capacity)

    private Object[] grow(int minCapacity) {
        return elementData = Arrays.copyOf(elementData, newCapacity(minCapacity));
    }

    private int newCapacity(int minCapacity) {
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1); // grow by 50%
        if (newCapacity - minCapacity <= 0) {
            if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
                return Math.max(DEFAULT_CAPACITY, minCapacity); // first add on a default-constructed list
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            return minCapacity;
        }
        return (newCapacity - MAX_ARRAY_SIZE <= 0) ? newCapacity : hugeCapacity(minCapacity);
    }
    

    MAX_ARRAY_SIZE in the code above is Integer.MAX_VALUE - 8.
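
    A minimal standalone re-implementation of the 1.5x growth rule above, for illustration only (not JDK code):

    public class GrowthDemo {
        public static void main(String[] args) {
            int capacity = 10; // DEFAULT_CAPACITY
            for (int i = 0; i < 5; i++) {
                System.out.print(capacity + " -> ");
                capacity = capacity + (capacity >> 1); // oldCapacity + oldCapacity / 2
            }
            System.out.println(capacity); // prints: 10 -> 15 -> 22 -> 33 -> 49 -> 73
        }
    }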

3. HashMap, Hashtable and ConcurrentHashMap

3.1 Differences between HashMap and Hashtable

  1. HashMap is not thread-safe, while Hashtable is thread-safe because its methods are synchronized.
  2. HashMap's default initial capacity is 16 and every resize doubles the capacity (so it is always a power of 2), while Hashtable's default initial capacity is 11 and a resize grows it to 2n + 1.
  3. HashMap allows null as a key (and null values), but Hashtable does not: it throws NullPointerException.
  4. Since Java 8, HashMap resolves long collision chains by converting a bucket's linked list into a red-black tree once the list grows past 8 (TREEIFY_THRESHOLD), while Hashtable never does; Hashtable is essentially legacy code, rarely used because of its poor performance under contention

3.2 Underlying implementation of HashMap

The underlying data structure of HashMap:

JDK 1.7 and earlier: array + linked list. JDK 1.8: array + linked list + red-black tree.

  1. Static constants (the most common ones)

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // default capacity: 16
    static final int MAXIMUM_CAPACITY = 1 << 30;        // maximum capacity
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // default load factor
    
  2. The node type stored in the hash bucket array

    //Node is an inner class of HashMap that implements the Map.Entry interface; its constructor takes four parameters: hash, key, value and next.
    //It also defines getKey(), getValue(), toString() and setValue(), and overrides hashCode() and equals().
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }
        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }
        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }
        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }
    
  3. Constructors (four in total)

    //The first takes an initial capacity and a load factor
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity); // rounded up to the next power of two
    }
    //The second takes only an initial capacity
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }
    //The third takes no arguments
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
    //The fourth takes an existing Map whose entries are copied in
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }
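
    For reference, tableSizeFor(initialCapacity), stored into threshold above, rounds the requested capacity up to the nearest power of two. This is the JDK 8 implementation:

    static final int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so an exact power of two maps to itself
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // n now has every bit below its highest set bit filled in
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    For example, tableSizeFor(17) returns 32, and tableSizeFor(16) returns 16.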
    
  4. hash function

    static final int hash(Object key) {
        int h;
        //XOR the high 16 bits of hashCode() into the low 16 bits; a null key always hashes to 0 (see 3.7)
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    
  5. put source code

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    //onlyIfAbsent == true means an existing value is NOT replaced; evict can be ignored here, it only distinguishes a normal put from populating the map during construction
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        //If the table is null or its length is 0, call resize() to allocate it
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        //Compute the bucket index from the hash; if that bucket is empty, just create a new node there
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        //Otherwise the bucket already holds at least one node
        else {
            Node<K,V> e; K k;
            //If the key matches the first node in the bucket, remember that node; its value is overwritten below
            if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            //If the first node is a tree node, insert into the red-black tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            //Otherwise walk the linked list: append the new node at the tail, converting the list to a red-black tree if it grows past TREEIFY_THRESHOLD; if the key is found along the way, its value is overwritten below
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        //After a successful insert, resize if the number of mappings exceeds the threshold (capacity * load factor)
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
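
    A small sketch of put's observable behaviour (the return value is the previous mapping, and one null key is allowed):

    import java.util.HashMap;
    import java.util.Map;

    public class PutDemo {
        public static void main(String[] args) {
            Map<String, Integer> map = new HashMap<>();
            System.out.println(map.put("a", 1));   // null: no previous mapping
            System.out.println(map.put("a", 2));   // 1: the old value is returned and replaced
            System.out.println(map.put(null, 3));  // null key is allowed; it always lands in bucket 0
            System.out.println(map.putIfAbsent("a", 9)); // 2: onlyIfAbsent == true, so the value is kept
            System.out.println(map); // e.g. {null=3, a=2}
        }
    }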
    
  6. get source code

    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        //If the table is null or empty, or the bucket at the computed index holds no node, skip the body and return null
        if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) {
            //Check the first node in the bucket: if its key matches, return it; otherwise, if the first node is a tree node, search the red-black tree, else walk the rest of the linked list
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
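
    One practical consequence worth mentioning in interviews: get returning null is ambiguous, since it can mean either "no mapping" or "mapped to null". A short sketch:

    import java.util.HashMap;
    import java.util.Map;

    public class GetDemo {
        public static void main(String[] args) {
            Map<String, Integer> map = new HashMap<>();
            map.put("present", null);
            System.out.println(map.get("present"));         // null (mapped to null)
            System.out.println(map.get("absent"));          // null (no mapping at all)
            System.out.println(map.containsKey("present")); // true: disambiguates the two cases
            System.out.println(map.containsKey("absent"));  // false
        }
    }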
    

3.3 Resizing (capacity expansion) mechanism of HashMap

The following is the (slightly simplified) JDK 1.7 source:

void resize(int newCapacity) {                  //Takes the new capacity
    Entry[] oldTable = table;                   //Reference to the Entry array before resizing
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {      //If the old capacity has already reached the maximum (2^30)
        threshold = Integer.MAX_VALUE;          //Set the threshold to Integer.MAX_VALUE (2^31 - 1) so no further resize ever happens
        return;
    }
    Entry[] newTable = new Entry[newCapacity];  //Allocate a new Entry array
    transfer(newTable);                         //Move the existing entries into the new Entry array
    table = newTable;                           //Point HashMap's table field at the new array
    threshold = (int)(newCapacity * loadFactor);//Recompute the threshold
}

What did Java 8 optimize in resizing?

In Java 7, each element's bucket index is effectively recomputed from scratch on every resize. But because HashMap capacities are powers of two, after the capacity doubles every element either stays at its original index or moves to (original index + old capacity). Since the bucket index is computed as (length - 1) & hash, there is no need to recompute anything: it suffices to test the single hash bit that becomes significant after doubling. If that bit is 0 the element stays; if it is 1 the element moves forward by the old capacity. A sketch of this check follows.
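
A minimal sketch, assuming a table doubling from 16 to 32 (the sample hash values are arbitrary):

    public class ResizeBitDemo {
        public static void main(String[] args) {
            int oldCap = 16;                // old table length
            int[] hashes = {5, 21, 37, 53}; // sample hash values
            for (int hash : hashes) {
                int oldIndex = hash & (oldCap - 1);
                int newIndex = hash & (2 * oldCap - 1);
                // JDK 8 only tests the newly significant bit instead of recomputing:
                boolean stays = (hash & oldCap) == 0;
                System.out.printf("hash=%d old=%d new=%d -> %s%n",
                        hash, oldIndex, newIndex,
                        stays ? "stays" : "moves to old + " + oldCap);
            }
        }
    }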

Why must HashMap's capacity be a power of 2?

Because only when length is a power of two does hash % length equal (length - 1) & hash (for non-negative hash values), and replacing the modulo with a bitwise AND is much cheaper. A quick check of the identity follows.
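
A short verification sketch (the tested range is arbitrary):

    public class PowerOfTwoDemo {
        public static void main(String[] args) {
            int length = 16; // a power of two
            for (int hash = 0; hash < 100; hash++) {
                // For non-negative hash and power-of-two length the two expressions agree
                if (hash % length != (hash & (length - 1))) {
                    throw new AssertionError("mismatch at " + hash);
                }
            }
            System.out.println("hash % length == hash & (length - 1) for all tested values");
        }
    }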

3.4 Why HashMap is not thread-safe

HashMap's problems under concurrency fall mainly into two categories:

  1. Lost updates and data inconsistency caused by concurrent put (see the sketch after the explanation below)

For example, suppose threads A and B both insert into the same HashMap. Thread A computes the bucket index for its entry and reads the current head node of that bucket, then its time slice runs out. Thread B is scheduled, computes the same bucket index, and successfully inserts its entry, replacing the head. When thread A resumes, it still holds the now-stale head node and, unaware of B's insert, links its new node over it. Thread B's entry silently disappears, leaving the map in an inconsistent state.
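
A sketch that usually makes the lost updates visible (thread count and key ranges are arbitrary; since this is a data race, the exact outcome varies per run, and the map may even end up internally corrupted):

    import java.util.HashMap;
    import java.util.Map;

    public class HashMapRaceDemo {
        public static void main(String[] args) throws InterruptedException {
            Map<Integer, Integer> map = new HashMap<>();
            // Two threads insert disjoint key ranges concurrently
            Thread a = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) map.put(i, i);
            });
            Thread b = new Thread(() -> {
                for (int i = 100_000; i < 200_000; i++) map.put(i, i);
            });
            a.start(); b.start();
            a.join(); b.join();
            // A correct map would always report 200000; HashMap frequently
            // reports less because concurrent puts overwrite each other
            System.out.println("size = " + map.size());
        }
    }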

  2. resize causing an infinite loop (this no longer occurs in JDK 1.8)

This happens in JDK 1.7 when HashMap resizes automatically. If two threads simultaneously detect that the number of elements exceeds capacity × load factor, both call resize() from within put() and both rehash the same linked list concurrently. Because JDK 1.7's transfer() inserts nodes at the head of the new buckets, a resize reverses the order of each list, and the interleaving of the two threads can leave two nodes pointing at each other, producing a circular linked list. A later get() that walks that bucket then loops forever.

3.5 Why HashMap uses a red-black tree rather than other balanced trees

Because a red-black tree has the same O(log n) search complexity as other balanced trees, but rebalancing is cheaper: an insertion needs at most two rotations (a deletion at most three), whereas a more strictly balanced tree such as an AVL tree may trigger rotations all the way up the path to the root

When will a bucket's linked list be converted into a red-black tree?

When the list length reaches 8 (TREEIFY_THRESHOLD) and the table capacity is at least 64 (MIN_TREEIFY_CAPACITY); if the table is still smaller than that, HashMap resizes instead of treeifying

Why convert at length 8?

Strictly speaking, this is a probabilistic trade-off. Under a reasonably good hash function, the number of nodes in a bucket follows a Poisson distribution, and the JDK source comment puts the probability of a list reaching length 8 at about 0.00000006. Treeification should therefore almost never happen; when it does, the hashing has degenerated badly enough that paying the red-black tree's overhead for O(log n) lookups is worthwhile

3.6 Why HashMap's default load factor is 0.75

If the load factor is set too high, say 1, the table does not resize until it holds as many entries as slots; array utilization is maximal, but hash collisions become frequent and the bucket lists grow long, hurting lookups. If it is set too low, say 0.5, collisions are rare and queries are fast, but half the array space is wasted and resizes happen far more often.

So 0.75 is the compromise between time and space. Concretely, with the default capacity of 16 the threshold is 16 × 0.75 = 12, so the map resizes when the 13th mapping is inserted.

3.7 Why HashMap's hash function XORs the high 16 bits with the low 16 bits

This is tied to how HashMap computes the table index:

n = table.length;
index = (n-1) & hash;

Because the table length is a power of two, the index depends only on the low bits of the hash; the AND with (n - 1) zeroes out all higher bits. Suppose table.length = 2^4 = 16:

(Original figure unavailable; it illustrated ANDing a 32-bit hash with 0b1111, showing that only the low 4 bits of the hash survive in the index.)

As the figure illustrated, only the low 4 bits of the hash value take part in the index calculation, which easily causes collisions between keys whose hashes differ only in the higher bits. Weighing speed, utility and quality, the designers XOR the high 16 bits into the low 16 bits so that the high bits also influence the index. A sketch of the effect follows.
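
A small demonstration using two hand-picked hash values that differ only in their high 16 bits (the values are illustrative):

    public class HashSpreadDemo {
        // HashMap's perturbation: XOR the high 16 bits into the low 16 bits
        static int spread(int h) {
            return h ^ (h >>> 16);
        }

        public static void main(String[] args) {
            int n = 16;          // table length, a power of two
            int h1 = 0x00050000; // two hashes that differ only in the high 16 bits
            int h2 = 0x00070000;
            System.out.println("without spreading: " + (h1 & (n - 1)) + ", " + (h2 & (n - 1))); // 0, 0 -> collision
            System.out.println("with spreading:    " + (spread(h1) & (n - 1)) + ", " + (spread(h2) & (n - 1))); // 5, 7
        }
    }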

3.8 Differences between HashMap and ConcurrentHashMap

3.9 Differences between ConcurrentHashMap in JDK 1.7 and 1.8, the relevant source code, and how it achieves thread safety

For 3.8 and 3.9, the blog post "Implementation principle and analysis of ConcurrentHashMap" is recommended reading.
