Data structures and algorithms: sorting

1, Simple sort

Sorting is a very common requirement in programs: we are given some data elements and must order them according to a rule. For example, we might query a set of orders and sort them by order date, or query a set of products and sort them by price. In this article we look at some common sorting algorithms.

The Java Development Kit (JDK) already provides implementations of many data structures and algorithms, such as List, Set, Map and Math, all exposed as APIs. The advantage of this approach is that the code is written once and used in many places. Following the JDK's example, we also encapsulate each algorithm in a class, which means that before writing any Java code we first design the API and then implement it.

We will explain every data structure and algorithm in this way: API design first, then the Java implementation.

1.1. The Comparable interface

Sorting always involves comparing elements. Java provides the Comparable interface for defining the ordering rule of a class. Here we briefly review Comparable with a small example.

Requirements:

  • Define a student class with username and age attributes, and provide the comparison rule (by age) through the Comparable interface
  • Define a test class testcomparable with a method Comparable getMax(Comparable c1, Comparable c2), and use it to complete the test

public class student implements Comparable<student>{
    private String username;
    private int age;
    public String getUsername() {
        return username;
    }
    public void setUsername(String username) {
        this.username = username;
    }

    public int getAge() {
        return age;
    }
    public void setAge(int age) {
        this.age = age;
    }
    @Override
    public String toString() {
        return "student{" +
                "username='" + username + '\'' +
                ", age=" + age +
                '}';
    }
    @Override
    public int compareTo(student o) {
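        // Integer.compare avoids the overflow that subtraction can cause for extreme int values
        return Integer.compare(this.age, o.getAge());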
    }
}

public class testcomparable {
    public static void main(String[] args) {
        // Create two student objects and call the getmax method to complete the test
        student s1 = new student();
        s1.setUsername("Zhang San");
        s1.setAge(17);

        student s2 = new student();
        s2.setUsername("Li Si");
        s2.setAge(19);

        Comparable res = getMax(s1, s2);
        System.out.println(res);
    }
    public static Comparable getMax(Comparable c1, Comparable c2){
        int result = c1.compareTo(c2);
        if (result >= 0){
            return c1;
        }else {
            return c2;
        }
    }
}
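
The getMax method above uses the raw Comparable type, so the compiler cannot check that the two arguments are actually comparable with each other. As a minimal sketch, assuming we are free to change the signature, a generic version moves that check to compile time. The names GenericMax and max are hypothetical and chosen only for this illustration.

```java
// Illustrative sketch: GenericMax and max are hypothetical names, not part of the original example.
class GenericMax {
    // Returns the larger of two values. On a tie the first argument is returned,
    // mirroring the "result >= 0" check in getMax.
    static <T extends Comparable<? super T>> T max(T c1, T c2) {
        return c1.compareTo(c2) >= 0 ? c1 : c2;
    }

    public static void main(String[] args) {
        System.out.println(max(17, 19));          // Integer implements Comparable<Integer>; prints 19
        System.out.println(max("pear", "apple")); // String compares lexicographically; prints pear
    }
}
```

With this signature, a call such as max(17, "apple") is rejected by the compiler instead of failing at run time with a ClassCastException.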

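1.2. Bubble sort

Sorting principle:

  1. Compare adjacent elements. If the former element is larger than the latter, exchange the two elements.
  2. Do the same for every pair of adjacent elements, from the first pair to the last. After one pass, the element in the final position is the maximum. Repeat the pass over the remaining elements until the whole array is ordered.

Bubble sort API design:

  • Class name: Bubble
  • public static void sort(Comparable[] a): sorts the elements in the array
  • private static boolean greater(Comparable v, Comparable w): tests whether v is greater than w
  • private static void exch(Comparable[] a, int i, int j): swaps the elements at indexes i and j

Code implementation of bubble sort: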

public class Bubble {
   /*
   Sorts the elements in the array
    */
    public static void sort(Comparable[] a){
        for (int i= a.length-1;i>0; i--){
            for (int j = 0; j < i; j++) {
                if (greater(a[j],a[j+1])){
                    exch(a,j,j+1);
                }
            }
        }
    }
    /*
    Compare whether the v element is larger than the w element
     */
    private static boolean greater(Comparable v,Comparable w){

        return v.compareTo(w)>0;
    }
    /*
    Array elements i and j swap positions
     */
    private static void exch(Comparable[] a, int i, int j){
        Comparable temp;
        temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}

import java.util.Arrays;

// Test class (the name BubbleTest is illustrative)
public class BubbleTest {
    public static void main(String[] args) {
        // Use Integer[] rather than int[]: Integer implements Comparable, the primitive int does not
        Integer[] arr = {4, 1, 6, 66, 46, 33, 3, 5};
        Bubble.sort(arr);
        System.out.println(Arrays.toString(arr));
    }
}



Time complexity analysis of bubble sort

Bubble sort uses a nested for loop, and the body of the inner loop is the code that actually does the sorting. Analyzing the time complexity of bubble sort therefore mainly means counting how often the inner loop body runs. In the worst case, when the elements to be sorted are in reverse order such as {6,5,4,3,2,1}:

  • The number of element comparisons is: (N-1)+(N-2)+(N-3)+...+2+1=((N-1)+1)*(N-1)/2=N^2/2-N/2;
  • The number of element exchanges is: (N-1)+(N-2)+(N-3)+...+2+1=((N-1)+1)*(N-1)/2=N^2/2-N/2;
  • The total number of operations is: (N^2/2-N/2)+(N^2/2-N/2)=N^2-N.

According to the big-O derivation rule, only the highest-order term of the function is kept, so the time complexity of bubble sort is O(N^2).
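
The worst-case count can be checked by instrumenting the loops. The sketch below is an illustration rather than part of the Bubble class: it works on a plain int[] for brevity, and the name BubbleCount is hypothetical. It counts comparisons and exchanges separately for the reverse-ordered input {6,5,4,3,2,1}.

```java
// Illustrative sketch: BubbleCount is a hypothetical helper that instruments bubble sort.
class BubbleCount {
    // Sorts a in place and returns {comparisons, exchanges}.
    static long[] sortAndCount(int[] a) {
        long comparisons = 0;
        long exchanges = 0;
        for (int i = a.length - 1; i > 0; i--) {
            for (int j = 0; j < i; j++) {
                comparisons++;
                if (a[j] > a[j + 1]) {
                    int temp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = temp;
                    exchanges++;
                }
            }
        }
        return new long[] {comparisons, exchanges};
    }

    public static void main(String[] args) {
        long[] counts = sortAndCount(new int[] {6, 5, 4, 3, 2, 1});
        // For N = 6 in reverse order: 15 comparisons and 15 exchanges, 30 operations in total
        System.out.println(counts[0] + " comparisons, " + counts[1] + " exchanges");
    }
}
```

For N = 6 this gives 15 comparisons and 15 exchanges, which matches N^2/2-N/2 = 15 for each count and N^2-N = 30 for the sum.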

1.3. Selection sort

Sorting principle:

  1. During each pass, assume that the element at the first index of the unsorted part is the minimum and compare it in turn with the values at the other indexes. Whenever the value at the current minimum index is greater than the value at another index, that other index becomes the new minimum index. At the end of the pass, the index of the minimum value has been found
  2. Swap the value at the first index with the value at the index of the minimum

Selection sort API design: the same sort, greater and exch methods as in Bubble.

Code implementation of selection sort:

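// The class name Selection is chosen by analogy with Bubble
public class Selection {
    // Sort the elements in the array
    public static void sort(Comparable[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            // Record the index of the smallest element; by default it is the first element taking part in this pass
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                // Compare the value at the current minimum index minIndex with the value at index j
                if (greater(a[minIndex], a[j])) {
                    minIndex = j;
                }
            }
            // Swap the smallest element (at index minIndex) with the element at index i
            exch(a, minIndex, i);
        }
    }
    // Helper methods, identical to those in Bubble
    private static boolean greater(Comparable v, Comparable w) { return v.compareTo(w) > 0; }
    private static void exch(Comparable[] a, int i, int j) { Comparable temp = a[i]; a[i] = a[j]; a[j] = temp; }
}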

Time complexity analysis of selection sort:

  • Selection sort uses a nested for loop, in which the outer loop performs the data exchanges and the inner loop performs the data comparisons. We therefore count exchanges and comparisons separately:
  • Number of comparisons: (N-1)+(N-2)+(N-3)+...+2+1=((N-1)+1)*(N-1)/2=N^2/2-N/2;
  • Number of exchanges: N-1
  • Total: N^2/2-N/2+(N-1)=N^2/2+N/2-1;
  • According to the big-O derivation rule, the highest-order term is kept and its constant factor is dropped, so the time complexity is O(N^2);
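
As a quick check of these counts, the following sketch instruments the two loops. It is an illustration under simplifying assumptions: it sorts a plain int[], and the name SelectionCount is hypothetical. Unlike bubble sort, the counts do not depend on the initial order of the input.

```java
// Illustrative sketch: SelectionCount is a hypothetical helper that instruments selection sort.
class SelectionCount {
    // Sorts a in place and returns {comparisons, exchanges}.
    static long[] sortAndCount(int[] a) {
        long comparisons = 0;
        long exchanges = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;
                if (a[minIndex] > a[j]) {
                    minIndex = j;
                }
            }
            // One exchange per pass, even when minIndex == i
            int temp = a[minIndex];
            a[minIndex] = a[i];
            a[i] = temp;
            exchanges++;
        }
        return new long[] {comparisons, exchanges};
    }

    public static void main(String[] args) {
        long[] counts = sortAndCount(new int[] {4, 6, 8, 7, 9, 2});
        // For N = 6: N^2/2 - N/2 = 15 comparisons and N - 1 = 5 exchanges
        System.out.println(counts[0] + " comparisons, " + counts[1] + " exchanges");
    }
}
```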

1.4. Insertion sort

  • Insertion sort is a simple, intuitive and stable sorting algorithm.
  • Insertion sort works much like sorting a hand of playing cards. At first our left hand is empty and the cards lie face down on the table. We then take one card at a time from the table and insert it into the correct position in our left hand. To find the correct position for a card, we compare it with each card already in the hand, from right to left.

Sorting principle:

  • Divide all elements into two groups, sorted and unsorted;
  • Take the first element of the unsorted group and insert it into the sorted group;
  • Traverse the sorted elements in reverse order and compare each with the element to be inserted until an element less than or equal to it is found. Place the element to be inserted right after that element; the elements behind it each move back one position;

Insertion sort API design: the same sort, greater and exch methods as in Bubble.

Code implementation of insertion sort:

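// The class name Insertion is chosen by analogy with Bubble
public class Insertion {
    // Sort the elements in the array
    public static void sort(Comparable[] a) {
        for (int i = 1; i < a.length; i++) {
            // The current element is a[i]. Compare it with the elements before it until one less than or equal to it is found
            for (int j = i; j > 0; j--) {
                // If the value at index j-1 is larger than the value at index j, exchange them;
                // otherwise the right position has been found and the loop ends
                if (greater(a[j - 1], a[j])) {
                    exch(a, j - 1, j);
                } else {
                    break;
                }
            }
        }
    }
    // Helper methods, identical to those in Bubble
    private static boolean greater(Comparable v, Comparable w) { return v.compareTo(w) > 0; }
    private static void exch(Comparable[] a, int i, int j) { Comparable temp = a[i]; a[i] = a[j]; a[j] = temp; }
}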

Time complexity analysis of insertion sort:

  • Insertion sort uses a nested for loop, in which the body of the inner loop is the code that actually does the sorting
  • In the worst case, the number of comparisons is: (N-1)+(N-2)+(N-3)+...+2+1=((N-1)+1)*(N-1)/2=N^2/2-N/2;
  • The number of exchanges is: (N-1)+(N-2)+(N-3)+...+2+1=((N-1)+1)*(N-1)/2=N^2/2-N/2;
  • The total number of operations is: (N^2/2-N/2)+(N^2/2-N/2)=N^2-N;
  • According to the big-O derivation rule, only the highest-order term of the function is kept, so the time complexity of insertion sort is O(N^2)
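
The worst case above is reverse-ordered input. For contrast, the sketch below counts the operations on both reverse-ordered and already ordered input. It is an illustration only: it sorts a plain int[], and the name InsertionCount is hypothetical.

```java
// Illustrative sketch: InsertionCount is a hypothetical helper that instruments insertion sort.
class InsertionCount {
    // Sorts a in place and returns {comparisons, exchanges}.
    static long[] sortAndCount(int[] a) {
        long comparisons = 0;
        long exchanges = 0;
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0; j--) {
                comparisons++;
                if (a[j - 1] > a[j]) {
                    int temp = a[j - 1];
                    a[j - 1] = a[j];
                    a[j] = temp;
                    exchanges++;
                } else {
                    break; // a[j] has reached its position in the sorted group
                }
            }
        }
        return new long[] {comparisons, exchanges};
    }

    public static void main(String[] args) {
        long[] worst = sortAndCount(new int[] {6, 5, 4, 3, 2, 1});
        long[] best = sortAndCount(new int[] {1, 2, 3, 4, 5, 6});
        System.out.println("reverse order: " + worst[0] + " comparisons, " + worst[1] + " exchanges");
        System.out.println("already sorted: " + best[0] + " comparisons, " + best[1] + " exchanges");
    }
}
```

For N = 6 the reverse-ordered input needs 15 comparisons and 15 exchanges, while the ordered input needs only 5 comparisons and no exchanges, which is why insertion sort does well on nearly sorted data.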

2, Advanced sorting

2.1. Shell sort

Sorting principle:

  1. Choose a gap (increment) h and use it to divide the data into groups: elements that are h positions apart belong to the same group;
  2. Run an insertion sort on each group;
  3. Reduce the gap and repeat step 2. The smallest gap is 1, and after that pass the array is sorted.

Shell sort code implementation:

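// The class name Shell is chosen by analogy with Bubble
public class Shell {
    // Sort the elements in the array
    public static void sort(Comparable[] a) {
        //1. Determine the initial value of the gap h
        int h = 1;
        while (h <= a.length / 2) {
            h = 2 * h + 1;
        }
        //2. Shell sort
        while (h >= 1) {
            //2.1. Find the element to be inserted
            for (int i = h; i < a.length; i++) {
                //2.2. Insert it into the ordered sequence of its group
                for (int j = i; j >= h; j -= h) {
                    // The element to be inserted is a[j]; compare a[j] with a[j-h]
                    if (greater(a[j - h], a[j])) {
                        exch(a, j - h, j);
                    } else {
                        // The element has found its position; end the loop
                        break;
                    }
                }
            }
            // Decrease the value of h
            h = h / 2;
        }
    }
    // Helper methods, identical to those in Bubble
    private static boolean greater(Comparable v, Comparable w) { return v.compareTo(w) > 0; }
    private static void exch(Comparable[] a, int i, int j) { Comparable temp = a[i]; a[i] = a[j]; a[j] = temp; }
}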

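After-the-fact time complexity analysis: the running time of Shell sort depends on the gap sequence and is hard to derive in advance, so it is usually assessed after the fact by measuring how long the sort takes on real input.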
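
One way to carry out such a measurement is to time Shell sort and insertion sort on the same reverse-ordered input. The sketch below is only an illustration: the class name SortTiming, the input size of 20000 and the helper methods are assumptions, and the measured times vary with the machine and the JVM, so no expected output is shown.

```java
// Illustrative sketch: SortTiming, its helpers and the input size are assumptions.
class SortTiming {
    static void insertionSort(Integer[] a) {
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && a[j - 1].compareTo(a[j]) > 0; j--) {
                swap(a, j - 1, j);
            }
        }
    }

    static void shellSort(Integer[] a) {
        int h = 1;
        while (h < a.length / 2) {
            h = 2 * h + 1;
        }
        while (h >= 1) {
            for (int i = h; i < a.length; i++) {
                for (int j = i; j >= h && a[j - h].compareTo(a[j]) > 0; j -= h) {
                    swap(a, j - h, j);
                }
            }
            h = h / 2;
        }
    }

    static void swap(Integer[] a, int i, int j) {
        Integer temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    // Builds the reverse-ordered input n, n-1, ..., 1 (the worst case for insertion sort).
    static Integer[] reversed(int n) {
        Integer[] a = new Integer[n];
        for (int i = 0; i < n; i++) {
            a[i] = n - i;
        }
        return a;
    }

    static boolean isSorted(Integer[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1].compareTo(a[i]) > 0) {
                return false;
            }
        }
        return true;
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 20_000;
        Integer[] forShell = reversed(n);
        Integer[] forInsertion = reversed(n);
        System.out.println("Shell sort:     " + timeMillis(() -> shellSort(forShell)) + " ms");
        System.out.println("Insertion sort: " + timeMillis(() -> insertionSort(forInsertion)) + " ms");
        System.out.println("Both sorted: " + (isSorted(forShell) && isSorted(forInsertion)));
    }
}
```

A single run like this is a rough indication only; JIT warm-up and garbage collection can distort short measurements, so repeating the run several times gives a more reliable picture.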

2.2. Merge sort

2.2.1. Recursion

Definition: a method that calls itself inside its own body is called recursive.

Note: a recursive method cannot call itself without limit. It must have a boundary condition that ends the recursion, because every recursive call opens a new frame in stack memory and executes the method again. If the recursion goes too deep, it easily causes a stack overflow.
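
As a minimal sketch of a recursive method with a boundary condition, consider computing a factorial. The class name RecursionDemo is hypothetical; the example is not taken from the original text.

```java
// Illustrative sketch: RecursionDemo is a hypothetical name.
class RecursionDemo {
    // Computes n! recursively. The check n <= 1 is the boundary condition:
    // without it the method would call itself until the stack overflows.
    static long factorial(int n) {
        if (n <= 1) {
            return 1;
        }
        return n * factorial(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // 5 * 4 * 3 * 2 * 1 = 120
    }
}
```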

2.2.2. Merge sort

Merge sort is an efficient sorting algorithm built on the merge operation, and it is a very typical application of divide and conquer. Ordered subsequences are merged to obtain a completely ordered sequence: each subsequence is ordered first, and then the subsequences are merged so that the result is ordered as well. Merging two ordered lists into one is called a two-way merge.

Sorting principle:
  1. Split the data into two subgroups with as equal a number of elements as possible, and keep splitting each subgroup until every subgroup contains a single element;
  2. Merge two adjacent subgroups into one larger ordered group;
  3. Repeat step 2 until there is only one group.

Merge sort API design: a public sort(Comparable[] a) method, a private overload sort(Comparable[] a, int lo, int hi), a merge(Comparable[] a, int lo, int mid, int hi) method, the helpers less and exch, and an auxiliary array assist.

Merge sort time complexity analysis:

Use a tree to describe the merging. If an array has 8 elements, it is halved repeatedly until the sub-arrays contain a single element. This takes log2(8) = 3 rounds of splitting, so the tree has 3 levels. Level k, counted from the top starting at k = 0, contains 2^k sub-arrays, each of length 2^(3-k), and merging into one such sub-array needs at most 2^(3-k) comparisons. Each level therefore needs at most 2^k * 2^(3-k) = 2^3 comparisons, and the 3 levels need 3 * 2^3 in total.
With n elements, merge sort splits log2(n) times, so the tree has log2(n) levels. Replacing the 3 in 3 * 2^3 with log2(n) gives log2(n) * 2^(log2(n)) = log2(n) * n. By the big-O derivation rule the base of the logarithm is ignored, so the time complexity of merge sort is O(n log n);
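
The bound can be checked on a small input by counting comparisons during the merges. The sketch below is an illustration only: it sorts a plain int[], and the name MergeCount and its static counter are hypothetical. For n = 8 the count must not exceed log2(8) * 8 = 24.

```java
// Illustrative sketch: MergeCount and its fields are hypothetical names.
class MergeCount {
    static long comparisons;
    static int[] assist;

    static void sort(int[] a) {
        comparisons = 0;
        assist = new int[a.length];
        sort(a, 0, a.length - 1);
    }

    static void sort(int[] a, int lo, int hi) {
        if (hi <= lo) {
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid + 1, hi);
        merge(a, lo, mid, hi);
    }

    static void merge(int[] a, int lo, int mid, int hi) {
        int i = lo;
        int p1 = lo;
        int p2 = mid + 1;
        while (p1 <= mid && p2 <= hi) {
            comparisons++;
            // On a tie the element from the left group is taken first
            assist[i++] = (a[p1] <= a[p2]) ? a[p1++] : a[p2++];
        }
        while (p1 <= mid) {
            assist[i++] = a[p1++];
        }
        while (p2 <= hi) {
            assist[i++] = a[p2++];
        }
        for (int index = lo; index <= hi; index++) {
            a[index] = assist[index];
        }
    }

    public static void main(String[] args) {
        int[] data = {8, 4, 5, 7, 1, 3, 6, 2};
        sort(data);
        System.out.println(comparisons + " comparisons, upper bound log2(8) * 8 = 24");
    }
}
```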

Merge sort code:

// Auxiliary array required for merging
private static Comparable[] assist;

/*
 Compare whether the v element is smaller than the w element
 */
private static boolean less(Comparable v, Comparable w){
    return v.compareTo(w)<0;
}

/*
Array elements i and j swap positions
 */
private static void exch(Comparable[] a, int i, int j){
    Comparable t = a[i];
    a[i] = a[j];
    a[j] = t;
}

/*
  Sort the elements in array a
 */
public static void sort(Comparable[] a){
    //1. Initialize the auxiliary array assist
    assist = new Comparable[a.length];
    //2. Define a lo variable and hi variable to record the smallest index and the largest index in the array respectively
    int lo = 0;
    int hi = a.length-1;
    //3. Call the sort overloaded method to sort the elements in array a from index lo to index hi
    sort(a, lo, hi);

}

/*
Sort the elements from lo to hi in a in the array
 */
private static void sort(Comparable[] a, int lo, int hi){
    //Boundary check: a range with at most one element is already sorted
    if (hi<=lo){
        return;
    }
    //Divide the data between lo and hi into two groups
    int mid = lo + (hi-lo)/2;

    //Sort each group of arrays separately
    sort(a, lo, mid);
    sort(a, mid+1, hi);

    //Then merge the data in the two groups
    merge(a, lo, mid, hi);
}

/*
In the array, from lo to mid is a group, and from mid+1 to hi is a group. Merge the two groups of data
 */
private static void merge(Comparable[] a, int lo, int mid, int hi){
    // 1. Define three pointers
    int i = lo;
    int p1 = lo;
    int p2 = mid+1;

    // 2. Traverse, moving the p1 and p2 pointers: compare the values at the two indexes and put the smaller one at the next index of the auxiliary array.
    //    On a tie the element from the left group is taken first, which keeps the sort stable
    while(p1<=mid && p2<=hi){
        if (!less(a[p2],a[p1])){
            assist[i++] = a[p1++];
        }else {
            assist[i++] = a[p2++];
        }
    }

    // 3. Traversal. If the p1 pointer is not completed, move the p1 pointer in sequence and put the corresponding element at the corresponding index of the auxiliary array
    while (p1<=mid){
        assist[i++]=a[p1++];
    }

    // 4. Traversal. If the p2 pointer is not finished, move the p2 pointer in sequence and put the corresponding element at the corresponding index of the auxiliary array
    while (p2<=hi){
        assist[i++]=a[p2++];
    }

    // 5. Copy the elements of the auxiliary array back to the original array (index hi included)
    for (int index = lo; index <= hi; index++) {
        a[index] = assist[index];
    }
}

2.3. Quick sort

Sorting principle:

  1. First, choose a boundary value and use it to divide the array into a left part and a right part;
  2. Put the data greater than or equal to the boundary value on the right of the array and the data less than the boundary value on the left. All elements in the left part are then less than or equal to the boundary value, and all elements in the right part are greater than or equal to it;
  3. The left and right parts can now be sorted independently. For the left part, choose another boundary value and divide that part into left and right again, placing the smaller values on the left and the larger values on the right. The right part is processed in the same way;
  4. Repeating this process gives a recursive definition: sort the left part recursively, then sort the right part recursively. When both parts are sorted, the whole array is sorted.

Partition principle:

  1. Choose a reference value, and use two pointers that point to the head and the tail of the array respectively;
  2. Search from the tail towards the head for an element smaller than the reference value; stop when one is found and record the pointer position;
  3. Search from the head towards the tail for an element larger than the reference value; stop when one is found and record the pointer position;
  4. Exchange the elements at the left pointer position and the right pointer position;
  5. Repeat steps 2, 3 and 4 until the left pointer meets or passes the right pointer.
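
Quick sort code implementation:

// The class and sort methods follow the pattern of the merge sort code
public class Quick {
    // Sort the elements in array a
    public static void sort(Comparable[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sort the elements of a from index lo to index hi
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) {
            return;
        }
        // After partitioning, the boundary value sits at its final index
        int partition = partition(a, lo, hi);
        // Sort the left group and the right group independently
        sort(a, lo, partition - 1);
        sort(a, partition + 1, hi);
    }

    // Partition a[lo..hi] around the boundary value a[lo] and return its final index
    private static int partition(Comparable[] a, int lo, int hi) {
        // 1. Determine the boundary value
        Comparable key = a[lo];
        // Two pointers: left starts at the lowest index, right at one past the highest index
        int left = lo;
        int right = hi + 1;

        // 2. Partition
        while (true) {
            // Scan from right to left and stop at an element that is not larger than the boundary value
            while (less(key, a[--right])) {
                if (right == lo) {
                    break;
                }
            }
            // Scan from left to right and stop at an element that is not smaller than the boundary value
            while (less(a[++left], key)) {
                if (left == hi) {
                    break;
                }
            }
            // If left >= right, the scan is complete; otherwise exchange the two elements
            if (left >= right) {
                break;
            } else {
                exch(a, left, right);
            }
        }
        // Move the boundary value to its final position
        exch(a, lo, right);
        return right;
    }

    private static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; }

    private static void exch(Comparable[] a, int i, int j) { Comparable t = a[i]; a[i] = a[j]; a[j] = t; }
}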

Quick sort compared with merge sort:

Similarities:

Like merge sort, quick sort is a divide-and-conquer sorting algorithm. It divides an array into two sub-arrays and sorts the two parts independently.

Differences:
  1. Quick sort and merge sort are complementary. Merge sort splits the array into two sub-arrays, sorts each of them, and then merges the ordered sub-arrays to order the whole array. Quick sort partitions the array so that once both sub-arrays are sorted, the whole array is automatically ordered, with no merge step;
  2. In merge sort the array is always split into two equal halves, and the recursive calls happen before the whole array is processed. In quick sort the split position depends on the contents of the array, and the recursive calls happen after the whole array has been processed.

2.4. Stability of sorting

Suppose array arr contains several elements, where element A is equal to element B and A comes before B. If a sorting algorithm guarantees that A is still in front of B after sorting, the algorithm is said to be stable. If a group of data only needs to be sorted once, stability generally does not matter; it matters when the data is sorted several times. For example, suppose the content to be sorted is a group of product objects. The first sort orders them by price from low to high, and the second sort orders them by sales volume from high to low. If the second sort uses a stable algorithm, objects with the same sales volume remain in price order, and only objects with different sales volumes need to be reordered. This preserves the meaning of the first sort and also reduces system overhead.
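
The two-pass example can be sketched with the JDK's own List.sort, which is documented to be stable. The Product class, its fields and the sample data below are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: StableSortDemo, Product and the sample data are hypothetical.
class StableSortDemo {
    static final class Product {
        final String name;
        final int price;
        final int sales;

        Product(String name, int price, int sales) {
            this.name = name;
            this.price = price;
            this.sales = sales;
        }
    }

    // Sorts by price (low to high), then by sales volume (high to low),
    // and returns the product names in the final order.
    static List<String> sortTwice(List<Product> products) {
        List<Product> copy = new ArrayList<>(products);
        copy.sort(Comparator.comparingInt((Product p) -> p.price));
        copy.sort(Comparator.comparingInt((Product p) -> p.sales).reversed());
        List<String> names = new ArrayList<>();
        for (Product p : copy) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
                new Product("A", 30, 100),
                new Product("B", 10, 100),
                new Product("C", 20, 500),
                new Product("D", 5, 100));
        // C has the highest sales; D, B and A share the same sales and stay in price order
        System.out.println(sortTwice(products)); // prints [C, D, B, A]
    }
}
```

Because the second sort is stable, D, B and A keep the price order produced by the first sort. An unstable second sort would be free to return them in any order.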

Stability of common sorting algorithms:

  • Bubble sort: elements are exchanged only when arr[i] > arr[i+1]; equal elements are never exchanged. Therefore, bubble sort is a stable sorting algorithm.
  • Selection sort: selection sort picks the smallest remaining element for each position. For example, take the data (5(1), 8, 5(2), 2, 9). The smallest element selected in the first pass is 2, so 5(1) exchanges positions with 2. Now 5(1) comes after 5(2) and stability is broken. Therefore, selection sort is an unstable sorting algorithm.
  • Insertion sort: the comparison starts from the end of the ordered sequence, that is, the element to be inserted is first compared with the largest element already ordered. If it is not smaller, it is placed directly behind that element; otherwise the search continues forward until the insertion position is found. When an element equal to the one being inserted is met, the new element is placed after it. The relative order of equal elements therefore does not change, so insertion sort is stable.
  • Shell sort: Shell sort runs insertion sorts with different gap sizes. Although a single insertion sort is stable and does not change the relative order of equal elements, equal elements can fall into different groups and move independently in their own insertion sorts, which can change their relative order. Therefore, Shell sort is unstable.
  • Merge sort: during merging, elements keep their relative order within each group, and as long as a tie between the two groups is resolved by taking the element from the left group first, equal elements are never reordered. Merge sort is stable.
  • Quick sort: quick sort needs a benchmark value. It finds an element smaller than the benchmark value on the right side and an element larger than the benchmark value on the left side, and then exchanges the two. Such a long-distance exchange can move an element past an equal one, which breaks stability. Therefore, quick sort is an unstable algorithm.
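
The selection sort example with the data (5(1), 8, 5(2), 2, 9) can be reproduced by attaching a label to each key. The sketch below is an illustration under stated assumptions: the Item class, the labels and the method names are hypothetical, and only the key takes part in comparisons.

```java
// Illustrative sketch: StabilityDemo, Item and the labels are hypothetical.
class StabilityDemo {
    static final class Item {
        final int key;
        final String label;

        Item(int key, String label) {
            this.key = key;
            this.label = label;
        }
    }

    static Item[] sample() {
        return new Item[] {
            new Item(5, "5(1)"), new Item(8, "8"), new Item(5, "5(2)"),
            new Item(2, "2"), new Item(9, "9")
        };
    }

    static void selectionSort(Item[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[minIndex].key > a[j].key) {
                    minIndex = j;
                }
            }
            swap(a, minIndex, i);
        }
    }

    static void insertionSort(Item[] a) {
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && a[j - 1].key > a[j].key; j--) {
                swap(a, j - 1, j);
            }
        }
    }

    static void swap(Item[] a, int i, int j) {
        Item temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    // Joins the labels with single spaces so the final order is easy to read.
    static String labels(Item[] a) {
        StringBuilder sb = new StringBuilder();
        for (Item item : a) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(item.label);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Item[] bySelection = sample();
        selectionSort(bySelection);
        System.out.println("selection: " + labels(bySelection)); // selection: 2 5(2) 5(1) 8 9

        Item[] byInsertion = sample();
        insertionSort(byInsertion);
        System.out.println("insertion: " + labels(byInsertion)); // insertion: 2 5(1) 5(2) 8 9
    }
}
```

Selection sort ends with 5(2) in front of 5(1), while insertion sort keeps the original order of the two equal keys.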

Tags: Java Algorithm data structure

Posted on Wed, 22 Sep 2021 08:42:02 -0400 by Darkpower