Common sorting algorithms: near-linear sorts

Preface

In the previous chapter I covered the sorting algorithms whose time complexity is O(n²). Today I mainly introduce four algorithms: heap sort, quick sort, merge sort and Shell sort. What they have in common is that the first three run in O(n*logn) time (on average, in the case of quick sort), and Shell sort runs in roughly O(n^1.3) with a good gap sequence, which is close to linear complexity

Heap sort

I already described heap sort in detail when explaining binary trees

Therefore, for convenience, I simply put the link to that post here: Heap sort

Quick sort

Before we talk about quick sort, let's solve the Dutch national flag problem

Dutch national flag problem:

Given an unordered array and a value k, rearrange the array into three regions: the left region holds the elements less than k, the middle region the elements equal to k, and the right region the elements greater than k

Example 1

Input:

[6,2,4,9,8,5,7,5,3,6,5,7,8,5,9];
k = 5;

Output:

[2,4,3,5,5,5,5,7,6,8,7,8,9,9,6]

It can be clearly seen that those less than 5 are on the left, those equal to 5 are in the middle, and those greater than 5 are on the right

Algorithm implementation:

  • Initialize the boundary of the "less than" region, less = -1, and the boundary of the "greater than" region, more = n
  • Set the current pointer cur, starting from index 0
  • If the value at cur is less than k, swap it with the element just after less (index less+1), then less++ and cur++;
  • If the value at cur is equal to k, nothing needs to move, so cur++;
  • If the value at cur is greater than k, swap it with the element just before more (index more-1), then more--, and cur stays where it is. Why does cur stay? Because the element just swapped in from the right has not been examined yet; it may itself be less than, equal to, or greater than k;
  • Stop when cur reaches more, because every element has then been placed in one of the three regions

So the code is:

int less = -1,more = n,cur = 0,k = 5;   //swap(a, b) is assumed to exchange the two ints that a and b point to
while(cur<more)
{
    if(num[cur] < k) swap(&num[cur++],&num[++less]); //Less than k: swap with the element just after less, then advance both less and cur
    else if(num[cur] > k) swap(&num[cur],&num[--more]);//Greater than k: swap with the element just before more, shrink more, and leave cur in place
    else cur++; //Equal to k: leave it where it is and advance cur
}
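
To try the fragment above on Example 1, it can be wrapped in a function. The following is a minimal sketch, not the author's code: the names `swap_int` and `partition3` and the out-parameters `lt_end` and `gt_begin` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: exchange the two ints that a and b point to. */
void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Hypothetical wrapper around the loop above.
 * Rearranges num[0..n-1] into "< k", "== k" and "> k" regions.
 * On return *lt_end is the last index of the "< k" region (-1 if empty)
 * and *gt_begin is the first index of the "> k" region (n if empty). */
void partition3(int num[], int n, int k, int *lt_end, int *gt_begin)
{
    assert(n >= 0 && lt_end != NULL && gt_begin != NULL);

    int less = -1, more = n, cur = 0;
    while (cur < more) {
        if (num[cur] < k)
            swap_int(&num[cur++], &num[++less]);   /* grow the "< k" region */
        else if (num[cur] > k)
            swap_int(&num[cur], &num[--more]);     /* grow the "> k" region, re-examine cur */
        else
            cur++;                                 /* equal to k: stays in the middle */
    }
    *lt_end = less;
    *gt_begin = more;
}
```

Called on the array from Example 1 with k = 5, this leaves the array as [2,4,3,5,5,5,5,7,6,8,7,8,9,9,6], with `*lt_end` equal to 2 and `*gt_begin` equal to 7.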

Now that the Dutch national flag problem is solved, how do we get from it to quick sort?

After one pass of the Dutch national flag partition, everything on the left is less than k, everything on the right is greater than k, and the middle is equal to k. This means the middle region is already in its final position and needs no further work. Only the less region and the more region still have to be sorted

Therefore, quick sort performs the Dutch national flag partition once and then recursively sorts the less region and the more region. The difference is that k is no longer given: we choose it ourselves from the range being sorted

If the range has one element or none, there is nothing left to sort and the recursion stops

void QuickSort(int num[],int l,int r)  //l and r are the inclusive left and right bounds of the range to sort
{
    if(l>=r) return ;  //Zero or one element: nothing to sort
    
    int less = l-1,more = r+1,cur = l,k = num[(l+r)>>1]; //The pivot k is the middle element; (l+r)>>1 is equivalent to (l+r)/2 here
 
    while(cur<more)
    {
        if(num[cur] < k) swap(&num[cur++],&num[++less]); //Less than k: swap with the element just after less, then advance both less and cur
        else if(num[cur] > k) swap(&num[cur],&num[--more]);//Greater than k: swap with the element just before more, shrink more, and leave cur in place
        else cur++; //Equal to k: leave it where it is and advance cur
    }
    
    QuickSort(num,l,less);  //Recursively sort the less region, l to less
    QuickSort(num,more,r);  //Recursively sort the more region, more to r
}

Test:
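
The original test output is not reproduced here. As a stand-in, the sketch below bundles the QuickSort function with the swap helper it relies on and a small checker. The definition of `swap` and the helper `is_sorted` are assumptions for illustration; the post does not show them.

```c
#include <assert.h>

/* Assumed definition of the swap helper that QuickSort calls. */
void swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* QuickSort as described above: three-way partition around the middle element. */
void QuickSort(int num[], int l, int r)
{
    if (l >= r) return;                       /* zero or one element */

    int less = l - 1, more = r + 1, cur = l;
    int k = num[(l + r) >> 1];                /* pivot value */

    while (cur < more) {
        if (num[cur] < k)      swap(&num[cur++], &num[++less]);
        else if (num[cur] > k) swap(&num[cur], &num[--more]);
        else                   cur++;
    }
    QuickSort(num, l, less);                  /* sort the "< k" region */
    QuickSort(num, more, r);                  /* sort the "> k" region */
}

/* Hypothetical checker: returns 1 if num[0..n-1] is in non-decreasing order. */
int is_sorted(const int num[], int n)
{
    assert(n >= 0);
    for (int i = 1; i < n; i++)
        if (num[i - 1] > num[i]) return 0;
    return 1;
}
```

For instance, after `QuickSort(a, 0, 14)` on the 15-element array from Example 1, `is_sorted(a, 15)` returns 1 and the array reads [2,3,4,5,5,5,5,6,6,7,7,8,8,9,9].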

Merge sort

The idea of merge sort is to divide the array into two halves, sort the left half, sort the right half, and then merge the two sorted halves: compare their elements, write them into a temporary array in order of size, and copy the result back into the original array

The merge step (comparing the two already sorted halves, filling the temporary array, then copying back to the original array) works as follows:

  • For the left half, use a pointer (l) that starts at its first element
  • For the right half, use a pointer (r) that starts at its first element
  • If the left arr[l] is less than or equal to the right arr[r], put arr[l] into the temporary array, and then l++
  • If the left arr[l] is greater than the right arr[r], put arr[r] into the temporary array, and then r++
  • When l or r reaches the end of its half, stop comparing, and copy whatever remains in the other half into the temporary array in order

The merge step above assumes that the left and right halves are already sorted. How do we get them sorted?

If there are only two elements, one on the left and one on the right, each half holds a single element, so both halves are trivially sorted

We can then merge those two elements exactly as described above. For the whole array, we keep splitting first and then merge on the way back, and the whole array ends up sorted

How do we write the code? Let's go from simple to complex and write the merge step first. Following the merge steps, the code is as follows:

int tmp[1000] = {0};   //Temporary array; the fixed size assumes the range holds at most 1000 elements
int i,j,k,mid;
mid  = (l+r)>>1;
//The array was first split into two halves, so mid is the boundary: the left half is l to mid, the right half is mid+1 to r
for(i = l,j = mid + 1,k = 0;i<=mid && j<=r;k++)   //Stop as soon as either half is exhausted
{
    if(num[i]<=num[j]) tmp[k] = num[i++]; //Left element is smaller or equal: put num[i] into tmp, then i++
    else tmp[k] = num[j++];               //Right element is smaller: put num[j] into tmp, then j++
}

while(i<=mid) tmp[k++] = num[i++];  //The right half ran out first: copy the rest of the left half into tmp

while(j<=r) tmp[k++] = num[j++];  //The left half ran out first: copy the rest of the right half into tmp

for(i = 0,j = l;i<k;i++,j++) num[j] = tmp[i];      //Copy the temporary array back into the original array
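
The fragment above depends on l, r and num from its surroundings. As a sketch of how it behaves on its own, the same steps can be put in a function. The name `merge_halves`, the explicit `mid` parameter and the `MAX_N` limit are assumptions for illustration; `MAX_N` mirrors the fixed 1000-element temporary array.

```c
#include <assert.h>

#define MAX_N 1000  /* assumed capacity, mirroring the 1000-element tmp array */

/* Hypothetical wrapper: merges the sorted ranges num[l..mid] and num[mid+1..r]
 * into one sorted range num[l..r]. */
void merge_halves(int num[], int l, int mid, int r)
{
    assert(l <= mid && mid <= r && r - l + 1 <= MAX_N);

    int tmp[MAX_N];
    int i = l, j = mid + 1, k = 0;

    while (i <= mid && j <= r) {
        if (num[i] <= num[j]) tmp[k++] = num[i++];   /* left is smaller or equal */
        else                  tmp[k++] = num[j++];   /* right is smaller */
    }
    while (i <= mid) tmp[k++] = num[i++];            /* leftovers from the left half */
    while (j <= r)   tmp[k++] = num[j++];            /* leftovers from the right half */

    for (i = 0, j = l; i < k; i++, j++)              /* copy the merged run back */
        num[j] = tmp[i];
}
```

Merging [1,4,7 | 2,3,9] with `merge_halves(a, 0, 2, 5)` gives [1,2,3,4,7,9]. Taking from the left half on ties (the `<=` comparison) keeps equal elements in their original relative order, which is what makes merge sort stable.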

We have written the merge code. What about the sorting itself? In fact, as hinted above, merge sort has no separate sorting step. The array becomes sorted because the recursion keeps splitting it until the left and right parts each hold a single element, and that is where merging starts to do its work. When that level of recursion returns, the level above no longer has one element on each side but two, and since each side was already put in order by the merge below it, merging them again keeps the result in order. This repeats level by level until the whole array is sorted. So we can draw a conclusion: merge sort does not sort directly; its core is to keep dividing the range downward and, once the bottom is reached, to merge upward until everything is in order

Recursive partition code:

void MergeSort(int num[],int l,int r)
{
    if(l>=r) return ;             //Zero or one element: stop dividing and return to the previous level
    
    int mid = (l+r)>>1;           //Split the range into two halves
    MergeSort(num,l,mid);         //Sort the left half (l to mid) the same way
    MergeSort(num,mid+1,r);       //Sort the right half (mid+1 to r) the same way
    
    //When reading recursion, keep the function's contract in mind: MergeSort(num,l,r) leaves num[l..r] sorted when it returns
    //So after the two calls above both halves are sorted, and all that remains is to merge them
    int i = 0,j = 0,k = 0;
    int tmp[1000] = {0};          //Fixed-size temporary array: assumes the range holds at most 1000 elements
    for(i = l,j = mid + 1,k = 0;i<=mid && j<=r;k++)   //Stop as soon as either half is exhausted
    {
        if(num[i]<=num[j]) tmp[k] = num[i++]; //Left element is smaller or equal: put num[i] into tmp, then i++
        else tmp[k] = num[j++];               //Right element is smaller: put num[j] into tmp, then j++
    }
    while(i<=mid) tmp[k++] = num[i++];  //The right half ran out first: copy the rest of the left half into tmp
    while(j<=r) tmp[k++] = num[j++];  //The left half ran out first: copy the rest of the right half into tmp
    for(i = 0,j = l;i<k;i++,j++) num[j] = tmp[i];      //Copy the temporary array back into the original array
}

Test:
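
The original test output is not reproduced here. One thing a larger test would run into is the fixed 1000-element temporary array inside MergeSort. The sketch below is a variation, not the author's code: it allocates a single scratch buffer with malloc and passes it down the recursion. The names `merge_sort_buf` and `merge_sort` are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Sorts num[l..r]; tmp must have room for at least r - l + 1 ints. */
void merge_sort_buf(int num[], int tmp[], int l, int r)
{
    if (l >= r) return;                      /* zero or one element */

    int mid = l + (r - l) / 2;               /* midpoint without overflow */
    merge_sort_buf(num, tmp, l, mid);        /* sort the left half */
    merge_sort_buf(num, tmp, mid + 1, r);    /* sort the right half */

    int i = l, j = mid + 1, k = 0;
    while (i <= mid && j <= r)
        tmp[k++] = (num[i] <= num[j]) ? num[i++] : num[j++];
    while (i <= mid) tmp[k++] = num[i++];    /* leftovers from the left half */
    while (j <= r)   tmp[k++] = num[j++];    /* leftovers from the right half */

    for (i = 0, j = l; i < k; i++, j++)      /* copy the merged run back */
        num[j] = tmp[i];
}

/* Hypothetical entry point: sorts num[0..n-1].
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int merge_sort(int num[], int n)
{
    assert(n >= 0);
    if (n < 2) return 0;

    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL) return -1;

    merge_sort_buf(num, tmp, 0, n - 1);
    free(tmp);
    return 0;
}
```

Allocating once at the top also avoids zero-initializing a fresh 1000-element array in every recursive call.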

Shell sort

Shell sort is an advanced version of insertion sort. Let's first review: what kind of data is insertion sort good at? The answer is data that is already partially ordered. Shell sort makes some adjustments before the final insertion sort so that the data is as close to ordered as possible

So how do we make the data partially ordered? This is the idea that Shell, a brilliant man, came up with. He split the data into groups of elements that are a fixed distance, gap, apart, and ran insertion sort within each group

For example, take the data [9,8,7,6,5,4,3,2,1,5,4,3] and a gap of 3. The elements at indices 0, 3, 6, 9 form one group (the blue line), those at indices 1, 4, 7, 10 a second group (the orange line), and those at indices 2, 5, 8, 11 a third group (the black line). Each group is insertion-sorted in turn:

  • The blue line holds 9, 6, 3, 5. After sorting these four values they are 3, 5, 6, 9, and the array becomes [3,8,7,5,5,4,6,2,1,9,4,3]
  • The orange line holds 8, 5, 2, 4. After sorting these four values they are 2, 4, 5, 8, and the array becomes [3,2,7,5,4,4,6,5,1,9,8,3]
  • The black line holds 7, 4, 1, 3. After sorting these four values they are 1, 3, 4, 7, and the array becomes [3,2,1,5,4,3,6,5,4,9,8,7]

It can be seen that the data is now partially ordered: [3,2,1,5,4,3,6,5,4,9,8,7]
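
The per-line sorting can be checked with a small sketch that insertion-sorts one line (one group of elements gap apart) at a time. The function name `sort_chain` and its `start` parameter are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical helper: insertion-sorts the elements at indices
 * start, start+gap, start+2*gap, ... of num[0..n-1] and leaves
 * every other element untouched. */
void sort_chain(int num[], int n, int start, int gap)
{
    assert(gap > 0 && start >= 0);

    for (int i = start + gap; i < n; i += gap) {
        int target = num[i];                 /* element being inserted */
        int j = i;
        while (j - gap >= start && num[j - gap] > target) {
            num[j] = num[j - gap];           /* shift the larger element one gap to the right */
            j -= gap;
        }
        num[j] = target;                     /* drop target into the hole */
    }
}
```

With the data [9,8,7,6,5,4,3,2,1,5,4,3], calling `sort_chain(a, 12, 0, 3)`, then `sort_chain(a, 12, 1, 3)`, then `sort_chain(a, 12, 2, 3)` sorts the blue, orange and black lines in turn and ends with [3,2,1,5,4,3,6,5,4,9,8,7].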

If we then shrink gap (to 2, for example) and repeat the same operation, and keep going until gap becomes 1, which is an ordinary insertion sort, all of these passes together make up Shell sort

Whenever we write code we should go from simple to complex, so what is the simplest step of Shell sort? It is sorting the elements that are gap apart

Following the steps above, we would first sort the whole blue line, then the orange line, and finally the black line. Writing the code that way is tedious, though. In fact a simpler idea achieves the same effect. What is the idea?

We walk through the array once, element by element, and insert each element into place among the elements that lie gap positions before it. What does this mean?

  • Start with the second element of the blue line (index gap), which is 6. It is compared with the first element of the blue line, 9, and moved in front of it, so the blue line now begins 6, 9
  • Next comes the second element of the orange line (index gap+1), which is 5. It is compared with the first element of the orange line, 8, so the orange line now begins 5, 8
  • Next comes the second element of the black line (index gap+2), which is 4. It is compared with the first element of the black line, 7, so the black line now begins 4, 7
  • Next comes the third element of the blue line, which is 3. It is inserted among the first two elements of the blue line, giving 3, 6, 9
  • Next comes the third element of the orange line, which is 2. It is inserted among the first two elements of the orange line, giving 2, 5, 8
  • Next comes the third element of the black line, which is 1. It is inserted among the first two elements of the black line, giving 1, 4, 7
  • The fourth elements of the three lines, 5, 4 and 3, are inserted in the same way, giving 3, 5, 6, 9 and 2, 4, 5, 8 and 1, 3, 4, 7

That is, we start at index gap and traverse to the end of the array. Along the way, each element is insertion-sorted against the elements gap apart before it. The traversal cycles through the blue line, orange line, black line, blue line, orange line, black line... and so achieves the same result as sorting the whole blue line, then the whole orange line, then the whole black line

As before, how do we write the code? Going from simple to complex, first fix gap at a value, taking 3 as an example

int gap = 3;
for(int i = gap;i<n;i++)
{
    int min_index = i;            //Position where target will finally be inserted
    int target = num[i];          //Element being inserted
    for(int j = i;j>=gap;j-=gap)    //Step back by gap each time; j>=gap keeps j-gap inside the array
    {
        if(target < num[j-gap]) num[j] = num[j-gap],min_index = j-gap; //Same as insertion sort, except the step is gap instead of 1
        else break;               //Everything further left on this line is no larger than target, so stop
    }
    num[min_index] = target;      //Insert in the correct position
}
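
To see that one interleaved pass matches sorting the lines one at a time, the loop can be wrapped in a function and run on the example data. This is a sketch under assumptions: the name `gap_pass` is made up for illustration, and the inner loop compares against the saved target and stops at the first element that is not larger than it.

```c
#include <assert.h>

/* Hypothetical wrapper: one gapped insertion pass over num[0..n-1]. */
void gap_pass(int num[], int n, int gap)
{
    assert(gap > 0);

    for (int i = gap; i < n; i++) {
        int target = num[i];                 /* element being inserted */
        int j = i;
        /* walk back along the same line, shifting larger elements right */
        for (; j >= gap && num[j - gap] > target; j -= gap)
            num[j] = num[j - gap];
        num[j] = target;                     /* insert in the correct position */
    }
}
```

On [9,8,7,6,5,4,3,2,1,5,4,3] with gap 3 this gives [3,2,1,5,4,3,6,5,4,9,8,7], the same array as sorting the blue, orange and black lines separately. A further `gap_pass(a, 12, 1)` is a plain insertion sort and leaves [1,2,3,3,4,4,5,5,6,7,8,9].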

Now that the simplest step is written, we can complete the Shell sort code. Since Shell sort is an advanced version of insertion sort, what is its core? The core is the choice of gap: after the preprocessing passes, the last pass must be a real insertion sort, that is, gap must end at 1. How should gap be chosen? Thanks to the hard work of our predecessors, a large number of experiments suggest that shrinking it with gap = gap/3+1 works well in practice; starting from gap = n, this rule always ends at 1
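
To make the rule concrete, the sketch below lists the gaps that gap = gap/3+1 produces for a given n. The function name `gap_sequence` and its output-buffer interface are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical helper: starting from gap = n, repeatedly applies
 * gap = gap/3 + 1 while gap > 1 and stores each new gap in out[0..cap-1].
 * Returns how many gaps were written. */
int gap_sequence(int n, int out[], int cap)
{
    assert(n >= 0 && cap >= 0);

    int count = 0;
    int gap = n;
    while (gap > 1 && count < cap) {
        gap = gap / 3 + 1;       /* integer division; the result is always >= 1 */
        out[count++] = gap;
    }
    return count;
}
```

For n = 12 the gaps are 5, 2, 1, and for n = 100 they are 34, 12, 5, 2, 1. The +1 matters: with gap/3 alone, a gap of 2 would drop straight to 0 and the final insertion-sort pass with gap 1 would never run.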

Therefore, the complete Shell sort code is as follows:

void ShellSort(int num[],int n)
{
    int  gap = n;
    while(gap>1)
    {
        gap = gap/3 + 1;              //Shrink the gap; the +1 guarantees the last pass uses gap = 1
        for(int i = gap;i<n;i++)
        {
            int min_index = i;        //Position where target will finally be inserted
            int target = num[i];      //Element being inserted
            for(int j = i;j>=gap;j-=gap)    //Step back by gap each time; j>=gap keeps j-gap inside the array
            {
                if(target < num[j-gap]) 
                    num[j] = num[j-gap],min_index = j-gap; //Same as insertion sort, except the step is gap instead of 1
                else break;           //Everything further left on this line is no larger than target, so stop
            }
            num[min_index] = target;      //Insert in the correct position
        }
    }
}

Test:
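
The original test output is not reproduced here. As a stand-in, the sketch below pairs a ShellSort that follows the same steps (written with the shift folded into the loop condition) with a small checker. The helper `is_sorted` is an assumption for illustration.

```c
#include <assert.h>

/* Shell sort with the gap = gap/3 + 1 rule described above. */
void ShellSort(int num[], int n)
{
    int gap = n;
    while (gap > 1) {
        gap = gap / 3 + 1;                   /* last value of gap is always 1 */
        for (int i = gap; i < n; i++) {
            int target = num[i];             /* element being inserted */
            int j = i;
            /* shift larger elements on the same line one gap to the right */
            for (; j >= gap && target < num[j - gap]; j -= gap)
                num[j] = num[j - gap];
            num[j] = target;
        }
    }
}

/* Hypothetical checker: returns 1 if num[0..n-1] is in non-decreasing order. */
int is_sorted(const int num[], int n)
{
    assert(n >= 0);
    for (int i = 1; i < n; i++)
        if (num[i - 1] > num[i]) return 0;
    return 1;
}
```

After `ShellSort(a, 12)` on [9,8,7,6,5,4,3,2,1,5,4,3], the array is [1,2,3,3,4,4,5,5,6,7,8,9] and `is_sorted(a, 12)` returns 1.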

Tags: Algorithm data structure

Posted on Mon, 20 Sep 2021 11:40:38 -0400 by wizzard81