The Beauty of Interval K-th Largest Number Queries

Problem: Given an unordered sequence, find the K-th largest number within a given interval

Method 1: Sort first, then find the K-th largest number directly

This method is the most general and the easiest to think of, and it places no restrictions on the input; however, it is comparatively inefficient: the time complexity is O(n*log n), assuming an efficient sorting algorithm is used

Efficient sorting algorithms such as Heap Sort run in O(n*log n) time; Quick Sort runs in O(n*log n) on average, although its worst case is O(n^2)
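
As a minimal sketch of Method 1, the function below copies the interval, sorts the copy in descending order, and reads off position K. The name kth_largest_by_sorting and the use of std::vector are assumptions made for illustration; they do not come from the original listings.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k-th largest value in the 1-based
// interval [l, r] of seq by sorting a copy of that interval.
int kth_largest_by_sorting(const std::vector<int>& seq, int l, int r, int k) {
    // Preconditions assumed by this sketch: a valid interval and 1 <= k <= r-l+1.
    assert(l >= 1 && l <= r && r <= static_cast<int>(seq.size()));
    assert(k >= 1 && k <= r - l + 1);

    // Copy the interval so the caller's sequence keeps its original order.
    std::vector<int> part(seq.begin() + (l - 1), seq.begin() + r);

    // Sort the copy in descending order.
    std::sort(part.begin(), part.end(), std::greater<int>());

    // After a descending sort, index k-1 holds the k-th largest value.
    return part[k - 1];
}
```

For the sequence 3,2,6,8,9 and the interval [1, 5], calling kth_largest_by_sorting with k = 2 returns 8.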

Method 2: K linear scans

This method requires that K is not too large; the time complexity is O(K*n)

Each scan finds the current maximum and removes it from the sequence, so the K-th largest number is found after K scans. For example: the first scan finds the maximum and removes it from the sequence; the second scan then finds the maximum of what is left, which is the second largest number, and removes it; the third scan does the same, and so on.

(Of course, you can also scan just once while keeping track of the K largest numbers seen so far. This also works well when K is small.)
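
The repeated-scan idea can be sketched as follows. The function name kth_largest_by_scanning is hypothetical, and working on a copy of the interval is a choice made here so that the input stays untouched; the text above does not prescribe either.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: finds the k-th largest value in the 1-based
// interval [l, r] of seq with k scans, removing the maximum each time.
int kth_largest_by_scanning(const std::vector<int>& seq, int l, int r, int k) {
    // Preconditions assumed by this sketch: a valid interval and 1 <= k <= r-l+1.
    assert(l >= 1 && l <= r && r <= static_cast<int>(seq.size()));
    assert(k >= 1 && k <= r - l + 1);

    // Work on a copy so that removals do not modify the caller's data.
    std::vector<int> part(seq.begin() + (l - 1), seq.begin() + r);

    int result = part.front();
    for (int round = 0; round < k; ++round) {
        // One linear scan finds the current maximum.
        std::vector<int>::iterator it = std::max_element(part.begin(), part.end());
        result = *it;
        // Remove it in O(1): overwrite it with the last element, then shrink.
        *it = part.back();
        part.pop_back();
    }
    return result;  // the maximum removed in the k-th scan
}
```

Duplicates are counted separately in this sketch: for the sequence 23,1,4,50,4 both the 3rd and the 4th largest numbers are 4.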

Method 3: Trade space for time

This method requires that all values in the sequence are non-negative and that the maximum value is not too large

Allocate an array whose length is the largest value that may occur plus one, and initialize every entry to 0, meaning that the value has occurred 0 times. Then walk through the sequence and count the occurrences of each number (for example, for the sequence 23,1,4,50,4: arr[1] = 1, arr[4] = 2, arr[23] = 1, arr[50] = 1, and the rest are 0)

To find the K-th largest number, start from the largest index and walk backward through the array, adding up the counts. As soon as the running total is >= K, the current index is the K-th largest number
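
A short sketch of the counting approach described above. The function name kth_largest_by_counting and the max_value parameter (the assumed upper bound on the values) are introduced here for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: finds the k-th largest value in the 1-based
// interval [l, r] of seq, assuming every value lies in [0, max_value].
int kth_largest_by_counting(const std::vector<int>& seq, int l, int r, int k,
                            int max_value) {
    // Preconditions assumed by this sketch: a valid interval and 1 <= k <= r-l+1.
    assert(l >= 1 && l <= r && r <= static_cast<int>(seq.size()));
    assert(k >= 1 && k <= r - l + 1);

    // count[v] is the number of times the value v occurs in the interval.
    std::vector<int> count(max_value + 1, 0);
    for (int i = l - 1; i < r; ++i) {
        assert(seq[i] >= 0 && seq[i] <= max_value);
        ++count[seq[i]];
    }

    // Walk from the largest value downward, accumulating the counts.
    int seen = 0;
    for (int v = max_value; v >= 0; --v) {
        seen += count[v];
        if (seen >= k) {
            return v;  // the running total has reached k at value v
        }
    }
    return -1;  // not reached when the preconditions hold
}
```

With the example sequence 23,1,4,50,4 and max_value = 50, the call for k = 3 walks down from 50, passes 23, and stops at 4 because arr[4] = 2 raises the running total from 2 to 4.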

Method 4: Use the Quick Sort principle (focus on mastering and understanding the idea of the algorithm)

The principle of Quick Sort: while turning an unordered sequence into an ordered one, each partition step picks a pivot and rearranges the sequence so that (sorting in descending order) the larger numbers end up to the left of the pivot and the smaller numbers to its right; the same operation is then applied recursively to the two subsequences until every subsequence is in order.

Adaptation: our goal is to find the K-th largest number. If, after a partition, there are exactly K numbers from the start of the range up to and including the pivot, then the number at the pivot position is the K-th largest number we are looking for. First, the K-1 larger numbers do not need to be ordered among themselves, so there is no need to keep partitioning them. Second, each partition divides the original sequence into a left and a right subsequence; Quick Sort has to keep partitioning both of them recursively, whereas here only one of them needs to be partitioned further, chosen according to the situation (see the algorithm steps for details). The time complexity of this K-th largest algorithm is therefore lower than that of Quick Sort: if each partition discards roughly half of the remaining elements, the total work is about n + n/2 + n/4 + ... < 2n, i.e. O(n) on average, although an unlucky sequence of pivots can still lead to O(n^2) in the worst case

Algorithm steps: after one partition, the pivot divides the original sequence into two parts, S and T [i.e., the original sequence becomes (S T), the pivot is the last element of S, and note that the order is descending]. One of the following three cases applies:

S contains exactly K numbers: the number at the pivot position is the K-th largest, and the algorithm ends
S contains fewer than K numbers: if S contains L numbers, continue by recursively partitioning T to find its (K-L)-th largest number
S contains more than K numbers: continue by recursively partitioning S (without the pivot) to find its K-th largest number


Here's the code:
 

#include <iostream>


using namespace std;

// One Quick Sort partition step in descending order: returns the final index of the pivot a[low]
int sort(int a[],int low,int high)
{
	int i=low,j=high,x=a[low];
	while(i<j)
	{
		while(i<j && a[j]<=x)
		{
			j--;
		}
		a[i]=a[j];
		while(i<j && a[i]>=x)
		{
			i++;
		}
		a[j]=a[i];	
	}

	a[i]=x;
	return i;
}

// Rearranges a[low..high] so that the k-th largest number of that range ends up at index low+k-1
void FindMake(int a[], int low, int high, int k)
{
	if(low<high)
	{
		int pivot=sort(a,low,high);
		int len=pivot-low+1;
		if(len<k)
		{
			FindMake(a,pivot+1,high,k-len);
		}
		else if(len>k)
		{
			FindMake(a,low,pivot-1,k);
		}
	}
}

int main(int argc, char *argv[]) {
	
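	int arr[]={3,2,6,8,9};
	int leng=sizeof(arr)/sizeof(int);
	int l,r,k;// The interval is [l, r], starting from 1
	// Reject unreadable input, intervals outside the array, and k outside 1..(r-l+1)
	if(!(cin>>l>>r>>k) || l<1 || r>leng || l>r || k<1 || k>r-l+1)
	{
		cout<<"invalid input"<<endl;
		return 1;
	}
	
	FindMake(arr,l-1,r-1,k);// Note the conversion from interval positions to array subscripts
	cout<<arr[l-1+k-1]<<endl;// The k-th largest number of the interval now sits at offset k-1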
	
	return 0;
}

tips: Here we compute the K-th largest number. If the K largest numbers are needed instead, simply output a[l-1] through a[l-1 + k-1], K numbers in total; of course, these elements are not necessarily in sorted order among themselves.
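
For comparison, the C++ standard library offers std::nth_element, which performs the same kind of partition-based selection. The sketch below is a swapped-in library alternative, not the hand-written routine above; the function name top_k_in_interval is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k largest values of the 1-based
// interval [l, r] of seq. The last element of the result is the k-th
// largest; the elements before it are in no particular order.
std::vector<int> top_k_in_interval(const std::vector<int>& seq, int l, int r, int k) {
    // Preconditions assumed by this sketch: a valid interval and 1 <= k <= r-l+1.
    assert(l >= 1 && l <= r && r <= static_cast<int>(seq.size()));
    assert(k >= 1 && k <= r - l + 1);

    // Copy the interval so the caller's sequence is left unchanged.
    std::vector<int> part(seq.begin() + (l - 1), seq.begin() + r);

    // With std::greater, position k-1 receives the value a full descending
    // sort would put there, and everything before it is at least as large.
    std::nth_element(part.begin(), part.begin() + (k - 1), part.end(),
                     std::greater<int>());

    part.resize(k);  // keep only the first k positions
    return part;
}
```

Reading the last element of the returned vector gives the K-th largest number, and the whole vector gives the K largest numbers in unspecified order.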


Method 5: Use the Heap Sort principle (master and understand the idea)

Our goal is to find the K-th largest number. Accordingly, the idea of this approach is to maintain a min-heap (a "small top heap") of K elements that holds the K largest numbers seen so far; its top is then the K-th largest number we are looking for. With a heap of size K, each update costs O(log K), so the time complexity is O(n*log K).

Specific algorithm steps:

Put the first K numbers of the given interval into the min-heap and build the initial heap
Then compare each remaining number in the interval with the top of the heap; if it is larger than the top element, replace the top with it and restore the heap structure

Here's the code:
 

#include <iostream>
#include <cstdio>
#include <cstring>
#include <cstdlib>
using namespace std;
 
/*
 Min-heap: the sift-down step used while building and repairing the heap
  Moves the value at node i down until it is <= the values of both of its children,
  assuming the subtrees below node i already satisfy the heap property
  This is why the heap is built bottom-up (from the last non-leaf node to the root node)
*/
void adjust_loop(int *heap, int i, int len) {
    while (true) {
        int l = 2 * i + 1, r = 2 * i + 2;// Left and right children of node i
        int smallest = i;// Index of the smallest value among node i and its children
        if (l < len && heap[l] < heap[smallest]) {// The left child exists and is smaller
            smallest = l;
        }
        if (r < len && heap[r] < heap[smallest]) {// The right child exists and is smaller
            smallest = r;
        }
        if (smallest == i) {// The heap property already holds at node i
            break;
        }
        int tmp = heap[i];
        heap[i] = heap[smallest];
        heap[smallest] = tmp;
        i = smallest;// Continue sifting down from the child that was swapped
    }
}
 
/*
 Min-heap: build the heap
  Sift down every node, from the last non-leaf node up to the root node
*/
void heap_adjust(int *heap, int len) {
    for (int i = (len-1)/2; i >= 0; i--) {
        adjust_loop(heap, i, len);
    }
}
 
/*
 Complete the process of building the min-heap
  That is, the goal of this function is to put the K largest numbers of the interval
  into a min-heap, whose top is then the K-th largest number.
*/
void find_kmax(int arr[], int l, int r, int k,  int *heap) {
    // Build heap first for the first k elements in interval [l, r]
    for (int i = l; i < k+l; i++) {
        heap[i-l] = arr[i];
    }
    heap_adjust(heap, k);
 
    // Process the remaining (r-l+1)-k elements of the interval
    for (int j = k+l; j <= r; j++) {
        // An element larger than the minimum of the heap belongs among the k largest,
        // so it replaces the top of the heap
        if (arr[j] > heap[0]) {
            heap[0] = arr[j];
            adjust_loop(heap, 0, k);// Only the root changed, so sifting it down restores the heap
        }
    }
}
 
int main() {
    int arr[] = {23, 50, 500, 4, 100, 300, 200, 99, 400};
    int len = sizeof(arr) / sizeof(int);// Find the length of the array
 
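    int l, r, k;// The interval is [l, r], starting from 1
    // Reject unreadable input, intervals outside the array, and k outside 1..(r-l+1)
    if (scanf("%d %d %d", &l, &r, &k) != 3 || l < 1 || r > len || l > r || k < 1 || k > r - l + 1) {
        printf("invalid input\n");
        return 1;
    }
 
    int *heap = new int[k];
    find_kmax(arr, l-1, r-1, k, heap);// Note the conversion between interval subscript and array subscript
    printf("%d\n", heap[0]);// The top of the min-heap is the k-th largest element
    delete[] heap;// Memory allocated with new[] must be released with delete[]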
 
    return 0;
}

tips: Similarly, if the K largest numbers are needed, simply output the contents of this min-heap; of course, the elements inside are not necessarily in fully sorted order.
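
The same min-heap idea can also be sketched with the standard library's std::priority_queue instead of a hand-written heap. This is offered as an alternative illustration under that assumption, not as the listing above; the function name kth_largest_with_min_heap is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: finds the k-th largest value in the 1-based
// interval [l, r] of seq by keeping the k largest values in a min-heap.
int kth_largest_with_min_heap(const std::vector<int>& seq, int l, int r, int k) {
    // Preconditions assumed by this sketch: a valid interval and 1 <= k <= r-l+1.
    assert(l >= 1 && l <= r && r <= static_cast<int>(seq.size()));
    assert(k >= 1 && k <= r - l + 1);

    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int> > heap;

    for (int i = l - 1; i < r; ++i) {
        if (static_cast<int>(heap.size()) < k) {
            heap.push(seq[i]);   // the first k values fill the heap
        } else if (seq[i] > heap.top()) {
            heap.pop();          // drop the smallest of the current k values
            heap.push(seq[i]);   // and keep the larger newcomer instead
        }
    }
    return heap.top();  // the smallest of the k largest values
}
```

Each push and pop costs O(log K), which matches the O(n*log K) bound stated for this method.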

 

 

 

 

 

 

 


Tags: less REST

Posted on Fri, 07 Feb 2020 19:58:10 -0500 by alecapone