# Heaps and Huffman trees (Data Structures, Trees Part 2, Zhejiang University MOOC)

## Heap

Priority queue: a special queue in which elements are taken out in order of their priority (key), not in the order in which they entered the queue

A priority queue can be implemented with an array or a linked list

Array:

• Insert: the element is always appended at the tail, O(1)
• Delete: finding the maximum (or minimum) key is O(n), and removing it requires shifting the elements after it, O(n)

Linked list:

• Insert: the element is always inserted at the head of the linked list, O(1)
• Delete: finding the maximum (or minimum) key is O(n), and deleting the node is O(1)

Ordered array:

• Insert: finding the right position is O(n), or O(log2(n)) with binary search; shifting elements to make room is O(n)
• Delete: remove the last element, O(1)

Store with a binary tree

A binary tree structure is adopted, with the emphasis on making deletion of the maximum value efficient

Heap: implemented with a complete binary tree in which (for a maximum heap) the value of every node is not less than the values of its left and right children

A heap has two characteristics

• Structure: a complete binary tree represented by an array
• Order: the key of any node is the maximum or minimum of all nodes in its subtree
• Maximum heap (MaxHeap, also called a large top heap): the root holds the maximum
• Minimum heap (MinHeap, also called a small top heap): the root holds the minimum
• The keys on the path from the root node to any node form an ordered sequence

Abstract data type of the maximum heap: the data object set is a complete binary tree in which the element value of each node is not less than that of its child nodes
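The order property can be checked directly on the array form. The following is a minimal sketch, assuming 1-based storage with index 0 left unused (the same layout as the sentinel-based heap in this article); the name `is_max_heap` is hypothetical and not part of the course code.

```cpp
#include <vector>

// Hypothetical helper: returns true if a[1..n] satisfies the max-heap order property.
// a[0] is unused (reserved for a sentinel), so the parent of node i is i / 2.
bool is_max_heap(const std::vector<int>& a)
{
    int n = static_cast<int>(a.size()) - 1; // number of stored elements
    for (int i = 2; i <= n; i++)
    {
        if (a[i / 2] < a[i]) // a child larger than its parent breaks the property
            return false;
    }
    return true;
}
```

Checking each node against its parent is enough, because every parent-child pair is visited exactly once.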

#### Main operation

MaxHeap Create(int maxsize): creates an empty maximum heap

Boolean IsFull(MaxHeap H): judges whether the maximum heap H is full

Insert(MaxHeap H, ElementType item): inserts the element item into the maximum heap H

Boolean IsEmpty(MaxHeap H): judges whether the maximum heap H is empty

ElementType DeleteMax(MaxHeap H): removes and returns the largest element in H (the one with the highest priority)

#### Heap creation

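#include<stdlib.h>
#include<iostream>

using namespace std;

typedef int element;//Element type (assumed to be int here)
typedef struct heapstruct* maxheap;
const element maxdata = 1000000;//Sentinel value (assumption: larger than every element that will be stored)

struct heapstruct
{
    element* elements;//An array that stores heap elements
    int size;//Current number of elements in the heap
    int capacity;//Maximum heap capacity
};

maxheap create(int maxsize)
{
    //Create an empty max heap with capacity maxsize
    maxheap h = (maxheap)malloc(sizeof(struct heapstruct));
    //Index 0 holds the sentinel, so maxsize + 1 slots are allocated
    h->elements = (element*)malloc((maxsize + 1) * sizeof(element));
    h->size = 0;
    h->capacity = maxsize;
    h->elements[0] = maxdata;
    //The sentinel is defined as a value greater than all possible elements in the heap, which simplifies the later operations
    return h;
}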



#### Heap insertion

//Maximum heap insertion
//Algorithm: insert the new node into the ordered sequence that runs from its parent node up to the root node

void insert(maxheap h, element item)
{//Insert the element item into the max heap h, where h->elements[0] has been defined as the sentinel
    int i;
    if (isfull(h))
    {
        cout << "Maximum heap full" << endl;
        return;
    }
    i = ++h->size;//i points to the position of the last element in the heap after insertion
    for (; h->elements[i / 2] < item && i > 1; i /= 2)//i is the candidate position for item. Its parent node is i/2
    {
        h->elements[i] = h->elements[i / 2];//Move the smaller parent down one level
    }
    h->elements[i] = item;//Place item in its final position
}


#### Deletion of heap

//Deletion of maximum heap
//Take out the element of the root node (maximum value), remove the last node of the heap, and use its element to refill the heap from the root downward

element deletemax(maxheap h)
{
    //Extract the element with the largest key value from the max heap h and delete a node
    int parent, child;
    element maxitem, temp;
    if (isempty(h))
    {
        cout << "Maximum heap empty" << endl;
        return h->elements[0];//Return the sentinel as an error flag (a convention assumed here)
    }
    maxitem = h->elements[1];
    //Take out the maximum value of the root node and save the tree root element to be deleted
    //Use the last element of the heap to filter downward, starting from the root node
    temp = h->elements[h->size--];
    for (parent = 1; parent * 2 <= h->size; parent = child)
    {//parent * 2 <= h->size checks whether parent has a left son
        child = parent * 2;//child points to the left son, and child+1 points to the right son
        if ((child != h->size) && (h->elements[child] < h->elements[child + 1]))
        {//child != h->size checks whether there is a right son
            //child points to the larger of the left and right child nodes
            child++;
        }
        if (temp >= h->elements[child])
        {
            break;//temp is not smaller than both children, so this is its position
        }
        else
        {//Move the larger child up and continue with temp one level lower
            h->elements[parent] = h->elements[child];
        }
    }
    h->elements[parent] = temp;
    return maxitem;
}


#### Establishment of maximum heap

Create maximum heap: store N existing elements in a one-dimensional array so that they satisfy the requirements of the maximum heap

• Method 1: through the insertion operation, insert the N elements one by one into an initially empty heap; the worst-case time cost is O(NlogN)
• Method 2: build the maximum heap in linear time
• Store the N elements in input order, which already satisfies the structural property of a complete binary tree
• Adjust the position of each node, from the last node that has a child back to the root, so that the order property of the maximum heap holds
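The linear-time construction can be sketched as follows. This is a minimal illustration, not the course's code: it assumes `int` elements stored in `a[1..n]` with `a[0]` unused, and the names `percolate_down` and `build_max_heap` are hypothetical. The downward filtering step is the same one used in the deletion code.

```cpp
#include <vector>

// Sift the element at index p down inside a[1..n] until the subtree rooted
// at p is a max-heap. a[0] is unused.
void percolate_down(std::vector<int>& a, int p, int n)
{
    int temp = a[p];
    int parent, child;
    for (parent = p; parent * 2 <= n; parent = child)
    {
        child = parent * 2;                        // left child
        if (child != n && a[child] < a[child + 1]) // pick the larger child
            child++;
        if (temp >= a[child])
            break;                                 // temp belongs at parent
        a[parent] = a[child];                      // move the larger child up
    }
    a[parent] = temp;
}

// Turn a[1..n] (in arbitrary order) into a max-heap.
void build_max_heap(std::vector<int>& a)
{
    int n = static_cast<int>(a.size()) - 1;
    // Start from the last node that has a child and work back to the root.
    for (int i = n / 2; i >= 1; i--)
        percolate_down(a, i, n);
}
```

Starting at `n / 2` skips the leaves, which are already valid one-node heaps. Most nodes sit near the bottom and move only a few levels, which is why the total work is O(N) rather than O(NlogN).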

## Huffman tree and Huffman coding

Motivating example: convert a 100-point test score into a five-level grade

According to how frequently each node is searched, a more efficient search tree can be constructed

#### Definition of Huffman tree

Weighted path length (WPL): let the binary tree have n leaf nodes, each leaf node has a weight wk, and the length from the root node to each leaf node is lk, then the sum of the weighted path lengths of each leaf node is:
$$WPL=\sum_{k=1}^{n}{w_k l_k}$$
Objective: to minimize the weighted path length

Optimal binary tree or Huffman tree: the binary tree with the smallest WPL

#### Construction of Huffman tree

Each time, merge the two binary trees with the smallest root weights into a new binary tree whose root weight is the sum of the two, and repeat until only one tree remains
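The merging rule maps naturally onto a minimum heap. The sketch below is an illustration under stated assumptions: it uses `std::priority_queue` as the minimum heap instead of the hand-written heap from earlier, tracks only the weights (no tree nodes), and the function name `huffman_wpl` is hypothetical. It relies on the fact that the WPL equals the sum of the weights of all merged (internal) nodes.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: returns the WPL of a Huffman tree built over the weights.
long long huffman_wpl(const std::vector<int>& weights)
{
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<long long, std::vector<long long>, std::greater<long long>>
        pq(weights.begin(), weights.end());
    long long wpl = 0;
    while (pq.size() > 1)
    {
        long long a = pq.top(); pq.pop(); // smallest weight
        long long b = pq.top(); pq.pop(); // second smallest weight
        wpl += a + b; // the merge pushes every leaf below it one level deeper
        pq.push(a + b);
    }
    return wpl;
}
```

For the weights 1, 2, 3, 4, 5 the merges produce 3, 6, 9 and 15, so the function returns 33.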

#### Characteristics of Huffman tree

• There is no node with degree 1

• The Huffman tree with n leaf nodes has 2n-1 nodes

n0: total number of leaf nodes

n1: total number of nodes with only one son

n2: total number of nodes with 2 sons

n2=n0-1 holds for any binary tree; with n1=0 and n0=n, the total number of nodes is n0+n1+n2=2n-1

• Swapping the left and right subtrees of any non-leaf node of a Huffman tree still gives a Huffman tree
• For the same set of weights, Huffman trees with different structures can be generated, but their WPL values are the same

#### Huffman coding

Given a string, how should its characters be encoded so that the encoded string takes the least storage space?

How to avoid ambiguity:

Prefix code: the code of any character is not a prefix of the code of another character

Such a code can be decoded without ambiguity

Coding with a binary tree

1. Left and right branches: 0, 1
2. Characters are only on leaf nodes

Constructing the code tree with a Huffman tree: use the character frequencies as the weights, so that frequent characters get shorter codes
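Putting the pieces together, a code table can be derived by building the tree and labelling left branches 0 and right branches 1. This is a rough sketch under assumptions of my own: the node type `HNode`, the functions `assign_codes` and `huffman_codes`, and the use of `std::priority_queue` and `std::map` are all hypothetical choices, not the course's implementation. Ties between equal weights may be broken differently elsewhere, which changes the codes but not the total length.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HNode // hypothetical node type for this sketch
{
    long long weight;
    char ch;         // meaningful only for leaves
    int left, right; // indices into the node array, -1 for a leaf
};

// Walk the tree, appending '0' for a left branch and '1' for a right branch.
void assign_codes(const std::vector<HNode>& t, int i, const std::string& prefix,
                  std::map<char, std::string>& codes)
{
    if (t[i].left < 0) // leaf: characters live only here
    {
        codes[t[i].ch] = prefix.empty() ? "0" : prefix; // lone character still gets one bit
        return;
    }
    assign_codes(t, t[i].left, prefix + "0", codes);
    assign_codes(t, t[i].right, prefix + "1", codes);
}

std::map<char, std::string> huffman_codes(const std::map<char, int>& freq)
{
    typedef std::pair<long long, int> Item; // (weight, node index)
    std::vector<HNode> t;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    for (const auto& kv : freq)
    {
        t.push_back({kv.second, kv.first, -1, -1});
        pq.push({kv.second, (int)t.size() - 1});
    }
    while (pq.size() > 1) // merge the two lightest trees
    {
        Item a = pq.top(); pq.pop();
        Item b = pq.top(); pq.pop();
        t.push_back({a.first + b.first, 0, a.second, b.second});
        pq.push({a.first + b.first, (int)t.size() - 1});
    }
    std::map<char, std::string> codes;
    if (!t.empty())
        assign_codes(t, (int)t.size() - 1, "", codes); // the last node is the root
    return codes;
}
```

Because every character sits on a leaf, no code can be a prefix of another, which is exactly the prefix-code condition described above.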

## Representation of sets

Set operations: intersection, union, complement, difference, and determining whether an element belongs to a set

Union-find (disjoint set): merging sets, and finding which set an element belongs to

A set can be represented by a tree structure, and each node in the tree represents a set element

Parent representation: children point to their parents

Stored in an array: each entry holds an element's data and the array subscript of its parent (a negative value marks a root)

#### Set operation

Find operation: find the set where an element is located (represented by the root node)

Union of sets:

• Find the root node of the set tree where x1 and x2 elements are located respectively
• If they are different roots, set the parent node pointer of one root node to the array subscript of the other root node

To keep later find operations efficient, the smaller set should be merged into the larger one, which keeps the trees shallow
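One common way to do this is to let each root store the negated size of its set, so that no extra field is needed. The sketch below assumes exactly that encoding, and for brevity identifies elements by their array index instead of searching for a data value; the names `find_root` and `union_by_size` are hypothetical.

```cpp
#include <vector>

// parent[i] >= 0: index of i's parent.
// parent[i] <  0: i is a root, and -parent[i] is the number of elements in its set.
int find_root(const std::vector<int>& parent, int x)
{
    while (parent[x] >= 0)
        x = parent[x];
    return x;
}

void union_by_size(std::vector<int>& parent, int x1, int x2)
{
    int r1 = find_root(parent, x1);
    int r2 = find_root(parent, x2);
    if (r1 == r2)
        return; // already in the same set
    if (parent[r1] > parent[r2]) // r1's set is smaller (its value is less negative)
    {
        int t = r1; r1 = r2; r2 = t; // make r1 the root of the larger set
    }
    parent[r1] += parent[r2]; // the sizes add up
    parent[r2] = r1;          // hang the smaller tree under the larger one
}
```

A set of singletons is created by filling the array with -1, for example `std::vector<int> parent(n, -1)`.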

#include<stdio.h>
#include<iostream>

using namespace std;

typedef int element;
typedef struct {
    element data;
    int parent;//Subscript of the parent node; negative for a root
}settype;
const int maxsize=1000;
int find(settype s[], element x)
{//Find the set to which the element with value x belongs in the array s
    //maxsize is a global variable and is the maximum length of array s
    int i;
    for (i = 0; i < maxsize && s[i].data != x; i++);
    if (i >= maxsize)
        return -1;//x was not found, return -1
    for (; s[i].parent >= 0; i = s[i].parent);//When parent is negative (for example -1), the root node has been found
    return i;//Found the set to which x belongs; return the subscript of the tree root node in the array s
}

void union_set(settype s[], element x1, element x2)
{
    int root1, root2;
    root1 = find(s, x1);
    root2 = find(s, x2);
    if (root1 < 0 || root2 < 0)
        return;//One of the elements is not in any set
    if (root1 != root2)
    {
        s[root2].parent = root1;//Attach the second tree under the root of the first
    }
}


#### Path in heap

Insert a series of given numbers into an initially empty minimum heap H[], and then, for any given subscript i, print the path from H[i] to the root node

The heap is usually represented as an array

#include<iostream>
#include<stdio.h>

#define MAXN 1001
#define MINH -10001

using namespace std;

int h[MAXN];
int h_size;

//Heap initialization
void create()
{
    h_size = 0;
    h[0] = MINH;
    //Set up the sentinel: smaller than every value that will be inserted
}

void insert(int x)
{
    int i;
    //Because position 0 holds a minimum value, the loop always stops by the time i reaches 1
    for (i = ++h_size; h[i / 2] > x; i /= 2)
        h[i] = h[i / 2];
    h[i] = x;
}

int main()
{
    int n, m, i, j, x;
    create();
    //Read the number of insertions n and the number of queries m (assumed to come first in the input)
    cin >> n >> m;
    for (i = 0; i < n; i++)
    {
        cin >> x;
        insert(x);
    }
    for (i = 0; i < m; i++)
    {
        cin >> j;
        cout << h[j];
        while (j > 1)
        {
            j /= 2;//Move to the parent node
            cout << " " << h[j];
        }
        cout << endl;
    }
    return 0;
}


Posted on Sun, 10 Oct 2021 08:36:00 -0400 by Canman2005