# Algorithm review

## Basics

1. Complexity calculation
2. Solving recurrences

1. TSP

2. KMP

## Divide and conquer

### Idea

Divide the problem into similar subproblems, solve the subproblems recursively, and then merge their solutions.

### Typical problems

1. Maximum subarray sum

• Subproblems

Find the maximum subarray sum within the left half and within the right half.

Find the maximum sum of a subarray that crosses the midpoint.

• Code implementation

    private static int maxSum(int[] arr, int left, int right) {
        int sum = 0;
        if (left == right) {
            sum = arr[left];
        } else {
            int mid = (left + right) / 2;
            // Left subproblem: maximum sum of consecutive elements in the left half
            int leftSum = maxSum(arr, left, mid);
            // Right subproblem: maximum sum of consecutive elements in the right half
            int rightSum = maxSum(arr, mid + 1, right);

            // A subarray that crosses the split must contain both arr[mid] and arr[mid + 1],
            // because its elements have to be contiguous.
            // So the crossing sum is computed in two separate scans that start at the split
            // and move outward, which guarantees that both halves contribute.
            // Scanning inward from the left or right endpoint could not guarantee that the
            // best subarray found actually spans the two halves.

            // Maximum sum of consecutive elements ending at arr[mid]
            int temp = 0;
            int lSum = Integer.MIN_VALUE; // not 0, so all-negative input agrees with the base case
            for (int i = mid; i >= left; i--) {
                temp += arr[i];
                lSum = Math.max(lSum, temp);
            }

            // Maximum sum of consecutive elements starting at arr[mid + 1]
            temp = 0;
            int rSum = Integer.MIN_VALUE;
            for (int i = mid + 1; i <= right; i++) {
                temp += arr[i];
                rSum = Math.max(rSum, temp);
            }
            sum = Math.max(Math.max(leftSum, rightSum), rSum + lSum);
        }
        return sum;
    }
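The cost of this scheme can be estimated with a recurrence. As a sketch, assume each recursive call handles half of the n elements and the two crossing scans together take linear time:

$$
T(n) = 2\,T(n/2) + O(n), \qquad T(1) = O(1) \quad\Longrightarrow\quad T(n) = O(n \log n)
$$

Expanding the recurrence gives about log2(n) levels, each doing O(n) work in total, which is where the n log n comes from.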

2. Closest pair of points

The points must first be sorted by x-coordinate.

• Subproblems

Find the closest pair within the left half and within the right half.

Let d be the smaller of those two distances. Take the points whose x-coordinate lies within d of the dividing line on either side, sort them by y-coordinate, and compute the shortest distance between each such point and the points on the other side of the dividing line.

• Code

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    // Requires at least two points; points[start..end] must be sorted by x
    public static double doDivide(Point[] points, int start, int end) {
        int size = end - start + 1;
        if (size == 2) {
            return points[start].distanceOf(points[end]);
        }
        int m = start + size / 2;
        int xm = points[m].x;
        // points[m] belongs to both halves, so neither recursive call sees fewer than two points
        double dis = Math.min(doDivide(points, start, m), doDivide(points, m, end));
        // Collect the points within dis of the dividing line x = xm on each side
        List<Point> p1 = new ArrayList<>(size);
        List<Point> p2 = new ArrayList<>(size);
        for (int i = m; i >= start; i--) {
            if (xm - points[i].x < dis) {
                p1.add(points[i]);
            } else {
                break;
            }
        }
        for (int i = m + 1; i <= end; i++) {
            if (points[i].x - xm < dis) {
                p2.add(points[i]);
            } else {
                break;
            }
        }
        if (p1.isEmpty() || p2.isEmpty()) {
            return dis;
        }
        p1.sort(Comparator.comparingInt(p -> p.y));
        p2.sort(Comparator.comparingInt(p -> p.y));
        // Both lists are sorted by y, so the lower end of the candidate window only moves up
        int low = 0;
        for (Point p1p : p1) {
            while (low < p2.size() && p1p.y - p2.get(low).y >= dis) {
                low++;
            }
            // Check every point of p2 whose y lies within dis of p1p
            for (int j = low; j < p2.size() && p2.get(j).y - p1p.y < dis; j++) {
                dis = Math.min(dis, p2.get(j).distanceOf(p1p));
            }
        }
        return dis;
    }

    public static double divided(Point[] points) {
        Arrays.sort(points, Comparator.comparingInt(p -> p.x));
        return doDivide(points, 0, points.length - 1);
    }
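The code above relies on a `Point` type that the notes do not show. A minimal sketch of what it might look like, assuming integer coordinates (as implied by `comparingInt`) and a Euclidean `distanceOf`; the exact fields and constructor are assumptions, not the original author's class:

```java
// Hypothetical Point type assumed by the closest-pair code; not from the original notes.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Euclidean distance between this point and other.
    // The subtraction is done in double to avoid int overflow for extreme coordinates.
    double distanceOf(Point other) {
        return Math.hypot((double) x - other.x, (double) y - other.y);
    }
}
```

With this definition, `new Point(0, 0).distanceOf(new Point(3, 4))` evaluates to 5.0.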


## Decrease and conquer

### Idea

Divide the problem into similar subproblems, recursively solve only some of them, and discard the subproblems that cannot contain the answer.

### Typical problems

1. Max-heap

Delete: swap the root with the last element, remove it from the heap, and sift the new root down.
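The delete step can be sketched as follows, assuming the heap is stored in an `int[]` with the root at index 0 and the children of index i at 2i + 1 and 2i + 2. The class and method names are illustrative only:

```java
final class MaxHeapSketch {
    // Removes and returns the largest element of the max-heap stored in heap[0..size-1].
    // The removed element is left at heap[size - 1]; the heap then occupies heap[0..size-2].
    static int deleteMax(int[] heap, int size) {
        int max = heap[0];
        // Swap the root with the last element
        heap[0] = heap[size - 1];
        heap[size - 1] = max;
        int n = size - 1; // the heap is now one element shorter
        int i = 0;
        // Sift the new root down until both children are no larger than it
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int largest = i;
            if (left < n && heap[left] > heap[largest]) largest = left;
            if (right < n && heap[right] > heap[largest]) largest = right;
            if (largest == i) break;
            int tmp = heap[i];
            heap[i] = heap[largest];
            heap[largest] = tmp;
            i = largest;
        }
        return max;
    }
}
```

Only one root-to-leaf path is followed and the other subtree is ignored at each level, which is what makes this a decrease-and-conquer step rather than a divide-and-conquer one.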

2. Counterfeit coin problem

Divide the coins into 3 piles according to the count modulo 3. If the count is not a multiple of 3, round it up to the next multiple of 3 when sizing the piles.

3. Binary search tree
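As an illustration of the idea, a lookup in a binary search tree compares the key with the current node and then continues in only one subtree, discarding the other. The sketch below assumes integer keys without duplicates; `BstSketch`, `Node`, `insert` and `contains` are illustrative names:

```java
final class BstSketch {
    static final class Node {
        final int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    // Inserts key into the tree rooted at root and returns the (possibly new) root.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) {
            root.left = insert(root.left, key);
        } else if (key > root.key) {
            root.right = insert(root.right, key);
        }
        return root; // duplicates are ignored
    }

    // At every node one subtree is discarded, so only one path is followed.
    static boolean contains(Node root, int key) {
        Node cur = root;
        while (cur != null) {
            if (key == cur.key) return true;
            cur = key < cur.key ? cur.left : cur.right;
        }
        return false;
    }
}
```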

## Dynamic programming

### Relationship to and differences from the greedy method

Both divide the problem into subproblems and derive a global optimum from optimal solutions of the subproblems. In the greedy method the subproblems do not overlap, whereas in dynamic programming the subproblems overlap each other.

### Design ideas and steps

The problem is divided into multiple overlapping subproblems, and each subproblem depends on the solutions of earlier subproblems.

1. Partition the problem into subproblems
2. Determine the dynamic programming function
3. Fill in the table

### Typical problems

1. Multistage graph:

• Proof of the optimality principle: let s, s1, s2, ..., t be a shortest path from s to t. Once the step s -> s1 has been chosen, the problem becomes finding a shortest path from s1 to t, and s1, s2, ..., t must be such a path. Otherwise there would be a shorter path s1, r1, r2, ..., t, and then s, s1, r1, r2, ..., t would be shorter than s, s1, s2, ..., t, contradicting the premise. Therefore the optimality principle is satisfied.

• Dynamic programming function:

$$
d(s,v)=\begin{cases} c_{sv} & \langle s,v\rangle \in E \\ \min\{d(s,u)+c_{uv}\} & \langle u,v\rangle \in E \end{cases}
$$
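The recurrence d(s,v) = min{d(s,u) + c_uv} over edges into v can be filled in as a one-dimensional table. The sketch below assumes the vertices are numbered stage by stage, so every edge goes from a lower to a higher number, with vertex 0 as s and vertex n - 1 as t. The cost-matrix representation and the `INF` sentinel are assumptions of this example:

```java
import java.util.Arrays;

final class MultistageGraphSketch {
    // Marks a missing edge; halved so that INF + cost cannot overflow.
    static final int INF = Integer.MAX_VALUE / 2;

    // cost[u][v] is the edge cost c_uv, or INF if <u,v> is not an edge.
    // Returns the length of a shortest path from vertex 0 to vertex n - 1.
    static int shortestPath(int[][] cost) {
        int n = cost.length;
        int[] d = new int[n];
        Arrays.fill(d, INF);
        d[0] = 0; // with d(s,s) = 0 the first case d(s,v) = c_sv follows from the second
        for (int v = 1; v < n; v++) {
            // All predecessors u < v are already final when v is processed
            for (int u = 0; u < v; u++) {
                if (cost[u][v] < INF && d[u] < INF) {
                    d[v] = Math.min(d[v], d[u] + cost[u][v]);
                }
            }
        }
        return d[n - 1];
    }
}
```

Each table entry is computed once from entries that are already final, which is the "fill in the table" step of the design outline.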

2. TSP: (P108)

• Proof of the optimality principle: let s, s1, s2, ..., s be a shortest tour. Once s -> s1 is fixed, s1, s2, ..., s must be a shortest path from s1 back to s through the remaining n-1 cities. Otherwise there would be a shorter such path s1, r1, r2, ..., s, and then s, s1, r1, r2, ..., s would be a tour shorter than the original one, contradicting the premise. Therefore the optimality principle is satisfied.

• Dynamic programming function (in the first case, i is the starting city):

$$
\begin{cases} d(k,\{\})=c_{ki} \\ d(i,V')=\min\{c_{ik}+d(k,V'-\{k\})\} & k \in V' \end{cases}
$$

• Time complexity: O(n^2 · 2^n), which is exponential in the number of cities
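A common way to implement the recurrence d(i, V') = min{c_ik + d(k, V' - {k})} is to encode the set V' as a bitmask. The following is a sketch under that assumption, with city 0 as the start and a complete cost matrix; the names are illustrative and the table needs O(n · 2^n) memory, so it is only practical for small n:

```java
import java.util.Arrays;

final class TspDpSketch {
    static final int INF = Integer.MAX_VALUE / 2;

    // c[i][j] is the cost of travelling from city i to city j; city 0 is the start.
    // d[mask][i] is the length of a shortest path that starts at city i, visits every
    // city in mask exactly once and ends at city 0. Bit (k - 1) of mask stands for city k.
    static int shortestTour(int[][] c) {
        int n = c.length;
        if (n == 1) return 0;
        int full = 1 << (n - 1); // number of subsets of cities 1..n-1
        int[][] d = new int[full][n];
        for (int[] row : d) Arrays.fill(row, INF);
        for (int k = 1; k < n; k++) d[0][k] = c[k][0]; // d(k, {}) = c_k0
        for (int mask = 1; mask < full; mask++) {
            for (int i = 0; i < n; i++) {
                // i itself must not be a member of V'
                if (i > 0 && (mask & (1 << (i - 1))) != 0) continue;
                for (int k = 1; k < n; k++) {
                    int bit = 1 << (k - 1);
                    if ((mask & bit) == 0) continue; // k must be in V'
                    d[mask][i] = Math.min(d[mask][i], c[i][k] + d[mask ^ bit][k]);
                }
            }
        }
        return d[full - 1][0]; // start at city 0 and visit all other cities
    }
}
```

Because `mask ^ bit` is always numerically smaller than `mask`, iterating the masks in increasing order guarantees that every entry on the right-hand side is already filled in.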

3. Longest common subsequence:

• Proof of the optimality principle: if Z is a longest common subsequence of the two sequences X and Y, then Z contains a longest common subsequence of the prefixes of X and Y, so the optimality principle is satisfied.

• Dynamic programming function:

$$
L(i,j)= \begin{cases} L(i-1,j-1)+1 & x_i = y_j,\ i \geq 1,\ j \geq 1 \\ \max\{L(i-1,j),L(i,j-1)\} & x_i \neq y_j,\ i \geq 1,\ j \geq 1 \end{cases}
$$

• Code implementation

    public class MaxPublicString {
        // record[i][j] remembers which case produced dp[i][j]:
        // 1 = characters match (diagonal), 2 = taken from the left, 3 = taken from above
        static int[][] record;

        public static int findPubStr(String p, String t) {
            // Pad both strings so that characters are indexed from 1
            // and row 0 / column 0 stand for the empty prefix
            p = " " + p;
            t = " " + t;
            int[][] dp = new int[p.length()][t.length()];
            record = new int[p.length()][t.length()];
            for (int i = 1; i < p.length(); i++) {
                for (int j = 1; j < t.length(); j++) {
                    if (p.charAt(i) == t.charAt(j)) {
                        dp[i][j] = dp[i - 1][j - 1] + 1;
                        record[i][j] = 1;
                    } else if (dp[i - 1][j] > dp[i][j - 1]) {
                        dp[i][j] = dp[i - 1][j];
                        record[i][j] = 3;
                    } else {
                        dp[i][j] = dp[i][j - 1];
                        record[i][j] = 2;
                    }
                }
            }
            return dp[p.length() - 1][t.length() - 1];
        }

        // Call findPubStr(p, t) first; p here is the same, unpadded first string
        public static String getPubStr(String p) {
            StringBuilder res = new StringBuilder();
            int i = record.length - 1;
            int j = record[0].length - 1;
            // Walk back from the bottom-right corner; a single loop over both indices
            // stops as soon as either prefix is empty
            while (i > 0 && j > 0) {
                if (record[i][j] == 1) {
                    res.append(p.charAt(i - 1));
                    i--;
                    j--;
                } else if (record[i][j] == 2) {
                    j--;
                } else {
                    i--;
                }
            }
            return res.reverse().toString();
        }
    }

4. 0/1 knapsack: (currency exchange problem)

• Optimality principle: if x1, x2, ..., xn is an optimal solution, then x2, ..., xn is an optimal solution of the corresponding subproblem. Otherwise some y2, ..., yn would be a better solution of that subproblem, and x1, y2, ..., yn would be better than x1, x2, ..., xn, a contradiction. Therefore the optimality principle is satisfied.

• Dynamic programming function:

$$
V(i,j)= \begin{cases} V(i-1,j) & j < w_i \\ \max\{V(i-1,j),V(i-1,j-w_i)+v_i\} & j \geq w_i \end{cases}
$$
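The function V(i, j) translates almost directly into a table fill. A minimal sketch, assuming integer weights and values and that row 0 and column 0 of the table are 0 (no items, or no capacity); the class and parameter names are illustrative:

```java
final class KnapsackDpSketch {
    // w[i - 1] and v[i - 1] are the weight and value of item i; capacity is W.
    static int maxValue(int[] w, int[] v, int capacity) {
        int n = w.length;
        // V[i][j] = best value using the first i items with capacity j;
        // Java zero-initialises the array, which covers row 0 and column 0
        int[][] V = new int[n + 1][capacity + 1];
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= capacity; j++) {
                V[i][j] = V[i - 1][j]; // item i is not taken
                if (j >= w[i - 1]) {
                    // item i fits: compare with taking it
                    V[i][j] = Math.max(V[i][j], V[i - 1][j - w[i - 1]] + v[i - 1]);
                }
            }
        }
        return V[n][capacity];
    }
}
```

For weights {2, 2, 6, 5, 4}, values {6, 3, 5, 4, 6} and capacity 10, the result is 15 (items 1, 2 and 5).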

5. Approximate string matching:

• Optimality principle: if pattern P has an optimal correspondence with text T, then the correspondence between any substring of P and T is also optimal, so the optimality principle is satisfied.

• Dynamic programming function:

$$
D(i,j)= \begin{cases} \min\{D(i-1,j-1),\,D(i-1,j)+1,\,D(i,j-1)+1\} & i>0,\ j>0,\ p_i = t_j \\ \min\{D(i-1,j-1)+1,\,D(i-1,j)+1,\,D(i,j-1)+1\} & i>0,\ j>0,\ p_i \neq t_j \end{cases}
$$
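A sketch of the table fill for approximate matching, using the usual form of the recurrence: the diagonal step costs 0 when p_i = t_j and 1 otherwise, and the two other steps always cost 1. The boundary conditions D(0, j) = 0 and D(i, 0) = i are assumptions of this example (they are not stated in these notes); they let a match start at any position of the text:

```java
final class ApproxMatchSketch {
    // Returns the smallest number of differences between pattern p
    // and any substring of text t.
    static int minDifferences(String p, String t) {
        int m = p.length();
        int n = t.length();
        int[][] D = new int[m + 1][n + 1];
        // Assumed boundary: matching i pattern characters against nothing costs i
        for (int i = 1; i <= m; i++) D[i][0] = i;
        // Assumed boundary: D[0][j] stays 0, so a match may start anywhere in t
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int diag = D[i - 1][j - 1] + (p.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                D[i][j] = Math.min(diag, Math.min(D[i - 1][j], D[i][j - 1]) + 1);
            }
        }
        // The match may end anywhere in t, so take the minimum of the last row
        int best = D[m][0];
        for (int j = 1; j <= n; j++) best = Math.min(best, D[m][j]);
        return best;
    }
}
```

For example, matching "happy" against "Have a hsppy day." gives 1, because "hsppy" differs from the pattern in one character.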

## Greedy

### Typical problems

1. Knapsack problem

Three candidate strategies: largest value first, smallest weight first, largest value per unit weight first
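The third strategy can be sketched as follows, assuming the continuous (fractional) knapsack in which an item may be taken partially, and assuming all weights are positive. The names are illustrative:

```java
import java.util.Arrays;
import java.util.Comparator;

final class GreedyKnapsackSketch {
    // w[i] and v[i] are the weight and value of item i; returns the best total value.
    static double maxValue(double[] w, double[] v, double capacity) {
        Integer[] order = new Integer[w.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Sort item indices by value per unit weight, largest first
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -v[i] / w[i]));
        double total = 0;
        double remaining = capacity;
        for (int i : order) {
            if (remaining <= 0) break;
            double take = Math.min(w[i], remaining); // whole item, or the part that still fits
            total += v[i] * take / w[i];
            remaining -= take;
        }
        return total;
    }
}
```

With weights {20, 30, 10}, values {60, 120, 50} and capacity 50, the sketch takes items 3 and 2 completely and half of item 1, for a total value of 200.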

2. Minimum spanning tree

Prim, Kruskal
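As an illustration of the greedy choice in Kruskal's algorithm, the sketch below repeatedly takes the cheapest edge that does not close a cycle, using a simple union-find structure to detect cycles. The edge representation `{u, v, weight}` and the assumption that the graph is connected belong to this example only:

```java
import java.util.Arrays;
import java.util.Comparator;

final class KruskalSketch {
    // Finds the representative of x's component (with path halving).
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // edges[k] = {u, v, weight} with vertices 0..n-1; returns the total weight of the tree.
    static int mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone(); // keep the caller's edge order intact
        Arrays.sort(sorted, Comparator.comparingInt((int[] e) -> e[2]));
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int total = 0;
        int used = 0;
        for (int[] e : sorted) {
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru != rv) {          // the edge joins two different components
                parent[ru] = rv;
                total += e[2];
                if (++used == n - 1) break; // a spanning tree has n - 1 edges
            }
        }
        return total;
    }
}
```

Prim's algorithm makes a similar greedy choice but grows a single tree from a start vertex instead of merging components.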

3. TSP

At each step, select the shortest available edge.

4. Graph coloring problem

Pick a color and use it for as many vertices as possible until every remaining vertex conflicts with it, then switch to the next color. Repeat until all vertices are colored.

5. Multi-machine scheduling problem

Jobs with the longest processing time are scheduled first: each time, the longest remaining job is assigned to the machine that becomes idle earliest.
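A sketch of this longest-processing-time rule, assuming integer job times and identical machines. A min-heap keeps track of the time at which each machine becomes idle; the names are illustrative:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

final class LptSchedulingSketch {
    // Returns the time at which the last machine finishes (the makespan).
    static int makespan(int[] times, int machines) {
        int[] sorted = times.clone();
        Arrays.sort(sorted); // ascending, so iterate from the end for longest-first
        PriorityQueue<Integer> load = new PriorityQueue<>(); // finishing time per machine
        for (int i = 0; i < machines; i++) load.add(0);
        int makespan = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {
            // The machine that becomes idle earliest takes the longest remaining job
            int finish = load.poll() + sorted[i];
            makespan = Math.max(makespan, finish);
            load.add(finish);
        }
        return makespan;
    }
}
```

For job times {2, 14, 4, 16, 6, 5, 3} on 3 machines this yields a makespan of 17. The greedy result is an approximation and is not guaranteed to be optimal in general.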

## Backtracking method

### Solution space tree

All possible solutions, arranged according to the order in which the objects are visited, constitute a solution space tree.

### Design idea

The solution space tree is searched depth-first, and subtrees that violate the constraints or cannot contain the optimal solution are skipped (pruned).

### Typical problems

1. Hamiltonian cycle

Constraints: there must be an edge between consecutive vertices, and every vertex except the starting vertex may be visited only once.

The depth of the solution space tree is determined by the number of vertices, and the number of branches at a node is determined by the number of vertices not yet visited.

2. Eight queens

Constraint: no two queens may share a column or a diagonal.

The depth of the solution space tree is determined by the number of rows, and the number of branches at each level is determined by the number of columns.
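The search described here can be sketched as a depth-first placement of one queen per row, skipping every column that conflicts with an earlier row. The sketch is generalised to n queens and only counts solutions; the names are illustrative:

```java
final class QueensSketch {
    // Counts the placements of n queens, one per row, with no shared column or diagonal.
    static int count(int n) {
        return place(0, new int[n]);
    }

    // col[r] is the column of the queen already placed in row r, for r < row.
    private static int place(int row, int[] col) {
        int n = col.length;
        if (row == n) return 1; // every row has a queen: one complete solution
        int solutions = 0;
        for (int c = 0; c < n; c++) { // one branch per column
            boolean ok = true;
            for (int r = 0; r < row && ok; r++) {
                // Same column, or same diagonal (equal row and column distance)
                if (col[r] == c || Math.abs(col[r] - c) == row - r) ok = false;
            }
            if (ok) {
                col[row] = c;
                solutions += place(row + 1, col); // descend one level
            }
            // otherwise the whole subtree below this branch is pruned
        }
        return solutions;
    }
}
```

`QueensSketch.count(8)` returns 92, the well-known number of solutions for the eight queens puzzle.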

3. 0/1 knapsack problem

Constraint: the total weight of the selected items must not exceed the capacity of the knapsack.

Each node of the solution space tree has only two branches, and the depth of the tree is determined by the number of items.

4. Batch scheduling

Constraint: each task is executed only once on each machine.

The solution space tree is the permutation tree of the tasks.

## Branch and bound method

1. Multistage graph

• Upper bound: obtained with the greedy method, taking the shortest edge at each step

• Bounding function:

$$
lb=\sum_{j=1}^{i} c[r_j][r_{j+1}] + \min_{\langle r_{i+1},v_p\rangle \in E}\{c[r_{i+1}][v_p]\} + \sum_{j=i+2}^{k} (\text{shortest edge of segment } j)
$$

Explanation: length of the path already determined + shortest edge leaving the last determined vertex + shortest edge in each of the remaining segments

Nodes whose bound is larger than the upper bound can be discarded. The depth of the search tree depends on the number of segments, and the number of branches in each segment depends on the number of vertices in that segment.

2. 0/1 knapsack

• Lower bound: fill the knapsack greedily after sorting the items by value per unit weight

• Bounding function:

$$
ub=v+(W-w)\times(v_{i+1}/w_{i+1})
$$

Explanation: value obtained so far + remaining knapsack capacity × largest value per unit weight among the remaining items

Nodes whose solution exceeds the knapsack capacity are discarded. The root of the solution space tree is the empty-knapsack state, and each level has only two branches, representing taking or not taking the item.
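The bounding function ub = v + (W - w) × (v_{i+1} / w_{i+1}) can be evaluated for a node as in the sketch below. It assumes the items are already sorted by value per unit weight in descending order, so the next undecided item has the largest remaining ratio. The parameter names and the 0-based index `next` are choices of this example:

```java
final class KnapsackBoundSketch {
    // w, v: item weights and values, sorted by v[i] / w[i] in descending order.
    // usedWeight (w in the formula) and gainedValue (v in the formula) describe the node;
    // next is the 0-based index of the first item not yet decided.
    static double upperBound(int[] w, int[] v, int capacity,
                             int usedWeight, int gainedValue, int next) {
        if (next >= w.length) {
            return gainedValue; // no items left, nothing more can be gained
        }
        // Fill all remaining capacity at the best remaining value per unit weight
        return gainedValue + (capacity - usedWeight) * ((double) v[next] / w[next]);
    }
}
```

With weights {4, 7, 5, 3}, values {40, 42, 25, 12} and capacity 10, the root node gets ub = 100. After taking item 1 the bound is 40 + 6 × 6 = 76, and after rejecting item 1 it is 0 + 10 × 6 = 60.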

3. Task assignment problem

• Upper bound: obtained with the greedy method, starting from the first person and giving each person the cheapest task that is still unassigned

• Bounding function:

$$
lb=v+\sum_{k=i+1}^{n} (\text{minimum value of row } k)
$$

Explanation: cost already spent + sum of the minimum costs of the remaining persons (the same task may be counted more than once in this estimate)

Nodes whose bound exceeds the upper bound can be discarded, and a task cannot be assigned more than once. The cost spent at the root of the solution space tree is 0. The depth of the solution space tree depends on the number of persons, and the number of branches at each level depends on the number of tasks. Each node indicates that a task is done by a particular person.
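A sketch of the bounding function lb = v + (sum of the row minima of the persons not yet assigned), assuming the costs are given as a matrix with one row per person and one column per task. The names and the 0-based indexing are choices of this example:

```java
final class AssignmentBoundSketch {
    // cost[k][t] is the cost of person k doing task t (0-based).
    // Persons 0..assigned-1 already have a task, and spent is their total cost (v in the formula).
    static int lowerBound(int[][] cost, int assigned, int spent) {
        int lb = spent;
        for (int k = assigned; k < cost.length; k++) {
            // Cheapest task of person k, ignoring whether it is already taken
            int rowMin = Integer.MAX_VALUE;
            for (int c : cost[k]) rowMin = Math.min(rowMin, c);
            lb += rowMin;
        }
        return lb;
    }
}
```

For the cost matrix {{9, 2, 7, 8}, {6, 4, 3, 7}, {5, 8, 1, 8}, {7, 6, 9, 4}} the root node gets lb = 2 + 3 + 1 + 4 = 10. If the first person takes the first task (cost 9), the bound of that node is 9 + 3 + 1 + 4 = 17.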

4. Batch scheduling

• Upper bound: an upper-bound schedule is obtained by assigning the tasks in descending order of their processing time on the last machine

• Bounding function: (assuming only three machines)
[Formula lost in extraction; surviving fragment: (\sum_{j \neq u \land j \notin M} t_{...]
Explanation: let M be the set of assigned tasks and k its size. In the initial state M is empty, k = 0, sum1 = 0, sum2 = 0.

sum1 = the time the first machine needs to finish processing the task currently being assigned

sum2 = the time the second machine needs to finish processing the task currently being assigned; this must take the running time of the first machine into account

lb = the finishing time on the first two machines, including the waiting time already determined + the total time for the second machine to process the unfinished tasks + the shortest time the third machine needs for a task that is unfinished and is not the one currently being assigned. (At a leaf node no "unfinished and not currently assigned" task exists, but according to the book's description the currently assigned task can be used instead.)

Nodes whose bound is larger than the upper bound are discarded. The depth of the solution space tree depends on the number of tasks, and the number of branches depends on the number of tasks still unassigned. Each node represents the task u about to be assigned, which has not yet been added to M.

Posted on Mon, 06 Dec 2021 19:11:20 -0500 by PAFTprod