Summary of Ultra-Detailed Production-Level Load Balancing Algorithms

Introduction to load balancing

Load balancing refers to a cluster made up of multiple servers arranged symmetrically: every server has equal status and can serve requests on its own, without help from the others.
With a load-sharing technique, requests arriving from outside are distributed evenly across the servers in this symmetric structure, and the server that receives a request responds to the client independently.
By spreading client requests evenly over a server array, load balancing provides fast access to important data and copes with large numbers of concurrent requests. This clustering technique can approach mainframe-level performance at a comparatively low cost.

Load Balancing

Load balancing is divided into software load balancing and hardware load balancing.
Readers who have not used either should not get too caught up in the differences between them; just keep reading.

Software Load Balancing

Common load balancing software includes Nginx, LVS, and HAProxy.
Comparing the features of these tools is not the focus of this article, so it is not covered further here.

Hardware Load Balancing

Common load balancing hardware includes Array and F5.

Load Balancing Algorithms

Common load balancing algorithms include the random algorithm, weighted round robin, consistent hashing, and the least-active algorithm.
These algorithms may look simple, but using them in production is more involved than you might expect.

Algorithmic Prerequisites

First define a list of servers; each load balancing algorithm picks one server from this list as its result.

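import java.util.Arrays;
import java.util.List;

public class ServerIps {
    //public so that the algorithm classes below can read ServerIps.LIST
    public static final List<String> LIST = Arrays.asList(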
            "192.168.0.1",
            "192.168.0.2",
            "192.168.0.3",
            "192.168.0.4",
            "192.168.0.5",
            "192.168.0.6",
            "192.168.0.7",
            "192.168.0.8",
            "192.168.0.9",
            "192.168.0.10"
    );
}

RandomLoadBalance, a random algorithm

Let's start with the simplest implementation.

public class Random {
    public static String getServer() {
        //Generate a random number to use as an index into the list
        java.util.Random random = new java.util.Random();
        int randomPos = random.nextInt(ServerIps.LIST.size());
        return ServerIps.LIST.get(randomPos);
    }
    public static void main(String[] args) {
        //10 consecutive calls
        for (int i=0; i<10; i++) {
            System.out.println(getServer());
        }
    }
}
Run result:
192.168.0.3
192.168.0.4
192.168.0.7
192.168.0.1
192.168.0.2
192.168.0.7
192.168.0.3
192.168.0.9
192.168.0.1
192.168.0.1

With only a few calls, the random numbers may cluster, so most requests land on the same server; the distribution only evens out after many requests. That is acceptable, because a load balancing mechanism exists precisely to cope with large numbers of requests, which is why random algorithms are widely used.
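To see this effect, here is a minimal sketch that counts how often each server is picked for a small and a large number of calls. The class name RandomSpread, the method countPicks, and the fixed seed are assumptions made for illustration; the server list simply mirrors ServerIps.LIST.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class RandomSpread {
    // Hypothetical stand-in for ServerIps.LIST
    static final List<String> SERVERS = Arrays.asList(
            "192.168.0.1", "192.168.0.2", "192.168.0.3", "192.168.0.4", "192.168.0.5",
            "192.168.0.6", "192.168.0.7", "192.168.0.8", "192.168.0.9", "192.168.0.10");

    // Counts how often each list index is picked over the given number of calls
    static int[] countPicks(int calls, long seed) {
        Random random = new Random(seed); // seeded only to make runs repeatable
        int[] counts = new int[SERVERS.size()];
        for (int i = 0; i < calls; i++) {
            counts[random.nextInt(SERVERS.size())]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        for (int calls : new int[] {10, 100_000}) {
            int[] counts = countPicks(calls, 42L);
            System.out.println(calls + " calls: " + Arrays.toString(counts));
        }
    }
}
```

With 10 calls some servers are typically not picked at all, while with 100,000 calls each count should sit close to 10,000.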

However, the random algorithm above only suits machines with similar performance. In production, some machines perform better than others and can handle more requests, so we can assign a weight to each server.
Add a map of server weights to the ServerIps class; the weights sum to 50:

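    //LinkedHashMap keeps insertion order, so the iteration order used by the algorithms below is predictable
    public static final Map<String, Integer> WEIGHT_LIST = new LinkedHashMap<String, Integer>();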
    static {
        //The sum of weights is 50
        WEIGHT_LIST.put("192.168.0.1", 1);
        WEIGHT_LIST.put("192.168.0.2", 8);
        WEIGHT_LIST.put("192.168.0.3", 3);
        WEIGHT_LIST.put("192.168.0.4", 6);
        WEIGHT_LIST.put("192.168.0.5", 5);
        WEIGHT_LIST.put("192.168.0.6", 5);
        WEIGHT_LIST.put("192.168.0.7", 4);
        WEIGHT_LIST.put("192.168.0.8", 7);
        WEIGHT_LIST.put("192.168.0.9", 2);
        WEIGHT_LIST.put("192.168.0.10", 9);
    }

The random algorithm should now become a weighted random algorithm: over many calls, the share of requests each server receives should approximate its share of the total weight.

Weighted Random Algorithm

The simplest implementation is to copy each server into a list as many times as its weight, which keeps the code easy to understand:

public class WeightRandom {
    public static String getServer() {
        //Build a list in which each ip appears as many times as its weight
        List<String> ips = new ArrayList<String>();
        for (String ip : ServerIps.WEIGHT_LIST.keySet()) {
            Integer weight = ServerIps.WEIGHT_LIST.get(ip);
            //Copy by weight
            for (int i=0; i<weight; i++) {
                ips.add(ip);
            }
        }
        java.util.Random random = new java.util.Random();
        int randomPos = random.nextInt(ips.size());
        return ips.get(randomPos);
    }
    public static void main(String[] args) {
        //10 consecutive calls
        for (int i=0; i<10; i++) {
            System.out.println(getServer());
        }
    }
}
Run result:
192.168.0.8
192.168.0.2
192.168.0.7
192.168.0.10
192.168.0.8
192.168.0.8
192.168.0.4
192.168.0.7
192.168.0.6
192.168.0.8

This implementation wastes memory when the sum of the weights is large, because each ip address is copied once per unit of weight: the larger the sum, the more memory the ips list needs. Here is another approach.

Suppose we have a set of servers, servers = [A, B, C], with weights = [5, 3, 2], so the weights sum to 10. Lay these weights out along a one-dimensional axis: the interval [0, 5) belongs to server A, [5, 8) to server B, and [8, 10) to server C. Next, generate a random number in the range [0, 10) and work out which interval it falls into. For example, the number 3 falls into server A's interval, so server A is returned.

The larger a machine's weight, the wider its interval on the axis, and the more likely a random number is to land in it. As long as the random number generator is well distributed, the proportion of times each server is selected approaches its weight ratio after many selections. For example, after 10,000 selections, server A is chosen about 5,000 times, server B about 3,000 times, and server C about 2,000 times.
Assume the random number is offset = 7:

  1. offset < 5 is false, so it is not in the [0, 5) interval; offset = offset - 5 (offset = 2)

  2. offset < 3 is true, so it is in the [5, 8) interval and server B should be chosen

The implementation is as follows:

public class WeightRandomV2 {
    public static String getServer() {
        int totalWeight = 0;
        boolean sameWeight = true; //If all weights are equal, then a random ip would be fine
        Object[] weights = ServerIps.WEIGHT_LIST.values().toArray();
        for (int i = 0; i < weights.length; i++) {
            Integer weight = (Integer) weights[i];
            totalWeight += weight;
            if (sameWeight && i > 0 && !weight.equals(weights[i - 1])) {
                sameWeight = false;
            }
        }
        java.util.Random random = new java.util.Random();
        int randomPos = random.nextInt(totalWeight);
        if (!sameWeight) {
            for (String ip : ServerIps.WEIGHT_LIST.keySet()) {
                Integer value = ServerIps.WEIGHT_LIST.get(ip);
                if (randomPos < value) {
                    return ip;
                }
                randomPos = randomPos - value;
            }
        }
        return (String) ServerIps.WEIGHT_LIST.keySet().toArray()[new java.util.Random().nextInt(ServerIps.WEIGHT_LIST.size())];
    }
    public static void main(String[] args) {
        //10 consecutive calls
        for (int i = 0; i < 10; i++) {
            System.out.println(getServer());
        } 
    }
}

This is another weighted random algorithm.
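As a variation on the interval idea, the interval upper bounds can be precomputed once and searched with a binary search instead of a linear walk. The following is only a sketch under stated assumptions: the class PrefixSumPicker and its methods are hypothetical names, the servers and weights are the [A, B, C] / [5, 3, 2] example from the text, and all weights are assumed to be positive.

```java
import java.util.Arrays;
import java.util.Random;

final class PrefixSumPicker {
    private final String[] servers;
    private final int[] upperBounds; // upperBounds[i] = weights[0] + ... + weights[i]
    private final Random random = new Random();

    // Assumes every weight is positive, so the bounds are strictly increasing
    PrefixSumPicker(String[] servers, int[] weights) {
        this.servers = servers.clone();
        this.upperBounds = new int[weights.length];
        int sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i];
            upperBounds[i] = sum;
        }
    }

    // Maps an offset in [0, totalWeight) to the server that owns that interval
    String pick(int offset) {
        // Upper bounds are exclusive, so look for the first bound >= offset + 1
        int idx = Arrays.binarySearch(upperBounds, offset + 1);
        if (idx < 0) {
            idx = -idx - 1; // not found: convert to the insertion point
        }
        return servers[idx];
    }

    String pickRandom() {
        return pick(random.nextInt(upperBounds[upperBounds.length - 1]));
    }

    public static void main(String[] args) {
        PrefixSumPicker picker = new PrefixSumPicker(new String[] {"A", "B", "C"}, new int[] {5, 3, 2});
        System.out.println(picker.pick(3)); // A, because 3 lies in [0, 5)
        System.out.println(picker.pick(7)); // B, because 7 lies in [5, 8)
        System.out.println(picker.pickRandom());
    }
}
```

Whether the binary search pays off depends on the number of servers; for a handful of servers the linear walk above is likely just as fast.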

Round-robin algorithm - RoundRobinLoadBalance

A plain round-robin implementation is straightforward:

public class RoundRobin {
    //Dedicated lock object: locking on the Integer pos itself is unsafe,
    //because pos++ replaces it with a different Integer instance
    private static final Object LOCK = new Object();
    //The location of the current loop
    private static int pos = 0;
    public static String getServer() {
        String ip = null;
        //pos synchronization
        synchronized (LOCK) {
            if (pos >= ServerIps.LIST.size()) {
                pos = 0;
            }
            ip = ServerIps.LIST.get(pos);
            pos++;
        }
        return ip;
    }
    public static void main(String[] args) {
        //11 consecutive calls
        for (int i = 0; i < 11; i++) {
            System.out.println(getServer());
        }
    }
}
Run result:
192.168.0.1
192.168.0.2
192.168.0.3
192.168.0.4
192.168.0.5
192.168.0.6
192.168.0.7
192.168.0.8
192.168.0.9
192.168.0.10
192.168.0.1
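An alternative to a synchronized block is an atomic counter. The sketch below is one possible way to write it, not the article's implementation; the class name AtomicRoundRobin and the three-server list are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicRoundRobin {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger(0);

    AtomicRoundRobin(List<String> servers) {
        this.servers = servers;
    }

    String next() {
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        AtomicRoundRobin rr = new AtomicRoundRobin(
                Arrays.asList("192.168.0.1", "192.168.0.2", "192.168.0.3"));
        // Four calls over three servers: the fourth call wraps back to the first server
        for (int i = 0; i < 4; i++) {
            System.out.println(rr.next());
        }
    }
}
```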

This algorithm is simple and fair: every server takes its turn. But just as with the random algorithm, some machines perform better and can do more work, so we add the dimension of weight. One implementation is replication, which is not demonstrated here; its disadvantage is the same as for the random algorithm, namely that it consumes memory. Other implementations therefore follow naturally, and one such algorithm is introduced below.

This algorithm needs one more concept: the call number. The first call is number 1, the second is number 2, the 100th is number 100, and so on. The number increases with every call, so we can derive the server from it.

Suppose we have three servers, servers = [A, B, C], with weights = [2, 5, 1], so the total weight is 8. We can think of this as 8 "servers", two of which are A, five are B, and one is C. Ignoring concurrency, incoming calls are served in sequence, so for 10 calls the server order is A A B B B B B C A A. The call number keeps growing while the set of servers is fixed, so the call number has to be reduced by taking its remainder modulo the total weight, for example:

  • Call 1, 1%8=1;

  • Call 2, 2%8=2;

  • Call 3, 3%8=3;

  • Call 8, 8%8=0;

  • Call 9, 9%8=1;

  • Call 100, 100%8=4;

The call number is thus reduced to one of 8 numbers between 0 and 7. How do we find the corresponding server from these 8 numbers? As with the random algorithm, we can treat the weights as a coordinate axis, "0----2----7----8":

  • Call 1, 1%8=1; offset = 1, offset <= 2 is true, take A;

  • Call 2, 2%8=2; offset = 2, offset <= 2 is true, take A;

  • Call 3, 3%8=3; offset = 3, offset <= 2 is false, offset = offset - 2, offset = 1, offset <= 5 is true, take B;

  • Call 8, 8%8=0; offset = 0, special case, offset = 8, offset <= 2 is false, offset = offset - 2, offset = 6, offset <= 5 is false, offset = offset - 5, offset = 1, offset <= 1 is true, take C;

  • Call 9, 9%8=1; //...

  • Call 100, 100%8=4; //...

Implementation:
A helper that simulates obtaining the call number:

import java.util.concurrent.atomic.AtomicInteger;

public class Sequence {
    //AtomicInteger keeps the counter correct when several threads call it at once
    private static final AtomicInteger NUM = new AtomicInteger(0);
    //Returns 1 for the first call, 2 for the second, and so on
    public static Integer getAndIncrement() {
        return NUM.incrementAndGet();
    }
}
public class WeightRoundRobin {
    //Dedicated lock object: locking on a reassigned Integer does not protect pos
    private static final Object LOCK = new Object();
    private static int pos = 0;
    public static String getServer() {
        int totalWeight = 0;
        boolean sameWeight = true; //If all weights are equal, plain round robin is enough
        Object[] weights = ServerIps.WEIGHT_LIST.values().toArray();
        for (int i = 0; i < weights.length; i++) {
            Integer weight = (Integer) weights[i];
            totalWeight += weight;
            if (sameWeight && i > 0 && !weight.equals(weights[i - 1])) {
                sameWeight = false;
            }
        }
        Integer sequenceNum = Sequence.getAndIncrement();
        Integer offset = sequenceNum % totalWeight;
        //Special case: a multiple of totalWeight maps to the end of the axis
        offset = offset == 0 ? totalWeight : offset;
        if (!sameWeight) {
            for (String ip : ServerIps.WEIGHT_LIST.keySet()) {
                Integer weight = ServerIps.WEIGHT_LIST.get(ip);
                if (offset <= weight) {
                    return ip;
                }
                offset = offset - weight;
            }
        }
        //All weights are equal: fall back to plain round robin over the list
        String ip = null;
        synchronized (LOCK) {
            if (pos >= ServerIps.LIST.size()) {
                pos = 0;
            }
            ip = ServerIps.LIST.get(pos);
            pos++;
        }
        return ip;
    }
    public static void main(String[] args) {
        //11 consecutive calls
        for (int i = 0; i < 11; i++) {
            System.out.println(getServer());
        }
    }
}
Run result:
192.168.0.1
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.2
192.168.0.3
192.168.0.3
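The smaller worked example with servers [A, B, C] and weights [2, 5, 1] can be checked with a short sketch. The class CallNumberMapper and the method serverFor are hypothetical names used only for this illustration; the logic follows the offset walk described earlier.

```java
final class CallNumberMapper {
    // Maps a 1-based call number to a server by walking the weight axis
    static String serverFor(int callNumber, String[] servers, int[] weights) {
        int totalWeight = 0;
        for (int w : weights) {
            totalWeight += w;
        }
        int offset = callNumber % totalWeight;
        if (offset == 0) {
            offset = totalWeight; // special case: a multiple of totalWeight maps to the end of the axis
        }
        for (int i = 0; i < servers.length; i++) {
            if (offset <= weights[i]) {
                return servers[i];
            }
            offset -= weights[i];
        }
        throw new IllegalStateException("not reachable when the total weight is positive");
    }

    public static void main(String[] args) {
        String[] servers = {"A", "B", "C"};
        int[] weights = {2, 5, 1};
        StringBuilder order = new StringBuilder();
        for (int call = 1; call <= 10; call++) {
            order.append(serverFor(call, servers, weights));
        }
        System.out.println(order); // prints AABBBBBCAA
    }
}
```

Call 8 maps to C and call 100 (100%8=4) maps to B, matching the step-by-step reasoning above.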

But this algorithm has a drawback: a server with a very large weight has to process many requests in a row. What we actually want is only the right proportion. For example, out of 100 requests the server with weight 8 should receive 100*8/50 = 16, but those 16 requests need not be consecutive. Or suppose we have three servers, servers = [A, B, C], with weights = [5, 1, 1] and a total weight of 7. This algorithm produces AAAAABC, whereas a result such as AABACAA, which spreads B and C among the five A's, is more balanced.

We can switch to smooth weighted round robin here.

Smooth Weighted Round Robin

Idea: each server has two weights, weight and currentWeight. weight is fixed, while currentWeight is adjusted dynamically and starts at 0. When a new request comes in, iterate over the server list and add each server's weight to its currentWeight. Once the traversal is complete, pick the server with the largest currentWeight, subtract the sum of all weights from its currentWeight, and return that server.

With the weights [5, 1, 1] from above, the smoothed server sequence is [A, A, B, A, C, A, A], which is better distributed than the previous sequence [A, A, A, A, A, B, C]. Initially currentWeight = [0, 0, 0], and after the seventh request is processed currentWeight is [0, 0, 0] again.

Implementation:

//Add a Weight class to hold ip, weight (the fixed original weight), and currentWeight (the dynamically changing weight)
public class Weight {
    private String ip;
    private Integer weight;
    private Integer currentWeight;
    public Weight(String ip, Integer weight, Integer currentWeight) {
        this.ip = ip;
        this.weight = weight;
        this.currentWeight = currentWeight;
    }
    public String getIp() {
        return ip;
    }
    public void setIp(String ip) {
        this.ip = ip;
    }
    public Integer getWeight() {
        return weight;
    }
    public void setWeight(Integer weight) {
        this.weight = weight;
    }
    public Integer getCurrentWeight() {
        return currentWeight;
    }
    public void setCurrentWeight(Integer currentWeight) {
        this.currentWeight = currentWeight;
    }
}
public class WeightRoundRobinV2 {
    private static Map<String, Weight> weightMap = new HashMap<String, Weight>();
    public static String getServer() {
        // java8
        int totalWeight = ServerIps.WEIGHT_LIST.values().stream().reduce(0, (w1, w2) -> w1+w2);
        //Initialize weightMap, initially assigning currentWeight to weight
        if (weightMap.isEmpty()) {
            ServerIps.WEIGHT_LIST.forEach((key, value) -> {
                weightMap.put(key, new Weight(key, value, value));
            });
        }
        //Find the maximum currentWeight
        Weight maxCurrentWeight = null;
        for (Weight weight : weightMap.values()) {
            if (maxCurrentWeight == null || weight.getCurrentWeight() > maxCurrentWeight.getCurrentWeight()) {
                maxCurrentWeight = weight;
            }
        }
        //Subtract the total weight from the currentWeight of the selected server
        maxCurrentWeight.setCurrentWeight(maxCurrentWeight.getCurrentWeight() - totalWeight);
        //Add each server's original weight to its currentWeight, ready for the next call
        for (Weight weight : weightMap.values()) {
           weight.setCurrentWeight(weight.getCurrentWeight() + weight.getWeight());
        }
        //Returns the ip corresponding to maxCurrentWeight
        return maxCurrentWeight.getIp();
    }
    public static void main(String[] args) {
        //10 consecutive calls
        for (int i = 0; i < 10; i++) {
            System.out.println(getServer());
        }
    }
}

To keep the example simple, the data in ServerIps is reduced to:

        WEIGHT_LIST.put("A", 5);
        WEIGHT_LIST.put("B", 1);
        WEIGHT_LIST.put("C", 1);
Run result:
A
A
B
A
C
A
A
A
A
B
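To follow how currentWeight evolves, here is a small trace of the idea as described, with currentWeight starting at 0 and the weights added before the selection. It is a sketch only; the class SmoothTrace, the method run, and the print flag are assumptions made for illustration, and ties go to the server that comes first in the array.

```java
import java.util.Arrays;

final class SmoothTrace {
    // Returns the selection order for the given number of requests
    static String run(String[] servers, int[] weights, int requests, boolean print) {
        int total = Arrays.stream(weights).sum();
        int[] current = new int[weights.length]; // currentWeight starts at 0 for every server
        StringBuilder order = new StringBuilder();
        for (int r = 1; r <= requests; r++) {
            int best = 0;
            for (int i = 0; i < weights.length; i++) {
                current[i] += weights[i];         // add each server's fixed weight
                if (current[i] > current[best]) { // remember the largest currentWeight
                    best = i;
                }
            }
            String afterAdd = Arrays.toString(current);
            current[best] -= total;               // subtract the total weight from the winner
            order.append(servers[best]);
            if (print) {
                System.out.println(r + ": " + afterAdd + " -> " + servers[best]
                        + " -> " + Arrays.toString(current));
            }
        }
        return order.toString();
    }

    public static void main(String[] args) {
        // Prints one trace line per request, then the order AABACAA
        System.out.println(run(new String[] {"A", "B", "C"}, new int[] {5, 1, 1}, 7, true));
    }
}
```

After the seventh request the trace shows [0, 0, 0] again, so the pattern repeats every 7 calls.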

That covers the round-robin algorithms. A simple loop is easy to write, but real applications need more thought.

ConsistentHash LoadBalance

When a server cluster receives a request, it can compute a hash from information in the request, such as the client's ip address or the request path and parameters. The useful property is that the same ip address, or the same request path and parameters, always yields the same hash value. If we add a rule that maps each hash value to a server ip address, the same request (same ip address, or same request path and parameters) always lands on the same server.
Because client requests are effectively unbounded (different client addresses, different request parameters, and so on), there are far too many possible hash values to map each one to a server ip individually, so a hash ring is used instead. With four servers ip1, ip2, ip3, and ip4 placed around the ring in that order:

* If the hash value falls between ip1 and ip2, ip2 is selected as the result;
* If the hash value falls between ip2 and ip3, ip3 is selected as the result;
* If the hash value falls between ip3 and ip4, ip4 is selected as the result;
* If the hash value falls between ip4 and ip1, ip1 is selected as the result;

This is a fairly even layout. Now suppose the ip4 server does not exist.

The range between ip3 and ip1 is then relatively large, so more requests fall on ip1, which is not "fair". To solve this problem, virtual nodes are added, for example ip2-1 and ip3-1.

Here ip2-1 and ip3-1 are virtual nodes: they do not handle requests themselves but stand in for the real servers ip2 and ip3.

This is just one way of dealing with the imbalance. Even if the hash ring is already balanced, you can add more virtual nodes to make the ring smoother. Such a ring is also "fair": only ip1, ip2, ip3, and ip4 are real server ips, and all other points on the ring are virtual nodes.

So how do we do that?

We know how many server ip addresses there are, and we control how many virtual nodes to create. The more virtual nodes there are, the more evenly traffic is balanced. The hash algorithm also matters: the more uniformly it spreads its values, the more evenly traffic is balanced.
Implementation:

public class ConsistentHash {
    private static SortedMap<Integer, String> virtualNodes = new TreeMap<>();
    private static final int VIRTUAL_NODES = 160;
    static {
        //Add virtual nodes to each real node and the virtual nodes will be hashed according to the hash algorithm
        for (String ip : ServerIps.LIST) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                int hash = getHash(ip+"VN"+i);
                virtualNodes.put(hash, ip);
            }
        }
    }
    private static String getServer(String client) {
        int hash = getHash(client);
        //Get the part of the ring whose keys are greater than or equal to the hash value
        SortedMap<Integer, String> subMap = virtualNodes.tailMap(hash);
        //firstKey() throws on an empty map instead of returning null, so check isEmpty();
        //if no node lies at or after the hash value, wrap around to the first node of the ring
        Integer nodeIndex = subMap.isEmpty() ? virtualNodes.firstKey() : subMap.firstKey();
        //Look up the real server ip in the full ring, because subMap may not contain the key
        return virtualNodes.get(nodeIndex);
    }
    private static int getHash(String str) {
        final int p = 16777619;
        int hash = (int) 2166136261L;
        for (int i = 0; i < str.length(); i++)
            hash = (hash ^ str.charAt(i)) * p;
        hash += hash << 13;
        hash ^= hash >> 7;
        hash += hash << 3;
        hash ^= hash >> 17;
        hash += hash << 5;
        //If the calculated value is negative, take its absolute value
        if (hash < 0)
            hash = Math.abs(hash);
        return hash;
    }
    public static void main(String[] args) {
        //10 consecutive calls from 10 different clients
        for (int i = 0; i < 10; i++) {
            System.out.println(getServer("client" + i));
        }
    }
}
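One property worth checking is that removing a server only remaps the clients that were on that server. The sketch below illustrates this with a hypothetical HashRing class; the server names ip1 to ip4, the FNV-1a style hash, and the use of TreeMap.ceilingEntry are choices made for this illustration rather than the article's code, and the ring is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(List<String> servers, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String server : servers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "VN" + i), server);
            }
        }
    }

    void remove(String server) {
        // Drop every virtual node that points at this server
        ring.values().removeIf(server::equals);
    }

    // Assumes the ring still contains at least one server
    String serverFor(String key) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(key));
        // Wrap around to the first node when the hash lies past the last node
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // FNV-1a style hash over the string's chars, masked to a non-negative int
    static int hash(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 16777619;
        }
        return h & Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing(Arrays.asList("ip1", "ip2", "ip3", "ip4"), 160);
        Map<String, String> before = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            before.put("client" + i, ring.serverFor("client" + i));
        }
        ring.remove("ip4");
        int moved = 0;
        for (Map.Entry<String, String> e : before.entrySet()) {
            if (!e.getValue().equals(ring.serverFor(e.getKey()))) {
                moved++;
            }
        }
        System.out.println("clients remapped after removing ip4: " + moved + " of 1000");
    }
}
```

Only clients that previously mapped to ip4 move; every other client keeps its server, which is the main reason to prefer a hash ring over a plain hash modulo the server count.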

Least Active LoadBalance algorithm

The methods so far mainly aim to balance the number of calls assigned to each server. But does an equal number of calls mean an equal load? Of course not: the duration of each call also matters, and the least-active algorithm addresses exactly this.

The smaller a service provider's active call count, the more efficiently it is processing requests and the more requests it can handle per unit of time, so new requests should go to that provider first. In the implementation, each service provider has an active count. Initially every provider's active count is 0. Each time a provider receives a request its active count is increased by 1, and when the request completes the count is decreased by 1.

The basic idea is that after the service has been running for a while, a provider with good performance finishes requests faster, so its active count drops faster, and it therefore gets new requests first.

Besides the active count, the least-active algorithm also uses a weight in its implementation, so strictly speaking it is a weighted least-active algorithm. For example, suppose two providers in a cluster both perform very well. If they have the same active count at some moment, requests are allocated according to their weights: the larger the weight, the higher the probability of receiving the new request. If the two providers also have the same weight, one is chosen at random.

Implementation:

The active count is tied to the server's request handling: it is incremented when a call starts and decremented when the call ends. That logic is not simulated here; instead, a map with fixed values stands in for the active counts.

//Current active count of each server (added to the ServerIps class)
    public static final Map<String, Integer> ACTIVITY_LIST = new LinkedHashMap<String, Integer>();
    static {
        ACTIVITY_LIST.put("192.168.0.1", 2);
        ACTIVITY_LIST.put("192.168.0.2", 0);
        ACTIVITY_LIST.put("192.168.0.3", 1);
        ACTIVITY_LIST.put("192.168.0.4", 3);
        ACTIVITY_LIST.put("192.168.0.5", 0);
        ACTIVITY_LIST.put("192.168.0.6", 1);
        ACTIVITY_LIST.put("192.168.0.7", 4);
        ACTIVITY_LIST.put("192.168.0.8", 2);
        ACTIVITY_LIST.put("192.168.0.9", 7);
        ACTIVITY_LIST.put("192.168.0.10", 3);
    }
public class LeastActive {
    private static String getServer() {
        //Find the server with the lowest current activity
        Optional<Integer> minValue = ServerIps.ACTIVITY_LIST.values().stream().min(Comparator.naturalOrder());
        if (minValue.isPresent()) {
            List<String> minActivityIps = new ArrayList<>();
            ServerIps.ACTIVITY_LIST.forEach((ip, activity) -> {
                if (activity.equals(minValue.get())) {
                    minActivityIps.add(ip);
                }
            });
            //If there are more than one ip for the minimum active number, it is selected according to the weight, and the higher one takes precedence
            if (minActivityIps.size() > 1) {
                //Filter out the corresponding ip and weight
                Map<String, Integer> weightList = new LinkedHashMap<String, Integer>();
                ServerIps.WEIGHT_LIST.forEach((ip, weight) -> {
                    if (minActivityIps.contains(ip)) {
                        weightList.put(ip, ServerIps.WEIGHT_LIST.get(ip));
                    }
                });
                int totalWeight = 0;
                boolean sameWeight = true; //If all weights are equal, then a random ip would be fine
                Object[] weights = weightList.values().toArray();
                for (int i = 0; i < weights.length; i++) {
                    Integer weight = (Integer) weights[i];
                    totalWeight += weight;
                    if (sameWeight && i > 0 && !weight.equals(weights[i - 1])) {
                        sameWeight = false;
                    }
                }
                java.util.Random random = new java.util.Random();
                int randomPos = random.nextInt(totalWeight);
                if (!sameWeight) {
                    for (String ip : weightList.keySet()) {
                        Integer value = weightList.get(ip);
                        if (randomPos < value) {
                            return ip;
                        }
                        randomPos = randomPos - value;
                    }
                }
                return (String) weightList.keySet().toArray()[new java.util.Random().nextInt(weightList.size())];
            } else {
                return minActivityIps.get(0);
            }
        } else {
            return (String) ServerIps.WEIGHT_LIST.keySet().toArray()[new java.util.Random().nextInt(ServerIps.WEIGHT_LIST.size())];
        }
    }
    public static void main(String[] args) {
        //10 consecutive calls
        for (int i = 0; i < 10; i++) {
            System.out.println(getServer());
        }
    }
}

Because the active counts are never modified here, the candidates are always the same: only the servers with the lowest active count, 192.168.0.2 and 192.168.0.5, can be returned, and the choice between them is random, weighted by their weights 8 and 5.
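The increment and decrement logic that the example leaves out could look roughly like the sketch below. The class ActiveTracker and its methods begin, end, and activeCount are hypothetical names; weights are ignored, ties go to the first server in the list, the list is assumed to be non-empty, and the scan and the increment are not atomic as a pair, so this shows the bookkeeping rather than a production-ready selector.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class ActiveTracker {
    private final Map<String, AtomicInteger> active = new LinkedHashMap<>();

    ActiveTracker(List<String> servers) {
        for (String server : servers) {
            active.put(server, new AtomicInteger(0)); // every server starts with 0 active calls
        }
    }

    // Picks the server with the fewest active calls and marks one call as started
    String begin() {
        String best = null;
        for (Map.Entry<String, AtomicInteger> e : active.entrySet()) {
            if (best == null || e.getValue().get() < active.get(best).get()) {
                best = e.getKey();
            }
        }
        active.get(best).incrementAndGet();
        return best;
    }

    // Marks one call on the given server as finished
    void end(String server) {
        active.get(server).decrementAndGet();
    }

    int activeCount(String server) {
        return active.get(server).get();
    }

    public static void main(String[] args) {
        ActiveTracker tracker = new ActiveTracker(Arrays.asList("fast", "slow"));
        String first = tracker.begin();  // "fast": both counts are 0, first in the list wins
        String second = tracker.begin(); // "slow": it now has the lower count
        tracker.end(first);              // the fast server finishes its call early
        String third = tracker.begin();  // "fast" again, because its count dropped back to 0
        System.out.println(first + " " + second + " " + third); // prints fast slow fast
    }
}
```

In real code the call to end would typically sit in a finally block around the remote call, so the count is decremented even when the call fails.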

If anything is unclear, please leave a message in the comments so we can discuss and learn from each other.


Tags: Java Load Balance Nginx REST

Posted on Mon, 18 May 2020 12:36:53 -0400 by launchcode