Kubernetes gRPC load balancing
gRPC is one of the most popular modern RPC frameworks for inter-process communication, and it is an excellent choice for microservice architectures. Kubernetes, in turn, is without doubt the most popular way to deploy microservice applications.
A Kubernetes deployment can run many instances of the same back end to serve client requests. Kubernetes' ClusterIP service provides a load-balanced IP address. However, this default load balancing does not work for gRPC out of the box.
If you use gRPC and deploy multiple back ends on Kubernetes, this article is for you.
Why load balancing?
Large-scale deployments have many instances of the same back end and many clients. Each back-end server has a certain capacity; load balancing distributes the load from clients among the available servers.
Before we dig into gRPC load balancing on Kubernetes, let's take a look at the benefits of load balancing.
Load balancing has many benefits, some of which are:
- Fault tolerance: if a replica fails, other servers can service the request.
- Increased scalability: user traffic can be distributed across multiple servers to increase scalability.
- Improved throughput: you can improve application throughput by distributing traffic across different back-end servers.
- Zero-downtime deployments: rolling deployments make it possible to release new versions without downtime.
Load balancing has many other benefits as well.
Load balancing options for gRPC
There are two load balancing options in gRPC: proxy and client-side.
Proxy load balancing
In proxy load balancing, clients send RPCs to a load balancer (LB) proxy. The LB distributes each RPC to an available back-end server that implements the actual logic for serving the call. The LB tracks the load on each back end and implements an algorithm for fair load distribution. The clients themselves do not know about the back-end servers and do not need to be trusted. This architecture is typically used for user-facing services, where clients from the open internet connect to the servers.
Client load balancing
In client-side load balancing, the client knows about multiple back-end servers and chooses one for each RPC. The client can implement a load-balancing algorithm based on load reports from the servers, or, for a simple deployment, just round-robin requests across the available servers.
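The simple round-robin case can be sketched by cycling an index over the known back-end addresses. The class below is a minimal illustration, not grpc-java's internal implementation; a real client would obtain the server list from service discovery rather than hard-coding it:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal client-side round-robin picker: each call selects the next
// backend in the list, wrapping around when the end is reached.
class RoundRobinPicker {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPicker(List<String> backends) {
        this.backends = backends;
    }

    String pick() {
        // floorMod keeps the index non-negative even after integer overflow
        int index = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(index);
    }
}
```

Given the backends 10.244.0.11:8001, 10.244.0.12:8001, and 10.244.0.13:8001, successive calls to pick() return each address in turn before wrapping around.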
For more information about gRPC load balancing options, see the article gRPC load balancing.
Challenges related to gRPC load balancing
gRPC runs over HTTP/2, and HTTP/2 TCP connections are long-lived. A single connection can multiplex many requests, which reduces the overhead of connection management. But it also means that connection-level load balancing is not very useful. Kubernetes' default load balancing works at the connection level, and for this reason it does not work for gRPC.
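A toy simulation (plain Java, no real networking) makes the difference concrete: a balancer that picks a backend once per connection sends every multiplexed request to the same pod, while a per-request balancer spreads them across all pods.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two balancing granularities. With HTTP/2, one
// long-lived connection carries many requests, so a connection-level
// choice pins all of them to a single pod.
class BalancingDemo {
    static List<String> connectionLevel(List<String> pods, int requests) {
        String chosen = pods.get(0); // picked once, when the connection opens
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            targets.add(chosen); // every multiplexed request reuses it
        }
        return targets;
    }

    static List<String> requestLevel(List<String> pods, int requests) {
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            targets.add(pods.get(i % pods.size())); // round robin per request
        }
        return targets;
    }
}
```

With three pods and nine requests, the connection-level strategy touches one distinct pod while the request-level strategy touches all three, which is exactly the behavior the experiment below demonstrates on a real cluster.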
To confirm this, let's create a Kubernetes application. The application consists of the following three parts:
- Server pods: a Kubernetes deployment with three gRPC server pods.
- Client pod: a Kubernetes deployment with a single gRPC client pod.
- Service: a ClusterIP service that selects all the server pods.
Create server deployment
To create a deployment, save the following code in a YAML file, such as deployment-server.yaml, and then run the command kubectl apply -f deployment-server.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-server
  labels:
    app: grpc-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: grpc-server
  template:
    metadata:
      labels:
        app: grpc-server
    spec:
      containers:
        - name: grpc-server
          image: techdozo/grpc-lb-server:1.0.0
This creates a gRPC server deployment with three replicas. The gRPC server listens on port 8001. To verify that the pods were created successfully, run the command kubectl get pods:
NAME                         READY   STATUS    RESTARTS   AGE
grpc-server-6c9cd849-5pdbr   1/1     Running   0          1m
grpc-server-6c9cd849-86z7m   1/1     Running   0          1m
grpc-server-6c9cd849-mw9sb   1/1     Running   0          1m
You can view the logs by running the command kubectl logs --follow grpc-server-<pod-id>.
Create service
To create the service, save the following code in a YAML file, such as service.yaml, and then run the command kubectl apply -f service.yaml.
apiVersion: v1
kind: Service
metadata:
  name: grpc-server-service
spec:
  type: ClusterIP
  selector:
    app: grpc-server
  ports:
    - port: 80
      targetPort: 8001
The ClusterIP service provides a load-balanced IP address. It load balances traffic between the pod endpoints matched by its label selector. You can inspect the service with kubectl describe service grpc-server-service:
Name:              grpc-server-service
Namespace:         default
Selector:          app=grpc-server
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.96.28.234
IPs:               10.96.28.234
Port:              <unset>  80/TCP
TargetPort:        8001/TCP
Endpoints:         10.244.0.11:8001,10.244.0.12:8001,10.244.0.13:8001
Session Affinity:  None
As shown above, the pod IP addresses are 10.244.0.11:8001, 10.244.0.12:8001, and 10.244.0.13:8001. If a client calls the service on port 80, Kubernetes should load balance the calls across these endpoints (the pod IPs). But for gRPC this is not the case, as you will soon see.
Create client deployment
To create a client deployment, save the following code in a YAML file, such as deployment-client.yaml, and then run the command kubectl apply -f deployment-client.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-client
  labels:
    app: grpc-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-client
  template:
    metadata:
      labels:
        app: grpc-client
    spec:
      containers:
        - name: grpc-client
          image: techdozo/grpc-lb-client:1.0.0
          env:
            - name: SERVER_HOST
              value: grpc-server-service:80
At startup, the gRPC client application uses a single channel to make 1,000,000 calls to the server across 10 concurrent threads. The SERVER_HOST environment variable points to the DNS name of the gRPC server service. In the gRPC client, SERVER_HOST (serverHost) is passed when building the channel:

ManagedChannel channel = ManagedChannelBuilder.forTarget(serverHost)
    .defaultLoadBalancingPolicy("round_robin")
    .usePlaintext()
    .build();
If you check the server log, you will notice that all client calls are serviced by only one server pod.
Client load balancing using headless services
A Kubernetes headless service can be used for client-side round-robin load balancing. This simple form of load balancing works out of the box with gRPC. The disadvantage is that it does not take server load into account.
What is a headless service?
Fortunately, Kubernetes allows clients to discover pod IPs through DNS lookups. Usually, when you perform a DNS lookup for a service, the DNS server returns a single IP: the service's cluster IP. But if you tell Kubernetes that your service does not need a cluster IP (by setting the clusterIP field in the service spec to None), the DNS server returns the pod IPs instead of the single service IP. Rather than a single DNS A record, the DNS server returns multiple A records for the service, each pointing to the IP of an individual pod backing the service at that moment. A client can therefore perform a simple DNS A-record lookup, obtain the IPs of all the pods belonging to the service, and use that information to connect to one, several, or all of them.
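This A-record lookup can be sketched in plain Java with InetAddress.getAllByName, which returns every address a name resolves to. Run inside the cluster against the headless service name (grpc-server-service.default.svc.cluster.local in this example), it would return one address per pod; against a regular ClusterIP service it returns just the single service IP:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Resolve all A records for a host name. For a headless service,
// each returned address is the IP of one backing pod.
class PodDiscovery {
    static List<String> lookup(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // name did not resolve; return an empty list
        }
        return ips;
    }
}
```

Note that gRPC's built-in dns resolver does this lookup for you; this sketch only shows what the resolver sees when the service is headless.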
Setting the clusterIP field in the service spec to None makes the service headless: Kubernetes does not assign it a cluster IP, and clients connect directly to the pods backing it.
Define the headless service as:
apiVersion: v1
kind: Service
metadata:
  name: grpc-server-service
spec:
  clusterIP: None
  selector:
    app: grpc-server
  ports:
    - port: 80
      targetPort: 8001
To make the service headless, the only field you need to change is .spec.clusterIP, setting it to None.
To confirm the DNS behavior of the headless service, create a pod from the tutum/dnsutils image:
kubectl run dnsutils --image=tutum/dnsutils --command -- sleep infinity
Then run the command
kubectl exec dnsutils -- nslookup grpc-server-service
This resolves the FQDN of the headless service to the pod IPs:
Server:   10.96.0.10
Address:  10.96.0.10#53

Name:     grpc-server-service.default.svc.cluster.local
Address:  10.244.0.22
Name:     grpc-server-service.default.svc.cluster.local
Address:  10.244.0.20
Name:     grpc-server-service.default.svc.cluster.local
Address:  10.244.0.21
As you can see, the headless service resolves to the IP addresses of all the pods backing the service. Compare this with the output returned for a non-headless service:
Server:   10.96.0.10
Address:  10.96.0.10#53

Name:     grpc-server-service.default.svc.cluster.local
Address:  10.96.158.232
The only remaining change is to configure the client application to point to the headless service and the server pods' port, as shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-client
  labels:
    app: grpc-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-client
  template:
    metadata:
      labels:
        app: grpc-client
    spec:
      containers:
        - name: grpc-client
          image: techdozo/grpc-lb-client:1.0.0
          env:
            - name: SERVER_HOST
              value: grpc-server-service:8001
Note that SERVER_HOST now points to the headless service grpc-server-service and the server port 8001. You can also set SERVER_HOST to the FQDN:
- name: SERVER_HOST
  value: "grpc-server-service.default.svc.cluster.local:8001"
Redeploy the client by first deleting the client deployment:
kubectl delete deployment.apps/grpc-client
Then deploy the client again:
kubectl apply -f deployment-client.yaml
If you check the logs printed by each pod, you will see that the client calls are now distributed across all the server pods.
The working code examples for this article are available on GitHub. You can use kind to run the code on a local Kubernetes cluster.
Summary
There are two load balancing options available in gRPC: proxy and client-side. Because gRPC connections are long-lived, Kubernetes' default connection-level load balancing does not work for gRPC. A Kubernetes headless service is one mechanism for client-side load balancing: its DNS name resolves to the IPs of the pods backing it.
- gRPC Load Balancing: https://grpc.io/blog/grpc-load-balancing/
- gRPC Load Balancing on Kubernetes without Tears: https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/