Deploying multiple vk virtual nodes in a Kubernetes cluster

Use vk virtual nodes to increase k8s cluster capacity and resilience

Adding vk virtual nodes to a Kubernetes cluster is an approach that has been widely adopted by many customers. Virtual nodes based on vk can greatly improve the capacity and flexibility of a cluster: ECI pods are created dynamically on demand, which avoids the trouble of cluster capacity planning. At present, vk virtual nodes are widely used in the following scenarios:

  • Online business with peak-and-valley demand: industries such as online education and e-commerce have pronounced peak-and-valley computing patterns. Using vk significantly reduces the fixed resource pool that has to be maintained and lowers computing costs.
  • Increasing cluster Pod capacity: when a traditional Flannel-network cluster cannot add more nodes because of VPC route table entry limits or vSwitch network planning restrictions, virtual nodes sidestep these constraints and provide a simple, fast way to increase the cluster's Pod capacity.
  • Data computing: running workloads such as Spark and Presto on vk effectively reduces computing costs.
  • CI/CD and other Job-type workloads.

Create multiple vk virtual nodes

To deploy virtual nodes, refer to the ACK product documentation: https://help.aliyun.com/document_detail/118970.html

Generally speaking, if the number of ECI pods in a single Kubernetes cluster is below 3000, we recommend deploying a single vk node. For scenarios where vk needs to carry more pods, we recommend deploying multiple vk nodes in the cluster to scale vk horizontally. Multiple vk nodes relieve the pressure on any single vk node and support a larger ECI pod capacity: with the ECI_QUOTA_POD of 3000 pods per vk node used below, 3 vk nodes can support 3 x 3000 ECI pods and 10 vk nodes can support 10 x 3000 ECI pods.

To keep horizontal scaling of vk simple, we deploy the vk controllers with a StatefulSet, where each vk controller manages one vk node. When more vk virtual nodes are needed, simply change the replicas of the StatefulSet. Configure and deploy the following StatefulSet YAML file (filling in the AccessKey and the vpc/vswitch/security group information); the StatefulSet defaults to 1 replica.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: virtual-kubelet
  name: virtual-kubelet
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: virtual-kubelet
  serviceName: ""
  template:
    metadata:
      labels:
        app: virtual-kubelet
    spec:
      containers:
      - args:
        - --provider
        - alibabacloud
        - --nodename
        - $(VK_INSTANCE)
        env:
        - name: VK_INSTANCE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: KUBELET_PORT
          value: "10250"
        - name: VKUBELET_POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        - name: VKUBELET_TAINT_KEY
          value: "virtual-kubelet.io/provider"
        - name: VKUBELET_TAINT_VALUE
          value: "alibabacloud"
        - name: VKUBELET_TAINT_EFFECT
          value: "NoSchedule"
        - name: ECI_REGION
          value: xxx
        - name: ECI_VPC
          value: vpc-xxx
        - name: ECI_VSWITCH
          value: vsw-xxx
        - name: ECI_SECURITY_GROUP
          value: sg-xxx
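        # Capacity reported by this virtual node; ECI_QUOTA_POD caps each vk node at 3000 ECI pods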
        - name: ECI_QUOTA_CPU
          value: "1000000"
        - name: ECI_QUOTA_MEMORY
          value: 6400Ti
        - name: ECI_QUOTA_POD
          value: "3000"
        - name: ECI_ACCESS_KEY
          value: xxx
        - name: ECI_SECRET_KEY
          value: xxx
        - name: ALIYUN_CLUSTERID
          value: xxx
        image: registry.cn-hangzhou.aliyuncs.com/acs/virtual-nodes-eci:v1.0.0.2-aliyun
        imagePullPolicy: Always
        name: ack-virtual-kubelet
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      serviceAccount: ack-virtual-node-controller
      serviceAccountName: ack-virtual-node-controller
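
Save the manifest to a file (for example vk-statefulset.yaml, a name chosen here for illustration) and apply it; the vk controller pod should come up in kube-system:

# kubectl apply -f vk-statefulset.yaml
# kubectl -n kube-system get pod -l app=virtual-kubelet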

Increase the replica count of the StatefulSet to add more vk nodes.

# kubectl -n kube-system scale statefulset virtual-kubelet --replicas=3
statefulset.apps/virtual-kubelet scaled

# kubectl get no
NAME                            STATUS     ROLES    AGE     VERSION
cn-hangzhou.192.168.1.1         Ready      <none>   63d     v1.12.6-aliyun.1
cn-hangzhou.192.168.1.2         Ready      <none>   63d     v1.12.6-aliyun.1
virtual-kubelet-0               Ready      agent    1m      v1.11.2-aliyun-1.0.207
virtual-kubelet-1               Ready      agent    1m      v1.11.2-aliyun-1.0.207
virtual-kubelet-2               Ready      agent    1m      v1.11.2-aliyun-1.0.207

# kubectl -n kube-system get statefulset virtual-kubelet
NAME              READY   AGE
virtual-kubelet   3/3     3m

# kubectl -n kube-system get pod|grep virtual-kubelet
virtual-kubelet-0                                                1/1     Running   0          15m
virtual-kubelet-1                                                1/1     Running   0          11m
virtual-kubelet-2                                                1/1     Running   0 

When we create multiple nginx pods in the vk namespace, we can see that the pods are scheduled across multiple vk nodes. The virtual-node-affinity-injection=enabled label on the namespace causes pods created in it to be scheduled to the virtual nodes.

# kubectl create ns vk
# kubectl label namespace vk virtual-node-affinity-injection=enabled

# kubectl -n vk run nginx --image nginx:alpine --replicas=10
deployment.extensions/nginx scaled

# kubectl -n vk get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP                NODE                NOMINATED NODE   READINESS GATES
nginx-544b559c9b-4vgzx   1/1     Running   0           1m   192.168.165.198   virtual-kubelet-2   <none>           <none>
nginx-544b559c9b-544tm   1/1     Running   0           1m   192.168.125.10    virtual-kubelet-0   <none>           <none>
nginx-544b559c9b-9q7v5   1/1     Running   0           1m   192.168.165.200   virtual-kubelet-1   <none>           <none>
nginx-544b559c9b-llqmq   1/1     Running   0           1m   192.168.165.199   virtual-kubelet-2   <none>           <none>
nginx-544b559c9b-p6c5g   1/1     Running   0           1m   192.168.165.197   virtual-kubelet-0   <none>           <none>
nginx-544b559c9b-q8mpt   1/1     Running   0           1m   192.168.165.196   virtual-kubelet-0   <none>           <none>
nginx-544b559c9b-rf5sq   1/1     Running   0           1m   192.168.125.8     virtual-kubelet-0   <none>           <none>
nginx-544b559c9b-s64kc   1/1     Running   0           1m   192.168.125.11    virtual-kubelet-2   <none>           <none>
nginx-544b559c9b-vfv56   1/1     Running   0           1m   192.168.165.201   virtual-kubelet-1   <none>           <none>
nginx-544b559c9b-wfb2z   1/1     Running   0           1m   192.168.125.9     virtual-kubelet-1   <none>           <none>
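
Instead of relying on the namespace label, a workload can also target the vk nodes explicitly by tolerating the taint that the vk controllers set on their nodes (the VKUBELET_TAINT_KEY/VALUE/EFFECT values configured above). The following is a minimal sketch; the nodeSelector label type=virtual-kubelet is an assumption here, so check the actual labels on your vk nodes with kubectl get no --show-labels before using it.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-on-vk
  namespace: vk
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-on-vk
  template:
    metadata:
      labels:
        app: nginx-on-vk
    spec:
      # Tolerate the taint set by the vk controllers (see the StatefulSet env above)
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Equal
        value: alibabacloud
        effect: NoSchedule
      # Assumed label on the vk nodes; adjust to the labels your vk nodes actually carry
      nodeSelector:
        type: virtual-kubelet
      containers:
      - name: nginx
        image: nginx:alpine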

Reduce the number of vk virtual nodes

Because ECI pods on vk are created on demand, vk virtual nodes do not occupy real resources when they have no ECI pods, so in general there is no need to reduce the number of vk nodes. However, if you really do want to reduce the number of vk nodes, we recommend the following steps.

Suppose the cluster currently has five vk nodes, virtual-kubelet-0 through virtual-kubelet-4. We want to reduce this to one vk node, so we need to delete the four nodes virtual-kubelet-1 through virtual-kubelet-4.

  • First drain the vk nodes gracefully: evict the pods on them to other nodes and stop new pods from being scheduled to the vk nodes that are about to be deleted.
# kubectl drain virtual-kubelet-1 virtual-kubelet-2 virtual-kubelet-3 virtual-kubelet-4

# kubectl get no
NAME                      STATUS                     ROLES    AGE    VERSION
cn-hangzhou.192.168.1.1   Ready                      <none>   66d    v1.12.6-aliyun.1
cn-hangzhou.192.168.1.2   Ready                      <none>   66d    v1.12.6-aliyun.1
virtual-kubelet-0         Ready                      agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-1         Ready,SchedulingDisabled   agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-2         Ready,SchedulingDisabled   agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-3         Ready,SchedulingDisabled   agent    66m    v1.11.2-aliyun-1.0.207
virtual-kubelet-4         Ready,SchedulingDisabled   agent    66m    v1.11.2-aliyun-1.0.207

The reason the vk nodes have to be drained gracefully first is that the ECI pods on a vk node are managed by its vk controller. If a vk controller is deleted while ECI pods still exist on its node, those ECI pods are left behind and no vk controller can manage them any longer.
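
Before scaling the StatefulSet down, it is worth confirming that no ECI pods remain on the nodes that are about to be removed, for example (checking virtual-kubelet-1 here):

# kubectl get pod --all-namespaces -o wide --field-selector spec.nodeName=virtual-kubelet-1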

  • After the vk nodes have been drained, scale the virtual-kubelet StatefulSet down to the desired number of vk nodes.
# kubectl -n kube-system scale statefulset virtual-kubelet --replicas=1
statefulset.apps/virtual-kubelet scaled

# kubectl -n kube-system get pod|grep virtual-kubelet
virtual-kubelet-0                                                1/1     Running   0          3d6h

Wait a while and we will see that those vk nodes become NotReady.

# kubectl get no
NAME                      STATUS                        ROLES    AGE    VERSION
cn-hangzhou.192.168.1.1   Ready                         <none>   66d    v1.12.6-aliyun.1
cn-hangzhou.192.168.1.2   Ready                         <none>   66d    v1.12.6-aliyun.1
virtual-kubelet-0         Ready                         agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-1         NotReady,SchedulingDisabled   agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-2         NotReady,SchedulingDisabled   agent    3d6h   v1.11.2-aliyun-1.0.207
virtual-kubelet-3         NotReady,SchedulingDisabled   agent    70m    v1.11.2-aliyun-1.0.207
virtual-kubelet-4         NotReady,SchedulingDisabled   agent    70m    v1.11.2-aliyun-1.0.207
  • Manually delete the vk nodes that are in NotReady state.
# kubectl delete no virtual-kubelet-1 virtual-kubelet-2 virtual-kubelet-3 virtual-kubelet-4
node "virtual-kubelet-1" deleted
node "virtual-kubelet-2" deleted
node "virtual-kubelet-3" deleted
node "virtual-kubelet-4" deleted
# kubectl get no
NAME                      STATUS     ROLES    AGE    VERSION
cn-hangzhou.192.168.1.1   Ready      <none>   66d    v1.12.6-aliyun.1
cn-hangzhou.192.168.1.2   Ready      <none>   66d    v1.12.6-aliyun.1
virtual-kubelet-0         Ready      agent    3d6h   v1.11.2-aliyun-1.0.207
