Introduction to Kubernetes

Author Fan Che
Source: Ali technical official account


This article is an introduction to kubernetes (replaced by k8s below). It will cover the architecture of k8s, cluster construction, an example of Redis, and a tutorial on how to develop an operator using operator SDK. In the process of the article, the concepts of Pod, Deployment, stateful set, etc. will be introduced. These concepts are introduced through examples, which is easier to understand and practice. The article refers to many blogs and materials, which are put in the resources section at the end.

One k8s architecture

Let's take a look at the k8s cluster architecture. From left to right, it is divided into two parts. The first part is the Master Node (that is, the Control Plane in the figure) and the second part is the Node node.

The Master node generally consists of four components: apiserver, scheduler, controller manager and etcd. What are their respective roles

  • Apiserver: it knows astronomy and geography, connects other components, and connects ETCD. It provides various api processing, authentication, and communication with kubelet on Node. Only apiserver will connect ETCD.
  • Controller Manager: controls various controllers and is committed to changing the current state to the desired state through the controller mode.
  • Scheduler: scheduling, scoring and resource allocation.
  • Etcd: the database of the whole cluster can also be built separately without being deployed in the Master node.

Node nodes generally include three components, docker, Kube proxy and kubelet

  • Docker: the carrier of specific applications.
  • Kube proxy: mainly responsible for opening up the network. In the early stage, iptables was used, and now ipvs technology is used.
  • Kubelet: agent, responsible for managing the life cycle of the container.

To sum up, k8s cluster is an architecture composed of two components: Master and Node nodes. The Master Node is the brain of the whole cluster, and the Node node runs the Master Node scheduling application. We will explain the interaction process of these components with a specific scheduling example later.

II. Build k8s clusters

Having finished the components in the k8s cluster, let's first look at how to build a k8s cluster. There are the following methods (refer to the link at the end of the article):

  • After we install Docker Desktop APP, check k8s support to build it.
  • Use MiniKube to build a one click installation script provided by the community.
  • Purchase directly on the cloud platform, such as Alibaba cloud ack.
  • kubeadmin is a tool recommended by the k8s community to deploy production level k8s.
  • Use binary to download and install components. This tutorial needs to pay attention to that the version of each component downloaded should be consistent with that in the blog.

The following examples in this article are built with the local Docker Desktop APP k8s.

➜  ~ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.4", GitCommit:"3cce4a82b44f032d0cd1a1790e6d2f5a55d20aae", GitTreeState:"clean", BuildDate:"2021-08-11T18:16:05Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.4", GitCommit:"3cce4a82b44f032d0cd1a1790e6d2f5a55d20aae", GitTreeState:"clean", BuildDate:"2021-08-11T18:10:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}

Third, starting from demand

Let's take a look at how to deploy Redis services on k8s top from a practical requirement.

  • Deploy a Redis service
  • Support high availability
  • Provide a unified EndPoint access address

1 deploy stand-alone

If we want to deploy a stand-alone version of Redis on k8s, we can execute the following command:

➜  ~ kubectl run redis --image=redis
pod/redis created
➜  ~ kubectl get pods
redis   1/1     Running   0          5s

You can use kubectl exec to enter the internal connection of Pod and Redis to execute the command:

➜  ~ kubectl exec -it redis -- bash
root@redis:/data# redis-cli> ping

So what is the relationship between Pod and redis? Redis here is actually a service started by the Docker process, but in k8s, it is called Pod.

2 pod and Deployment

Let's talk about the next k8s concept pod. Pod is the smallest scheduling unit in k8s. A pod can contain multiple dockers. These dockers will be scheduled to the same Node. These dockers share NetWork Namespace and can declare to share the same Volume to share disk space.

What are the benefits? In fact, in the real world, many applications need to be deployed on the same machine. For example, if the Redis log collection plug-in wants to collect logs, it must be deployed on the same machine as Redis to read Redis logs. When we talked about the background earlier, we mentioned that Docker Swarm has some problems, one of which is that it is only based on Docker scheduling, Although the affinity can also be set to schedule two dockers on the same machine, because they cannot be scheduled together, one Docker will be scheduled to a machine with few resources in advance, resulting in the failure of the second Docker scheduling.

For example, we have two containers, A and B, Redis and log collection components, which need 2g memory respectively. Now we have two nodes, node1 3.5 memory and node2 4g memory. Under the scheduling policy of Docker Swarm, Redis is scheduled first, which may be scheduled to node1. Then we schedule the log collection components. It is found that node1 has only 1.5g memory, Scheduling failed. However, in k8s, scheduling is scheduled according to pod. When two components are in A pod, scheduling will not consider node1.

Although the Pod can run the Redis service, it does not have high availability, because once a Pod is bound to a Node, it will never leave the Node unless the binding changes (the pod.spec.node field is modified), which means that if the host goes down, This Pod will not actively migrate to other nodes. In order to keep the service running, you need to use a controller such as Deployment.

➜  ~ kubectl create deployment redis-deployment --image=redis
deployment.apps/redis-deployment created
➜  ~ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
redis                               1/1     Running   0          32m
redis-deployment-866c4c6cf9-8z8k5   1/1     Running   0          8s
➜  ~

redis-deployment-866c4c6cf9-8z8k5 is the new Deployment just created through kubectl create. In order to verify high availability, we use kubectl delete pod to delete both redis and redis-deployment-866c4c6cf9-8z8k5 to see what happens.

➜  ~ kubectl delete pod redis redis-deployment-866c4c6cf9-8z8k5
pod "redis" deleted
pod "redis-deployment-866c4c6cf9-8z8k5" deleted
➜  ~ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
redis-deployment-866c4c6cf9-zskkb   1/1     Running   0          10s
➜  ~

Redis has disappeared, but redis-deployment-866c4c6cf9-zskkb changed its name and appeared again!

Deployment can define multiple replica pods to provide migration capabilities for applications. If you simply use the Pod, in fact, when the application is scheduled to a machine, the machine goes down and the application cannot migrate automatically. However, if you use deployment, you will call ReplicaSet (a controller) to ensure that the number of application replicas in the current cluster is consistent with the specified number.

3 k8s use yaml to describe commands

In k8s, you can use kubectl to create simple services, but another way to create complex services is to provide yaml files. For example, the above command to create a Pod can be replaced by the following yaml file. After executing kubectl create, you can see that redis Pod has been created again.

➜  ~ cat pod.yaml
apiVersion: v1
kind: Pod
  name: redis
  - name: redis
    image: redis
➜  ~ kubectl create -f pod.yaml
pod/redis created
➜  ~ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
redis                               1/1     Running   0          6s
redis-deployment-866c4c6cf9-zskkb   1/1     Running   0          6m32s

Four k8s component calling process

Let's take a look at what the k8s cluster does after kubectl create deployment redis deployment -- Image = redis is distributed.

  • Firstly, the controller manager, scheduler, kubelet and apiserver will start the list Watch model. List is to get the current state and Watch is to get the expected state. Then k8s cluster will strive to make the current state reach the expected state.
  • kubectl sends a command to apiserver. After authentication, the creation information is stored in etcd. The Deployment is implemented by using the ReplicaSet controller. When the controller manager gets the current state in advance (pod=0), and then receives the desired state, it needs to create a ReplicaSet (pod=1), it will start to create a Pod.
  • Then the scheduler will schedule and confirm which Node the Pod is created on.
  • After that, the kubelet on the Node really pulls up a docker.

In these steps, the role of apiserver is self-evident, so it is connected to other components and ETCD. However, apiserver can be expanded horizontally, and then through load balancing, ETCD has become a bottleneck in the k8s architecture.

When I first looked at the architecture, I thought about why apiserver, scheduler and controller manager do not compose a component. In fact, borgmaster is like this in Google Borg, and its functions are also these functions, but together, they also found that borgmaster will have some performance problems after the cluster is large, including the heartbeat of kubelet, So k8s from the beginning, open source, three components in the design are better to maintain the code.

V. deploy master-slave version

We have deployed the stand-alone version of Redis and realized the continuous operation of the service through Deployment. Next, let's see how to deploy the master-slave version. One of the difficulties is how to determine the master-slave synchronization relationship.

1 StatefulSet

k8s designs a stateful set controller for stateful applications, which mainly serves stateful applications through the following two features:

  • Topology status: instances are created sequentially and numbered according to name index, such as redis-0, redis-1, etc.
  • Storage status: you can save the data by declaring the use of external storage, such as cloud disk, so that the data in cloud disk can be read by Pod restart and rescheduling.

Let's take an example of Redis's stateful set:

apiVersion: apps/v1
kind: StatefulSet  # Type is statefullset
  name: redis-sfs  # app name
  serviceName: redis-sfs  # The service here is explained below
  replicas: 2      # Two copies are defined
      app: redis-sfs
        app: redis-sfs
      - name: redis-sfs 
        image: redis  # Mirror version
          - bash
          - "-c"
          - |
            set -ex
            ordinal=`hostname | awk -F '-' '{print $NF}'`   # Get sequence using hostname
            if [[ $ordinal -eq 0 ]]; then     # If 0, as the primary
              echo > /tmp/redis.conf
              echo "slaveof redis-sfs-0.redis-sfs 6379" > /tmp/redis.conf # If 1, as standby
            redis-server /tmp/redis.conf

Then start the stateful set and find that redis-sfs-0 and redis-sfs-1 pod s appear. They are officially numbered according to the rule of name index

➜  ~ kubectl create -f server.yaml
statefulset.apps/redis-sfs created
➜  ~ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
redis                               1/1     Running   0          65m
redis-deployment-866c4c6cf9-zskkb   1/1     Running   0          71m
redis-sfs-0                         1/1     Running   0          33s  # according to 
redis-sfs-1                         1/1     Running   0          28s

Then, we continue to check whether the master-slave relationship is effective. After checking the log of redis-sfs-1, we find that:

➜  ~ kubectl logs -f redis-sfs-1
1:S 05 Nov 2021 08:02:44.243 * Connecting to MASTER redis-sfs-0.redis-sfs:6379
1:S 05 Nov 2021 08:02:50.287 # Unable to connect to MASTER: Resource temporarily unavailable

2 Headless Service

It seems that redis-sfs-1 doesn't know redis-sfs-0 because we haven't let them know each other. This mutual understanding requires k8s a Service called Headless Service, which is a mechanism used in the k8s project to expose a group of pods to the outside world. For example, if a Deployment has three pods, I can define a Service. Then, as long as users can access the Service, they can access a specific Pod in two ways:

  • VIP: access VIP and randomly return a back-end Pod
  • DNS: resolve to a Pod on the back end through DNS

Headless Service can resolve the address of a Pod through DNS. The rule of this DNS address is:

Let's create the Headless Service corresponding to the cluster:

apiVersion: v1
kind: Service
  name: redis-sfs
    app: redis-sfs
  clusterIP: None   # None here means Headless, which means that it will be actively allocated by k8s
    - port: 6379
      name: redis-sfs
    app: redis-sfs

After checking again, it is found that redis-sfs-1 has successfully synchronized the active and standby, because redis-sfs-0.redis-sfs.default.svc.cluster.local is the only accessible in the cluster after the Headless Service is created.

➜  ~ kubectl create -f service.yaml
service/redis-sfs created
➜  ~ kubectl get service
kubernetes   ClusterIP    <none>        443/TCP    24d
redis-sfs    ClusterIP   None         <none>        6379/TCP   33s
➜  ~ kubectl logs -f redis-sfs-1
1:S 05 Nov 2021 08:23:31.341 * Connecting to MASTER redis-sfs-0.redis-sfs:6379
1:S 05 Nov 2021 08:23:31.345 * MASTER <-> REPLICA sync started
1:S 05 Nov 2021 08:23:31.345 * Non blocking connect for SYNC fired the event.
1:S 05 Nov 2021 08:23:31.346 * Master replied to PING, replication can continue...
1:S 05 Nov 2021 08:23:31.346 * Partial resynchronization not possible (no cached master)
1:S 05 Nov 2021 08:23:31.348 * Full resync from master: 29d1c03da6ee2af173b8dffbb85b6ad504ccc28f:0
1:S 05 Nov 2021 08:23:31.425 * MASTER <-> REPLICA sync: receiving 175 bytes from master to disk
1:S 05 Nov 2021 08:23:31.426 * MASTER <-> REPLICA sync: Flushing old data
1:S 05 Nov 2021 08:23:31.426 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 05 Nov 2021 08:23:31.431 * Loading RDB produced by version 6.2.6
1:S 05 Nov 2021 08:23:31.431 * RDB age 0 seconds
1:S 05 Nov 2021 08:23:31.431 * RDB memory usage when created 1.83 Mb
1:S 05 Nov 2021 08:23:31.431 # Done loading RDB, keys loaded: 0, keys expired: 0.
1:S 05 Nov 2021 08:23:31.431 * MASTER <-> REPLICA sync: Finished with success
➜  ~ kubectl exec -it redis-sfs-1 -- bash
root@redis-sfs-1:/data# redis-cli -h redis-sfs-0.redis-sfs.default.svc.cluster.local
redis-sfs-0.redis-sfs.default.svc.cluster.local:6379> ping

At this time, no matter which Pod we delete, it will be pulled up according to the original name, so as to ensure the preparation relationship. This example is only an example of StatefulSet. Through analysis, it can be found that although it can maintain the master-slave relationship, when the master is hung, the standby cannot be switched, because no component can help us do this switching operation, One way is to use Redis Sentinel. You can refer to the configuration of this project: k8s redis ha master. If your k8s is newer, you need to merge this PR

Vi. Operator

Although StatefulSet is available, it can only be useful for the basic version. If you want to customize more complex operations, the k8s solution is operator. In short, operator is the solution to customize your k8s object and the operation corresponding to the object.

What is the object? A Redis cluster, an etcd cluster and a zk cluster can all be an object. In reality, we define what we want to describe. In fact, we set a kind in k8s yaml. In the previous example, we used Pod, Deployment and StatefulSet, which are the default implementations of k8s. Now if we want to define our own objects, there are two processes:

  • Define objects. For example, how many nodes are there in your cluster by default, and what components are there
  • Define the operation triggered by the object, what process to do when creating the object, what process to do when HA, etc

The operator method is based on programming and can be implemented in many languages. The most used language is go language. Usually, we will complete it with the help of operator SDK, because a lot of code will be generated automatically. It is equivalent that the operator will generate the framework, and then we will implement the corresponding business logic.

1 Preparation

  • Install go environment
  • Install operator SDK

2 initialization item

Then we implement a memcached operator step by step according to the sdk example on the official website. Redis can also be used here. However, in order to ensure consistency with the official website, we create a memcached operator according to the official website.

➜  ~ cd $GOPATH/src
➜  src mkdir memcached-operator
➜  src cd memcached-operator
➜  memcached-operator operator-sdk init --domain yangbodong22011 --repo --skip-go-version-check // It should be noted that domain is better to be with you  The registered name of is the same, because the docker image will be published later
Writing kustomize manifests for you to edit...
Writing scaffold for you to edit...
Get controller runtime:
$ go get
Update dependencies:
$ go mod tidy
Next: define a resource with:
$ operator-sdk create api

3 Create API and Controller

➜  memcached-operator operator-sdk create api --group cache --version v1alpha1 --kind Memcached --resource --controller
Writing kustomize manifests for you to edit...
Writing scaffold for you to edit...
Update dependencies:
$ go mod tidy
Running make:
$ make generate
go: creating new go.mod: module tmp
go get: installing executables with 'go get' in module mode is deprecated.
    To adjust and download dependencies of the current module, use 'go get -d'.
    To install using requirements of the current module, use 'go install'.
    To install ignoring the current module, use 'go install' with a version,
    like 'go install'.
    For more information, see
    or run 'go help get' or 'go help install'.
go get: added v1.2.0
/Users/yangbodong/go/src/memcached-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
➜  memcached-operator

The above steps actually generate a framework for operator. Next we first define what memcached clusters include, modify the default implementation to Size, represent the number of Memcached in a Memcached cluster, and finally call make generate and make manifests to generate deepcopy and CRD resources.

➜  memcached-operator vim api/v1alpha1/memcached_types.go // Modify the following definition of Memcached cluster
// MemcachedSpec defines the desired state of Memcached
type MemcachedSpec struct {
    // Size is the size of the memcached deployment
    Size int32 `json:"size"`

// MemcachedStatus defines the observed state of Memcached
type MemcachedStatus struct {
    // Nodes are the names of the memcached pods
    Nodes []string `json:"nodes"`

➜  memcached-operator make generate
/Users/yangbodong/go/src/memcached-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
➜  memcached-operator make manifests
/Users/yangbodong/go/src/memcached-operator/bin/controller-gen "crd:trivialVersions=true,preserveUnknownFields=false" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
➜  memcached-operator

4. Implement Controller

The second step is to define what to do when creating a Memcached cluster.

➜  memcached-operator vim controllers/memcached_controller.go //
take example change into yangbodong22011,be careful,// In fact, it is not a comment, but a format

➜  memcached-operator go mod tidy; make manifests
/Users/yangbodong/go/src/memcached-operator/bin/controller-gen "crd:trivialVersions=true,preserveUnknownFields=false" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases

5. Publish operator Image

➜  memcached-operator vim Makefile
 take -IMG ?= controller:latest Change to +IMG ?= $(IMAGE_TAG_BASE):$(VERSION)

➜  memcached-operator docker login  // Log in to docker in advance
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to to create one.
Username: yangbodong22011
WARNING! Your password will be stored unencrypted in /Users/yangbodong/.docker/config.json.
Configure a credential helper to remove this warning. See

Login Succeeded
➜  memcached-operator sudo make docker-build docker-push 
=> => writing image sha256:a7313209e321c84368c5cb7ec820fffcec2d6fcb510219d2b41e3b92a2d5545a                                                             0.0s
 => => naming to                                                                                      0.0s
fac03a24e25a: Pushed
6d75f23be3dd: Pushed
0.0.1: digest: sha256:242380214f997d98186df8acb9c13db12f61e8d0f921ed507d7087ca4b67ce59 size: 739

6. Modify image and deployment

➜  memcached-operator vim config/manager/manager.yaml
image: controller:latest Change to yangbodong22011/memcached-operator:0.0.1

➜  memcached-operator vim config/default/manager_auth_proxy_patch.yaml
 Because I can't visit at home
image: Change to kubesphere/kube-rbac-proxy:v0.8.0 

➜  memcached-operator make deploy
configmap/memcached-operator-manager-config created
service/memcached-operator-controller-manager-metrics-service created
deployment.apps/memcached-operator-controller-manager created

➜  memcached-operator kubectl get deployment -n memcached-operator-system // ready indicates that the operator has been deployed
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-operator-controller-manager   1/1     1            1           31s
➜  memcached-operator

7 create a Memcached cluster

➜  memcached-operator cat config/samples/cache_v1alpha1_memcached.yaml
apiVersion: cache.yangbodong22011/v1alpha1
kind: Memcached
  name: memcached-sample
  size: 1
➜  memcached-operator kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml
memcached.cache.yangbodong22011/memcached-sample created
➜  memcached-operator kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
memcached-sample-6c765df685-xhhjc   1/1     Running   0          104s
redis                               1/1     Running   0          177m
redis-deployment-866c4c6cf9-zskkb   1/1     Running   0          3h4m
redis-sfs-0                         1/1     Running   0          112m
redis-sfs-1                         1/1     Running   0          112m
➜  memcached-operator

You can view the operator logs through kubectl logs:

➜  ~ kubectl logs -f deployment/memcached-operator-controller-manager -n memcached-operator-system
2021-11-05T09:50:46.042Z    INFO    controller-runtime.manager.controller.memcached    Creating a new Deployment    {"reconciler group": "cache.yangbodong22011", "reconciler kind": "Memcached", "name": "memcached-sample", "namespace": "default", "Deployment.Namespace": "default", "Deployment.Name": "memcached-sample"}

So far, the task of our operator SDK has come to an end temporarily.

VII. Summary

This paper introduces the architecture of k8s and the functions of various components, introduces the concepts of pod, deployment and stateful set in k8s through a step-by-step Redis example, and demonstrates a complete example of operator production through operator SDK.

VIII. References

[1] In depth analysis of Kubernetes Zhang Lei, CNCF TOC member, at Alibaba.
[2] Kubernetes authoritative guide, Fifth Edition
[3] <Large-scale cluster management at Google with Borg>

In 2021, Alibaba cloud open source image station's most popular image Wang Quan combs!

The original intention of Alibaba cloud open source mirror station is to publicize the value of free software, improve everyone's development efficiency and help everyone create applications faster.

click here , view details!

Tags: Go Docker Kubernetes Redis api

Posted on Wed, 17 Nov 2021 14:29:46 -0500 by RandomZero