Deploying MySQL clusters in Kubernetes

Article reprinted from: https://www.cnblogs.com/ludongguoa/p/15319861.html

Generally, Kubernetes creates multiple Pod replicas from a single Pod template through a ReplicaSet, but those replicas are stateless and can be replaced by a new Pod at any time. A stateful Pod needs a different scheme: when it dies, the Pod instance must be rebuilt, possibly on another node, and the new instance must keep the same name, network identity and state as the instance it replaces. This is how StatefulSet manages Pods.

For container clusters, the challenge with stateful services is that no node in the cluster is 100% reliable, and the resources a service needs change dynamically. When a node fails, or a service needs more resources than its current node can provide, the cluster management system reassigns the service to a new location so that the cluster as a whole keeps serving traffic without interruption. If local storage is used, the data does not move with the service when it drifts to a new node, and the service restarts facing data loss.

The purpose of this article is to understand Kubernetes StatefulSet management in depth by building a MySQL master-slave cluster. To reduce the experiment's external dependencies, I use local storage. This is, of course, not recommended in production; for production storage, officially documented schemes such as GCE PD, NFS and Ceph are recommended, because they support dynamic provisioning and let developers obtain durable storage quickly just by defining a PVC. You should never use a directory on a host as a PV. This article does so only for experimental purposes, using a Local Persistent Volume to verify the state-management features of StatefulSet.

Experimental environment

  • kubernetes Master
  • kubernetes Node (for this demonstration, all replicas run on it)
  • kubernetes DNS service enabled

Experimental purpose

  • Build a MySQL cluster with master-slave replication
  • Slave nodes can be scaled horizontally
  • All write operations can only be performed on the master node
  • Read operations can be performed on both master and slave nodes
  • Slave nodes synchronize data from the master node

Principle of local storage

To build the test environment quickly, we use local storage here: that is, we want Kubernetes to use a local disk directory on the host directly as a persistent container Volume, without relying on a remote storage service. But there is a difficulty: the storage is fixed to one node, while Pods float between nodes as they are scheduled. How do we pin the Pod to the PV it claims through its PVC?

Could we just add a nodeAffinity to the Pod?

We could, but that would quietly undermine the developer's resource definitions: developers should not have to think about scheduling details, which should be left to operations. So to realize local storage we adopt delayed binding instead. The method is simple. StorageClasses are generally designed by operations staff, and here we only need the StorageClass to specify no provisioner. This is because Local Persistent Volumes do not currently support Dynamic Provisioning, so the corresponding PV cannot be created automatically when a user creates a PVC. The StorageClass also sets volumeBindingMode: WaitForFirstConsumer, a very important feature of Local Persistent Volumes: delayed binding.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Experimental steps

1, First, allocate a few PVs on the node (the node IP used for the experiment is 172.31.170.51). Again, this is not recommended in production.
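
Note that for a Local Persistent Volume the backing path must already exist on the node before the kubelet can mount it. Assuming you have shell access to the node, create the three directories first:

# On node 172.31.170.51: create the backing directories for the three PVs
mkdir -p /data/svr/projects/mysql /data/svr/projects/mysql2 /data/svr/projects/mysql3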

01-persistentVolume-1.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-mysql-pv
spec:
  capacity:
    storage: 15Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /data/svr/projects/mysql
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.31.170.51

01-persistentVolume-2.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-mysql-pv-2
spec:
  capacity:
    storage: 15Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /data/svr/projects/mysql2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.31.170.51

01-persistentVolume-3.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-mysql-pv-3
spec:
  capacity:
    storage: 15Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /data/svr/projects/mysql3
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.31.170.51

Remember, this is not recommended in production; I create the PVs manually in advance only for this experiment. The proper method is to produce PVs via Dynamic Provisioning through a StorageClass, rather than Static Provisioning.

kubectl apply -f 01-persistentVolume-{1..3}.yaml

persistentvolume/example-mysql-pv created
persistentvolume/example-mysql-pv-2 created
persistentvolume/example-mysql-pv-3 created
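
Before moving on, you can check that the three PVs are registered and still unbound; since no PVC has claimed them yet, their STATUS should be Available (output abridged, columns vary by kubectl version):

kubectl get pv

NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      STORAGECLASS
example-mysql-pv     15Gi       RWO            Delete           Available   local-storage
example-mysql-pv-2   15Gi       RWO            Delete           Available   local-storage
example-mysql-pv-3   15Gi       RWO            Delete           Available   local-storage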

2, Create StorageClass

02-storageclass.yaml

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Apply the manifest:

kubectl apply -f 02-storageclass.yaml

storageclass.storage.k8s.io/local-storage created
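
Because of WaitForFirstConsumer, any PVC created against this class stays Pending, rather than binding to a PV immediately, until a Pod that consumes it is actually scheduled; only then is a PV on the chosen node bound. You can confirm the class was created with the expected mode (on recent kubectl versions a VOLUMEBINDINGMODE column shows WaitForFirstConsumer):

kubectl get storageclass local-storage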

3, Create Namespace

03-mysql-namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: mysql
  labels:
    app: mysql

Apply the manifest:

kubectl apply -f 03-mysql-namespace.yaml

namespace/mysql created

4, Use a ConfigMap to assign different configuration files to Master and Slave nodes

04-mysql-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  namespace: mysql
  labels:
    app: mysql
data:
  master.cnf: |
    # Master configuration
    [mysqld]
    log-bin=mysqllog
    skip-name-resolve
  slave.cnf: |
    # Slave configuration
    [mysqld]
    super-read-only
    skip-name-resolve
    log-bin=mysql-bin
    replicate-ignore-db=mysql

Apply the manifest:

kubectl apply -f 04-mysql-configmap.yaml

configmap/mysql created
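
If you want to double-check that both configuration files landed in the ConfigMap, describe it and look for the master.cnf and slave.cnf keys:

kubectl -n mysql describe configmap mysql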

5, Create the MySQL password Secret

05-mysql-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
  namespace: mysql
  labels:
    app: mysql
type: Opaque
data:
  password: MTIzNDU2 # echo -n "123456" | base64

Apply the manifest:

kubectl apply -f 05-mysql-secret.yaml

secret/mysql-secret created
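
Equivalently, you can let kubectl do the base64 encoding for you. The following prints essentially the same manifest (minus the labels) without creating anything, which is a handy way to check the encoded value (on older kubectl, use --dry-run instead of --dry-run=client):

kubectl -n mysql create secret generic mysql-secret --from-literal=password=123456 --dry-run=client -o yaml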

6, Use Services to provide read-write separation for MySQL

06-mysql-services.yaml

apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  namespace: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql

All write requests must go directly to the Master node, addressed by its stable DNS record: mysql-0.mysql.

All read requests can use the normally allocated Service DNS record, mysql-read, whose connections can be forwarded to any Master or Slave node.

kubectl apply -f 06-mysql-services.yaml

$ kubectl get svc -n mysql
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
mysql        ClusterIP   None         <none>        3306/TCP   20s
mysql-read   ClusterIP   10.0.0.63    <none>        3306/TCP   20s
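
Once the StatefulSet from step 7 is up, you can exercise both entry points from a throwaway client Pod. This is only a sketch: the Pod name mysql-client is arbitrary, and the server_id values follow from the 100 + ordinal rule used in the StatefulSet below.

# Write path: always the fixed DNS record of the master, mysql-0.mysql
kubectl -n mysql run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- \
  mysql -h mysql-0.mysql -uroot -p123456 -e "SELECT @@server_id"

# Read path: the mysql-read Service load-balances over master and slaves,
# so repeated runs should return varying server_id values (100, 101, ...)
kubectl -n mysql run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- \
  mysql -h mysql-read -uroot -p123456 -e "SELECT @@server_id"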

7, Build MySQL master-slave cluster using StatefulSet

07-mysql-statefulset.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: mysql
  labels:
    app: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 2
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate the server-id from the ordinal (serial number) of the Pod
          [[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Since server-id cannot be 0, add 100 to the ordinal to avoid it
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # If the ordinal of the Pod is 0, it is the Master node: copy the master configuration file from the ConfigMap into /mnt/conf.d
          # Otherwise, copy the slave configuration file from the ConfigMap
          if [[ ${ordinal} -eq 0 ]]; then
            cp /mnt/config-map/master.cnf /mnt/conf.d
          else
            cp /mnt/config-map/slave.cnf /mnt/conf.d
          fi
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql
        image: gcr.io/google-samples/xtrabackup:1.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        command:
        - bash
        - "-c"
        - |
          set -ex
          # The clone operation only needs to run on first start, so skip it if data already exists
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # The Master node (ordinal 0) does not need this operation
          [[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal == 0 ]] && exit 0
          # Use ncat to stream a copy of the data from the previous node to the local node
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
          # Run --prepare so the copied data is consistent and can be used for recovery
          xtrabackup --prepare --target-dir=/var/lib/mysql
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql
        image: mysql:5.7
        env:
#        - name: MYSQL_ALLOW_EMPTY_PASSWORD
#          value: "1"
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe:
          exec:
            # exec probes do not run through a shell, so ${MYSQL_ROOT_PASSWORD} would not be
            # expanded as-is; wrap the command in bash -c so the environment variable resolves
            command: ["bash", "-c", "mysqladmin ping -uroot -p${MYSQL_ROOT_PASSWORD}"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command: ["bash", "-c", "mysqladmin ping -uroot -p${MYSQL_ROOT_PASSWORD}"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup
        image: gcr.io/google-samples/xtrabackup:1.0
        ports:
        - name: xtrabackup
          containerPort: 3307
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql
          # Read the MASTER_LOG_FILE and MASTER_LOG_POS values from the backup information file, to assemble the SQL that initializes replication
          if [[ -f xtrabackup_slave_info ]]; then
            # If the xtrabackup_slave_info file exists, the backup data came from another Slave node
            # In this case, the XtraBackup tool has automatically generated the "CHANGE MASTER TO" SQL statement in this file during backup
            # So, just rename the file change_master_to.sql.in, which can be used directly later
            mv xtrabackup_slave_info change_master_to.sql.in
            # Therefore, there is no need for xtrabackup_binlog_info
            rm -f xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # If only the xtrabackup_binlog_info file exists, the backup came from the Master node; parse the backup information file to read the two required field values
            [[ $(cat xtrabackup_binlog_info) =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm xtrabackup_binlog_info
            # Assemble the values of the two fields into SQL and write them to change_master_to.sql.in file
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
          fi
          # If change_master_to.sql.in exists, replication needs to be initialized
          if [[ -f change_master_to.sql.in ]]; then
            # However, you must wait for the MySQL container to start before you can connect to MySQL in the next step
            echo "Waiting for mysqld to be ready(accepting connections)"
            until mysql -h 127.0.0.1 -uroot -p${MYSQL_ROOT_PASSWORD} -e "SELECT 1"; do sleep 1; done
            echo "Initializing replication from clone position"
            # Rename change_master_to.sql.in,
            # so that if the container restarts it will not find the file again and repeat the initialization
            mv change_master_to.sql.in change_master_to.sql.orig
            # Use the contents of change_master_to.sql.orig, i.e. the SQL assembled above, to form a complete statement that initializes and starts the Slave
            mysql -h 127.0.0.1 -uroot -p${MYSQL_ROOT_PASSWORD} << EOF
          $(< change_master_to.sql.orig),
            MASTER_HOST='mysql-0.mysql.mysql',
            MASTER_USER='root',
            MASTER_PASSWORD='${MYSQL_ROOT_PASSWORD}',
            MASTER_CONNECT_RETRY=10;
          START SLAVE;
          EOF
          fi
          # Use ncat to listen on port 3307.
          # When a transfer request is received, it runs xtrabackup --backup to back up the MySQL data and streams it to the requester
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root --password=${MYSQL_ROOT_PASSWORD}"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - "ReadWriteOnce"
      storageClassName: local-storage
      resources:
        requests:
          storage: 3Gi

Overall, the StatefulSet has two replicas: one Master and one Slave. The init-mysql init container generates the configuration file, the clone-mysql init container transfers the initial data, and the xtrabackup sidecar container handles the replication SQL initialization and serves data to the next node.
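
The ordinal parsing in init-mysql is the crux of role assignment: the Pod's stable hostname (mysql-0, mysql-1, ...) is the only input needed to decide master vs. slave and to derive a unique server-id. You can sanity-check that logic in plain bash, outside Kubernetes:

hostname=mysql-1   # simulate a StatefulSet Pod hostname
[[ $hostname =~ -([0-9]+)$ ]] && \
  echo "ordinal=${BASH_REMATCH[1]}, server-id=$((100 + BASH_REMATCH[1]))"
# prints: ordinal=1, server-id=101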

Create StatefulSet

kubectl apply -f 07-mysql-statefulset.yaml

$ kubectl get po -n mysql
NAME      READY   STATUS    RESTARTS   AGE
mysql-0   2/2     Running   0          70s
mysql-1   0/2     Pending   0          5s

You can see that the StatefulSet starts its Pods in order: mysql-1 is only created after mysql-0 is Running and Ready. Once startup completes, both Pods are running.
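
The volumeClaimTemplates have now produced one PVC per Pod, named <template-name>-<pod-name>, and the delayed binding has been resolved: each PVC is bound to one of the local PVs from step 1 (output abridged; which PV each claim lands on is not deterministic):

kubectl -n mysql get pvc

NAME           STATUS   VOLUME               CAPACITY   STORAGECLASS
data-mysql-0   Bound    example-mysql-pv     15Gi       local-storage
data-mysql-1   Bound    example-mysql-pv-2   15Gi       local-storage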

Next, we can send some requests to the MySQL cluster and run a few SQL statements to verify that everything works.

Service validation

Verify master-slave status

kubectl -n mysql exec mysql-1 -c mysql -- bash -c "mysql -uroot -p123456 -e 'show slave status \G'"


mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: mysql-0.mysql.mysql
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqllog.000003
          Read_Master_Log_Pos: 154
               Relay_Log_File: mysql-1-relay-bin.000002
                Relay_Log_Pos: 319
        Relay_Master_Log_File: mysqllog.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 528
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 100
                  Master_UUID: 1bad4d64-6290-11ea-8376-0242ac113802
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version:

The key fields above are Slave_IO_Running and Slave_SQL_Running: both must be Yes for replication to be healthy. Next, we create a database and a table, and insert a row, through the Master container (mysql-0):

kubectl -n mysql exec mysql-0 -c mysql -- bash -c "mysql -uroot -p123456 -e 'create database test'"
kubectl -n mysql exec mysql-0 -c mysql -- bash -c "mysql -uroot -p123456 -e 'use test;create table counter(c int);'"
kubectl -n mysql exec mysql-0 -c mysql -- bash -c "mysql -uroot -p123456 -e 'use test;insert into counter values(123)'"

Then, let's check whether the Slave node has synchronized the data:

kubectl -n mysql exec mysql-1 -c mysql -- bash -c "mysql -uroot -p123456 -e 'use test;select * from counter'"  
c
123

If you see this output, master-slave synchronization is working.
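
You can also confirm the read-write split from the other direction: slave.cnf sets super-read-only, so a write attempted directly on mysql-1 should be rejected with an error along the lines of "The MySQL server is running with the --super-read-only option":

kubectl -n mysql exec mysql-1 -c mysql -- bash -c "mysql -uroot -p123456 -e 'use test;insert into counter values(456)'"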

Scale out the Slave nodes

With StatefulSet, you can scale the MySQL cluster horizontally as easily as a Deployment, for example:

kubectl -n mysql scale statefulset mysql --replicas=3

$ kubectl get po -n mysql
NAME      READY   STATUS    RESTARTS   AGE
mysql-0   2/2     Running   0          22m
mysql-1   2/2     Running   0          22m
mysql-2   2/2     Running   0          20s

A new Pod, mysql-2, has now been created. Let's verify that the newly added node has synchronized the data from the master node:

kubectl -n mysql exec mysql-2 -c mysql -- bash -c "mysql -uroot -p123456 -e 'use test;select * from counter'"  
c
123

If you see this output, the record inserted earlier can also be read from the newly created mysql-2; in other words, master-slave synchronization to the new node works, and our clone-based backup and recovery is effective.
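
Scaling back in works the same way. Note that the StatefulSet deliberately leaves the PVC (data-mysql-2) in place when the Pod is removed, so a later scale-up back to 3 replicas reuses the existing volume and its data:

kubectl -n mysql scale statefulset mysql --replicas=2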
