Contents
- 1. Introduction
- 2. MHA services
- 3. Implementation process
- 3.1 Prepare the MySQL replication environment for the experiment
- 3.2 Installation and configuration of MHA
- 3.3 Start MHA
- 3.4 Testing MHA failover
- 3.5 Provide a new slave node to repair the replication cluster
- 3.6 Perform the checks again after the new node is provided
- 3.7 Precautions for bringing a new node online, failover and recovery
1. Introduction
MHA (Master HA) is an open-source MySQL high-availability program. It provides automatic master failover for a MySQL master-slave replication architecture. When MHA detects a master node failure, it promotes the slave with the most recent data to become the new master; during this process it avoids consistency problems by obtaining the missing events from the other slave nodes. MHA also provides online master switching, that is, switching the master/slave roles on demand.
MHA is a relatively mature MySQL high-availability solution developed by yoshinorim (formerly at DeNA, now at Facebook). MHA can complete a failover within about 30 seconds and ensures data consistency as far as possible during failover. Taobao is also developing a similar product, TMHA, which currently supports one master and one slave.
2. MHA services
2.1 service role
The MHA service has two roles, MHA Manager (management node) and MHA node (data node):
MHA Manager:
It is usually deployed on a dedicated machine and manages several master/slave clusters (groups); each master/slave cluster is called an application. The Manager is used to manage and coordinate the whole cluster.
MHA node:
Runs on each MySQL server (master/slave/manager) and speeds up failover by providing scripts that parse, copy and purge logs.
It is essentially the agent that receives instructions from the management node, and it must run on every MySQL node. In short, the node program collects the binary logs generated on its server, checks whether the slave that is about to be promoted to the new master already holds and has applied the latest events, and, if some events have not reached it, fetches and applies them locally before the slave is promoted to master.
Within each replication group, and between the Manager and every node, password-free SSH interconnection is required; only then can the Manager connect to the nodes and perform the master/slave switch when the master fails.
2.2 tools provided
MHA provides many tools and programs; the common ones are as follows:
Manager node:
masterha_check_ssh: checks the SSH environment that MHA depends on;
masterha_check_repl: checks the MySQL replication environment;
masterha_manager: the main MHA service program;
masterha_check_status: checks the running status of MHA;
masterha_master_monitor: monitors the availability of the MySQL master node;
masterha_master_switch: master node switching tool;
masterha_conf_host: adds or deletes a configured node;
masterha_stop: stops the MHA service.
Node: (these tools are usually triggered by MHA Manager scripts and require no manual operation)
save_binary_logs: saves and copies the binary logs of the master;
apply_diff_relay_logs: identifies differential relay log events and applies them to the other slaves;
purge_relay_logs: clears relay logs without blocking the SQL thread (a cron usage sketch follows this list);
Custom extension:
secondary_check_script: check the availability of the master through multiple network routes;
master_ip_failover_script: update the master IP used by the application;
report_script: send report;
init_conf_load_script: load initial configuration parameters;
master_ip_online_change_script: updates the master IP during an online master switch.
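Since relay_log_purge is disabled on the slaves later in this setup, relay logs accumulate over time and are normally purged periodically with purge_relay_logs. Below is a minimal sketch of a cron entry for that; the credentials, working directory and log path are assumptions to adapt to your own environment, and the options should be verified against your MHA version:
# hypothetical crontab entry on each slave node (crontab -e); adjust user, password and paths; the binary path may differ
# purge_relay_logs hard-links the relay logs into --workdir before removing them, so the SQL thread is not blocked
0 4 * * * /usr/bin/purge_relay_logs --user=mhaadmin --password=mhapass --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1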
2.3 working principle
The working principle of MHA is summarized as follows:
(1) Save the binary log events from the crashed master;
(2) Identify the slave with the most recent data;
(3) Apply the differential relay logs to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Point the other slaves at the new master and resume replication (a sketch of what this step boils down to follows this list).
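For illustration only, step (6) amounts to MHA running something equivalent to the following on each remaining slave. The host, credentials, log file and position below are placeholders taken from the environment built later in this article; MHA issues these statements automatically during failover:
MariaDB [(none)]> change master to master_host='192.168.37.133',
    -> master_user='slave',
    -> master_password='keer',
    -> master_log_file='mysql-bin.000006',
    -> master_log_pos=925;
MariaDB [(none)]> start slave;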
3. Implementation process
3.1 Prepare the MySQL replication environment for the experiment
3.1.1 relevant configuration
MHA has specific requirements for the MySQL replication environment. For example, every node must have the binary log and relay log enabled, and every slave node must explicitly enable the read_only attribute and disable the relay_log_purge function. These settings are described in advance here.
There are four nodes in the experimental environment (all running CentOS 7.3), with roles assigned as follows: node1 (192.168.37.111) is the MHA Manager, node2 (192.168.37.122) is the master, and node3 (192.168.37.133) and node4 (192.168.37.144) are the slaves.
To make later operations easier, we add the following entries to the /etc/hosts file on every node:
192.168.37.111 node1.keer.com node1
192.168.37.122 node2.keer.com node2
192.168.37.133 node3.keer.com node3
192.168.37.144 node4.keer.com node4
In this way we can reach each node by hostname, which makes the key-based SSH access and the later operations much more convenient.
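A quick way to confirm that every hostname resolves as expected is to query each entry with getent, which reads /etc/hosts (just an optional sanity check):
[root@manager ~]# for h in node1 node2 node3 node4; do getent hosts $h; done
192.168.37.111  node1.keer.com node1
192.168.37.122  node2.keer.com node2
192.168.37.133  node3.keer.com node3
192.168.37.144  node4.keer.com node4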
This step is complete.
3.1.2 configuration of the initial master node
We need to modify the database configuration file of the master to initialize it:
[root@master ~]# vim /etc/my.cnf
[mysqld]
server-id = 1            # the id of each node in the replication cluster must be unique
log-bin = master-log     # enable binary logging
relay-log = relay-log    # enable the relay log
skip_name_resolve        # turn off name resolution (optional)
[root@master ~]# systemctl restart mariadb
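After the restart, the settings can be confirmed with a few standard SHOW VARIABLES queries (a sanity check only):
MariaDB [(none)]> show variables like 'server_id';
MariaDB [(none)]> show variables like 'log_bin';
MariaDB [(none)]> show variables like 'relay_log';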
This step is complete.
3.1.3 configuration of all slave nodes
We modify the database configuration files of the two slave machines. Both machines do the following:
[root@slave1 ~]# vim /etc/my.cnf
[mysqld]
server-id = 2            # the id of each node in the replication cluster must be unique
relay-log = relay-log    # enable the relay log
log-bin = master-log     # enable binary logging
read_only = ON           # enable the read-only attribute
relay_log_purge = 0      # do not automatically purge relay logs that are no longer needed
skip_name_resolve        # turn off name resolution (optional)
log_slave_updates = 1    # write replicated updates to this slave's own binary log
[root@slave1 ~]# systemctl restart mariadb
[root@slave2 ~]# vim /etc/my.cnf
[mysqld]
server-id = 3            # the id of each node in the replication cluster must be unique
relay-log = relay-log    # enable the relay log
log-bin = master-log     # enable binary logging
read_only = ON           # enable the read-only attribute
relay_log_purge = 0      # do not automatically purge relay logs that are no longer needed
skip_name_resolve        # turn off name resolution (optional)
log_slave_updates = 1    # write replicated updates to this slave's own binary log
[root@slave2 ~]# systemctl restart mariadb
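Likewise, on each slave the read-only and relay-log settings can be verified in one query (a sanity check only):
MariaDB [(none)]> show variables where variable_name in ('read_only','relay_log_purge','log_slave_updates');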
This step is complete.
3.1.4 configure a master-slave replication architecture
Only the commands are given below; for the detailed explanation and process, see my previous blog post, "Practical project: implementation of a MySQL master-slave architecture".
On the master node:
MariaDB [(none)]> grant replication slave,replication client on *.* to 'slave'@'192.168.%.%' identified by 'keer';
MariaDB [(none)]> show master status;
On the slave nodes:
MariaDB [(none)]> change master to master_host='192.168.37.122',
    -> master_user='slave',
    -> master_password='keer',
    -> master_log_file='mysql-bin.000001',
    -> master_log_pos=415;
MariaDB [(none)]> start slave;
MariaDB [(none)]> show slave status\G
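To confirm that replication really flows end to end, a simple test can be done; the database name mha_test below is just an example:
MariaDB [(none)]> create database mha_test;          -- on the master
MariaDB [(none)]> show databases like 'mha_test';    -- on each slave; mha_test should appear
MariaDB [(none)]> drop database mha_test;            -- back on the master, to clean up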
This step is complete.
3.2 Installation and configuration of MHA
3.2.1 authorization on master
On all MySQL nodes, a user with administrative privileges needs to be authorized to connect remotely from the other machines in the local network. Since the grant is replicated to the slaves, it is enough to run an SQL statement like the following on the master node.
MariaDB [(none)]> grant all on *.* to 'mhaadmin'@'192.168.%.%' identified by 'mhapass';
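Assuming the mysql client is installed on the Manager node, you can verify from there that the management account works; this is only an optional sanity check:
[root@manager ~]# mysql -h 192.168.37.122 -u mhaadmin -pmhapass -e 'select version();'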
This step is complete.
3.2.2 prepare the ssh mutual-trust environment
All nodes in the MHA cluster need to trust each other over SSH in order to support remote control and data management. To keep things simple, after the Manager node generates a key pair and authorizes it to log in to the local host, the private key file and the authorized_keys file can be copied to all the remaining nodes.
The following operations are performed on all nodes:
[root@manager ~]# ssh-keygen -t rsa
[root@manager ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node1
After the above operations are performed on all four machines, we can see the following files on the manager machine:
[root@manager ~]# cd .ssh/
[root@manager .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[root@manager .ssh]# cat authorized_keys
The public keys of all four machines are already in authorized_keys. Now we only need to send this file to the other three machines to enable password-free SSH between all of them:
[root@manager .ssh]# scp authorized_keys root@node2:~/.ssh/
[root@manager .ssh]# scp authorized_keys root@node3:~/.ssh/
[root@manager .ssh]# scp authorized_keys root@node4:~/.ssh/
Of course, we can also test on each machine whether ssh still asks for a password.
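For example, a small loop like the following, run from any node, should print every host's name without ever prompting for a password:
[root@manager ~]# for h in node1 node2 node3 node4; do ssh $h hostname; done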
This step is complete.
3.2.3 installation of MHA package
In this step, the Manager node needs to install an additional package. The specific contents to be installed are as follows:
All four nodes need to install: mha4mysql-node-0.56-0.el6.noarch.rpm
The Manager node needs to be installed separately: mha4mysql-manager-0.56-0.el6.noarch.rpm
The packages to be installed have been uploaded to a Baidu cloud disk (extraction code: mkcr); you need to download them yourself~
We use the rz command to upload them respectively, and then use yum to install them.
[root@manager ~]# rz
[root@manager ~]# ls
anaconda-ks.cfg  initial-setup-ks.cfg  mha4mysql-manager-0.56-0.el6.noarch.rpm  mha4mysql-node-0.56-0.el6.noarch.rpm
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
[root@manager ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[root@manager ~]# yum install -y mha4mysql-manager-0.56-0.el6.noarch.rpm
The other machines are installed in the same way, so the commands are not repeated one by one.
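If you want to double-check that the packages are in place, rpm can be queried directly (on the other machines only the node package will be listed):
[root@manager ~]# rpm -q mha4mysql-node mha4mysql-manager
mha4mysql-node-0.56-0.el6.noarch
mha4mysql-manager-0.56-0.el6.noarch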
This step is complete.
3.2.4 initialize MHA and configure it
The Manager node needs a dedicated configuration file for each monitored master/slave cluster, and all master/slave clusters can also share a global configuration. The global configuration file defaults to /etc/masterha_default.cnf and is optional. If you only monitor one group of master/slave clusters, you can provide the default parameters of each server directly in the application's own configuration file; the path of each application's configuration file is up to you. See the next step for details.
3.2.5 define the MHA management configuration file
The management user created earlier is synchronized from the master to the other nodes automatically through replication. On the Manager node, create the working directory and configuration file for MHA:
mkdir /etc/mha_master
vim /etc/mha_master/mha.cnf
The contents of the configuration file are as follows:
[server default]                          # settings that apply to server1, 2 and 3
user=mhaadmin                             # MHA management user
password=mhapass                          # MHA management password
manager_workdir=/etc/mha_master/app1      # working directory of the manager itself
manager_log=/etc/mha_master/manager.log   # log file of the manager itself
remote_workdir=/mydata/mha_master/app1    # working directory on each remote host
ssh_user=root                             # user for ssh key-based authentication
repl_user=slave                           # replication user name
repl_password=keer                        # replication password
ping_interval=1                           # ping interval in seconds
[server1]                                 # node2
hostname=192.168.37.122                   # node2 host address
ssh_port=22                               # node2 ssh port
candidate_master=1                        # whether this node may be promoted to master in the future
[server2]                                 # node3
hostname=192.168.37.133
ssh_port=22
candidate_master=1
[server3]                                 # node4
hostname=192.168.37.144
ssh_port=22
candidate_master=1
This step is complete.
3.2.6 check the four nodes
1) Check whether the ssh mutual trust communication configuration between nodes is ok
We enter the following command on the Manager machine to detect:
[root@manager ~]# masterha_check_ssh -conf=/etc/mha_master/mha.cnf
If the last line shows "[info] All SSH connection tests passed successfully.", the SSH configuration is OK.
2) Check whether the connection configuration parameters of the managed MySQL replication cluster are OK
[root@manager ~]# masterha_check_repl -conf=/etc/mha_master/mha.cnf
We find that this check fails. The likely reason is that the replication account does not exist on the slave nodes: in this architecture any slave may become the master, so the replication account must exist on every node.
Therefore, we need to perform the following operations on the master node again:
MariaDB [(none)]> grant replication slave,replication client on *.* to 'slave'@'192.168.%.%' identified by 'keer';
MariaDB [(none)]> flush privileges;
After this operation, we run the detection command again:
[root@manager ~]# masterha_check_repl -conf=/etc/mha_master/mha.cnf
Thu Nov 23 09:07:08 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Nov 23 09:07:08 2017 - [info] Reading application default configuration from /etc/mha_master/mha.cnf..
Thu Nov 23 09:07:08 2017 - [info] Reading server configuration from /etc/mha_master/mha.cnf..
......
MySQL Replication Health is OK.
It can be seen that the check is now OK.
This step is complete.
3.3 Start MHA
We execute the following command on the manager node to start MHA:
[root@manager ~]# nohup masterha_manager -conf=/etc/mha_master/mha.cnf &> /etc/mha_master/manager.log &
[1] 7598
After successful startup, let's check the status of the master node:
[root@manager ~]# masterha_check_status -conf=/etc/mha_master/mha.cnf
mha (pid:7598) is running(0:PING_OK), master:192.168.37.122
In the above information, "mha (pid:7598) is running(0:PING_OK)" means that the MHA service is running OK. Otherwise, it will be displayed as "mha is stopped(1:NOT_RUNNING)."
If we want to stop MHA, we need to use the stop command:
[root@manager ~]# masterha_stop -conf=/etc/mha_master/mha.cnf
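As a side note, masterha_manager is often started with a couple of extra options in practice; the two below exist in MHA 0.56, but verify them against your version's --help before relying on them:
# --remove_dead_master_conf : remove the failed master's section from mha.cnf after a failover
# --ignore_last_failover    : skip the safety check that refuses a new failover too soon after the previous one
[root@manager ~]# nohup masterha_manager -conf=/etc/mha_master/mha.cnf --remove_dead_master_conf --ignore_last_failover &> /etc/mha_master/manager.log &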
3.4 Testing MHA failover
3.4.1 kill the mariadb service on the master node to simulate a crash of the master
[root@master ~]# killall -9 mysqld mysqld_safe
[root@master ~]# rm -rf /var/lib/mysql/*
3.4.2 viewing logs on the manager node
Let's check the log:
[root@manager ~]# tail -200 /etc/mha_master/manager.log
......
Thu Nov 23 09:17:19 2017 - [info] Master failover to 192.168.37.133(192.168.37.133:3306) completed successfully.
This shows that the manager detected the failure of node 192.168.37.122 and then automatically performed a failover, promoting 192.168.37.133 to the new master.
Note that after the failover completes, the manager stops automatically, and running the masterha_check_status command at this point will return an error, as shown below:
[root@manager ~]# masterha_check_status -conf=/etc/mha_master/mha.cnf
mha is stopped(2:NOT_RUNNING).
3.5 Provide a new slave node to repair the replication cluster
After the original master fails, a new MySQL node has to be prepared. Restore the data onto it from a backup taken on the new master, then configure it as a slave of the new master. Note that if it really is a brand-new node, its IP address should be configured to be the IP of the original master; otherwise you need to modify the corresponding IP address in mha.cnf. Then start the manager again and check its status again.
We will reuse the master that was just brought down as the newly added machine and restore the database on it:
The original slave1 has become the new master, so we take a full backup on it and then send the backup to the newly added machine:
[root@slave1 ~]# mkdir /backup
[root@slave1 ~]# mysqldump --all-databases > /backup/mysql-backup-`date +%F-%T`-all.sql
[root@slave1 ~]# scp /backup/mysql-backup-2017-11-23-09\:57\:09-all.sql root@node2:~
Then perform data recovery on node2 node:
[root@master ~]# mysql < mysql-backup-2017-11-23-09\:57\:09-all.sql
The next step is to configure replication. As usual, we first check the binary log file and position on the current master.
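For example, on the new master (192.168.37.133) the standard query is shown below; the file name and position it reports will of course differ in your environment:
MariaDB [(none)]> show master status;   -- note the File and Position columns
With those values, we then make the following settings on node2: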
MariaDB [(none)]> change master to master_host='192.168.37.133',
    -> master_user='slave',
    -> master_password='keer',
    -> master_log_file='mysql-bin.000006',
    -> master_log_pos=925;
MariaDB [(none)]> start slave;
MariaDB [(none)]> show slave status\G
          Slave_IO_State: Waiting for master to send event
             Master_Host: 192.168.37.133
             Master_User: slave
             Master_Port: 3306
           Connect_Retry: 60
         Master_Log_File: mysql-bin.000006
     Read_Master_Log_Pos: 925
          Relay_Log_File: mysql-relay-bin.000002
           Relay_Log_Pos: 529
   Relay_Master_Log_File: mysql-bin.000006
        Slave_IO_Running: Yes
       Slave_SQL_Running: Yes
......
It can be seen that our master-slave has been configured.
This step is complete.
3.6 Perform the checks again after the new node is provided
Let's check the status again:
[root@manager ~]# masterha_check_repl -conf=/etc/mha_master/mha.cnf
If an error is reported, grant the privileges again as described above. If there is no problem, start the manager; note that this time the output should be logged to a file:
[root@manager ~]# masterha_manager -conf=/etc/mha_master/mha.cnf > /etc/mha_master/manager.log 2>&1 &
[1] 10012
After successful startup, let's check the status of the master node:
[root@manager ~]# masterha_check_status -conf=/etc/mha_master/mha.cnf
mha (pid:9561) is running(0:PING_OK), master:192.168.37.133
Our service has been successfully resumed.
This step ends.
3.7 Precautions for bringing a new node online, failover and recovery
1) In a production environment, when the master node goes down, you must take a backup on the slave that has been promoted to master, use that backup file to manually rebuild the failed master as a new slave, and specify the binary log file and position from which replication should start.
2) After each automatic failover, the manager cannot be started while the replication health check fails; you must manually repair the failed master node first, unless you change the configuration file to remove it.
3) After the failed master has been manually rebuilt as a slave, run the status check command again:
[root@manager ~]# masterha_check_status -conf=/etc/mha_master/mha.cnf
mha (pid:9561) is running(0:PING_OK), master:192.168.37.133
4) Start the manager again and the recovery is complete:
[root@manager ~]# masterha_manager --conf=/etc/mha_master/mha.cnf
That is all; our experiment has been completed successfully.
If you have any questions or suggestions, feel free to discuss them with me ~ (~  ̄▽  ̄) ~
Author: keerya (Ke Er)
source: http://www.cnblogs.com/keerya/
The copyright of this article belongs to the author and cnblogs. Reprinting is welcome, but this statement must be retained and a clearly visible link to the original article must be provided on the page; otherwise the author reserves the right to pursue legal responsibility.
Writing these posts is not easy; please respect the author's work~