MHA high availability cluster deployment and failover

preface

If there is a problem with the master server in the master-slave server, what should I do without affecting the production environment? We need to apply MHA. Let's understand MHA.

1, MHA overview

MHA (MasterHigh Availability) is a set of excellent software for failover and master-slave replication in MySQL high availability environment.
During MySQL failover, MHA can automatically complete the failover operation within 0-30 seconds.

2, Composition of MHA

MHA Node
The MHA Node runs on each MySQL server.
MHA Manager (management node)
MHA Manager can be deployed on an independent machine to manage multiple master slave clusters; It can also be deployed on a slave node
The MHA Manager will periodically probe the master node in the cluster. When the master fails, it can automatically promote the slave of the latest data to the new master, and then point all other slave to the new master again. The entire failover process is completely transparent to the application.

3, Characteristics of MHA

During automatic failover, MHA tries to save binary logs from the down primary server to ensure that data is not lost to the greatest extent
Using semi synchronous replication can greatly reduce the risk of data loss
At present, MHA supports a master-slave architecture with at least three services, that is, one master and two slaves

4, Composition and principle

The software consists of two parts: MHA Manager (management node) and MHA Node (data node)

MHA Manager can be deployed on a separate machine to manage multiple master slave clusters, or on a slave node
The MHA Node runs on each MySQL server. The MHA Manager will regularly detect the master node in the cluster. When the master fails, it can automatically promote the slave of the latest data to the new master, and then point all other slave to the new master again

4, Experimental steps

1. Turn off the firewall

systemctl disable firewalld setenforce 0

2. Modify the host names of Master, Slave1 and Slave2

hostnamectl set-hostname mysql1 su - hostnamectl set-hostname mysql2 su - hostnamectl set-hostname mysql3 su -

3.3 modify the Master configuration file / etc/my.cnf of Master, Slave1 and Slave2

Master:

vim /etc/my.cnf [mysqld] server-id = 20 #The id can be customized. The three hosts cannot be the same log_bin = master-bin log-slave-updates = true #Enable binary log and allow binary files to be copied and updated from the server systemctl restart mysqld #Restart the service for the configuration to take effect ln -s /usr/local/mysql/bin/mysql /usr/sbin/ ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/ #Create a soft link between mysql command and mysql log file to facilitate system identification

Slave1,Slave2:

vim /etc/my.cnf server-id = 30 #Server id = 40 (the server IDs of three servers cannot be the same) log_bin = master-bin relay-log = relay-log-bin relay-log-index = slave-relay-bin.index #The slave server should also enable binary logs, because it may also become the master server systemctl restart mysqld ln -s /usr/local/mysql/bin/mysql /usr/sbin/ ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

4. Configure MySQL with one master and two slaves

Master,Slave1,Slave2:

mysql -uroot -p123123 #Log in to MySQL database grant replication slave on *.* to 'myslave'@'192.168.126.%' identified by '123123'; #Synchronize use from database grant all privileges on *.* to 'mha'@'192.168.126.%' identified by 'manager'; #manager usage grant all privileges on *.* to 'mha'@'mysql1' identified by 'manager'; grant all privileges on *.* to 'mha'@'mysql2' identified by 'manager'; grant all privileges on *.* to 'mha'@'mysql3' identified by 'manager'; #Prevent the slave library from connecting to the master library through the host name flush privileges;

Master:

show master status; #Check the binary file name and offset in the Master node. Remember, it is useful later

Slave1,Slave2:

change master to master_host='192.168.126.12',master_user='myslave',master_password='123123',master_log_file='master-bin.000001',master_log_pos=xxxx; #Specify the master server IP, user name and password, log name and offset start slave; #Start synchronization show slave status\G" | awk '/Running:/ #View parameter information related to slave threads #Ensure that both IO and SQL threads are Yes, which means the synchronization is normal mysql -e "set global read_only=1;" #Both slave libraries must be set to read-only mode

5 verify master-slave replication

Master:

create database test; show databases; #Create a library for testing

Slave1,Slave2:

show databases; #Query and test whether to synchronize

6 installation of MHA

MHA dependent environments are installed on all servers. First, install the epel source

yum list yum install epel-release --nogpgcheck -y yum install -y perl-DBD-MySQL \ perl-Config-Tiny \ perl-Log-Dispatch \ perl-Parallel-ForkManager \ perl-ExtUtils-CBuilder \ perl-ExtUtils-MakeMaker \ perl-CPAN

cd /opt #Put node in opt directory tar zxvf mha4mysql-node-0.57.tar.gz cd mha4mysql-node-0.57 perl Makefile.PL make && make install #The version of each operating system is different. Here, CentOS 7.4 must select version 0.57 #You must first install the node component on all servers, and finally install the manager component on the MHA manager node, because manager depends on the node component

Install the manager component on the MHA manager node

cd /opt # manager to opt directory tar zxvf mha4mysql-manager-0.57.tar.gz cd mha4mysql-manager-0.57 perl Makefile.PL make && make install

7 configure password less authentication on all servers

Configure password less authentication to all database nodes on the manager node

ssh-keygen -t rsa #Create a key pair for the current user, "- t" specifies the encryption algorithm "rsa", and then press enter to confirm ssh-copy-id 192.168.126.12 ssh-copy-id 192.168.126.13 ssh-copy-id 192.168.126.14 #Import the public key into all MySQL servers, and then enter yes and MySQL database login password

8 configure MHA on the Manager node

On the manager node, copy the relevant scripts to the / usr/local/bin directory

cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin #After copying, there will be four executables ll /usr/local/bin/scripts/

Copy the above script of VIP management during automatic switching to / usr/local/bin directory. Here, use master_ip_failover scripts to manage VIPs and failover

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

The modified contents are as follows: (delete the original contents, directly copy and modify vip related parameters)

vim /usr/local/bin/master_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); #############################Add content section######################################### my $vip = '192.168.126.200'; #Specify the address of the vip my $brdc = '192.168.126.255'; #Specify the broadcast address of the vip my $ifdev = 'ens33'; #Specify the network card bound by vip my $key = '1'; #Specifies the serial number of the virtual network card bound by the vip my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; #Represents that the value of this variable is ifconfig ens33:1 192.168.126.200 my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down"; #Represents that the value of this variable is ifconfig ens33:1 192.168.126.200 down my $exit_code = 0; #Specifies that the exit status code is 0 #my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;"; #my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key"; ################################################################################## GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; exit 0; } else { &usage(); exit 1; } } sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; } ## A simple system call that disable the VIP on the old_master sub stop_vip() { `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; }

Create the MHA software directory and copy the configuration file. Here, use the app1.cnf configuration file to manage the mysql node server

mkdir /etc/masterha cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha #Delete the original content, directly copy and modify the IP address of the node server vim /etc/masterha/app1.cnf [server default] manager_log=/var/log/masterha/app1/manager.log manager_workdir=/var/log/masterha/app1 master_binlog_dir=/usr/local/mysql/data master_ip_failover_script=/usr/local/bin/master_ip_failover master_ip_online_change_script=/usr/local/bin/master_ip_online_change password=manager ping_interval=1 remote_workdir=/tmp repl_password=123123 repl_user=myslave secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.126.13 -s 192.168.126.14 shutdown_script="" ssh_user=root user=mha [server1] hostname=192.168.126.12 port=3306 [server2] candidate_master=1 check_repl_delay=0 hostname=192.168.126.13 port=3306 [server3] hostname=192.168.126.14 port=3306

The detailed explanation is as follows:

[server default] manager_log=/var/log/masterha/app1/manager.log #manager log manager_workdir=/var/log/masterha/app1.log #manager working directory master_binlog_dir=/usr/local/mysql/data/ #The location where the master saves binlog. The path here should be consistent with the path of binlog configured in the master so that MHA can find it master_ip_failover_script=/usr/local/bin/master_ip_failover #Set the switching script for automatic failover, that is, the script above master_ip_online_change_script=/usr/local/bin/master_ip_online_change #Set the switching script for manual switching password=manager #Set the password of the root user in mysql, which is the password of the monitoring user created in the previous article ping_interval=1 #Set the monitoring master database. The interval between sending ping packets is 1 second, which is 3 seconds by default. When there is no response after three attempts, the failover will be performed automatically remote_workdir=/tmp #Set the save location of binlog when remote mysql switches repl_password=123123 #Set the password of the replication user repl_user=myslave #Set the user of the replication user report_script=/usr/local/send_report #Set the script of the alarm sent after switching secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.126.13 -s 192.168.126.14 #Specifies the IP address of the slave server to check shutdown_script="" #Set the script to close the fault host after the fault occurs (the main function of this script is to close the host to prevent brain crack, which is not used here) ssh_user=root #Set ssh login user name user=mha #Set monitoring user root [server1] hostname=192.168.126.12 port=3306 [server2] hostname=192.168.126.13 port=3306 candidate_master=1 #Set as the candidate master. After setting this parameter, the slave database will be promoted to the master database after the master-slave switch occurs, even if the master database is not the latest slave in the cluster check_repl_delay=0 #By default, if a slave lags behind the master by more than 100M of relay logs, MHA will not select the slave as a new master, because the recovery of the slave takes a long time; By setting check_repl_delay=0, MHA triggers the switch. When selecting a new master, the replication delay will be ignored. This parameter is set to candidate_ The host with master = 1 is very useful because the candidate master must be a new master in the process of switching [server3] hostname=192.168.126.14 port=3306

The detailed explanation is as follows:

[server default] manager_log=/var/log/masterha/app1/manager.log #manager log manager_workdir=/var/log/masterha/app1.log #manager working directory master_binlog_dir=/usr/local/mysql/data/ #The location where the master saves binlog. The path here should be consistent with the path of binlog configured in the master so that MHA can find it master_ip_failover_script=/usr/local/bin/master_ip_failover #Set the switching script for automatic failover, that is, the script above master_ip_online_change_script=/usr/local/bin/master_ip_online_change #Set the switching script for manual switching password=manager #Set the password of the root user in mysql, which is the password of the monitoring user created in the previous article ping_interval=1 #Set the monitoring master database. The interval between sending ping packets is 1 second, which is 3 seconds by default. When there is no response after three attempts, the failover will be performed automatically remote_workdir=/tmp #Set the save location of binlog when remote mysql switches repl_password=123123 #Set the password of the replication user repl_user=myslave #Set the user of the replication user report_script=/usr/local/send_report #Set the script of the alarm sent after switching secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.126.13 -s 192.168.126.14 #Specifies the IP address of the slave server to check shutdown_script="" #Set the script to close the fault host after the fault occurs (the main function of this script is to close the host to prevent brain crack, which is not used here) ssh_user=root #Set ssh login user name user=mha #Set monitoring user root [server1] hostname=192.168.126.12 port=3306 [server2] hostname=192.168.126.13 port=3306 candidate_master=1 #Set as the candidate master. After setting this parameter, the slave database will be promoted to the master database after the master-slave switch occurs, even if the master database is not the latest slave in the cluster check_repl_delay=0 #By default, if a slave lags behind the master by more than 100M of relay logs, MHA will not select the slave as a new master, because the recovery of the slave takes a long time; By setting check_repl_delay=0, MHA triggers the switch. When selecting a new master, the replication delay will be ignored. This parameter is set to candidate_ The host with master = 1 is very useful because the candidate master must be a new master in the process of switching [server3] hostname=192.168.126.14 port=3306

The detailed explanation is as follows:

[server default] manager_log=/var/log/masterha/app1/manager.log #manager log manager_workdir=/var/log/masterha/app1.log #manager working directory master_binlog_dir=/usr/local/mysql/data/ #The location where the master saves binlog. The path here should be consistent with the path of binlog configured in the master so that MHA can find it master_ip_failover_script=/usr/local/bin/master_ip_failover #Set the switching script for automatic failover, that is, the script above master_ip_online_change_script=/usr/local/bin/master_ip_online_change #Set the switching script for manual switching password=manager #Set the password of the root user in mysql, which is the password of the monitoring user created in the previous article ping_interval=1 #Set the monitoring master database. The interval between sending ping packets is 1 second, which is 3 seconds by default. When there is no response after three attempts, the failover will be performed automatically remote_workdir=/tmp #Set the save location of binlog when remote mysql switches repl_password=123123 #Set the password of the replication user repl_user=myslave #Set the user of the replication user report_script=/usr/local/send_report #Set the script of the alarm sent after switching secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.126.13 -s 192.168.126.14 #Specifies the IP address of the slave server to check shutdown_script="" #Set the script to close the fault host after the fault occurs (the main function of this script is to close the host to prevent brain crack, which is not used here) ssh_user=root #Set ssh login user name user=mha #Set monitoring user root [server1] hostname=192.168.126.12 port=3306 [server2] hostname=192.168.126.13 port=3306 candidate_master=1 #Set as the candidate master. After setting this parameter, the slave database will be promoted to the master database after the master-slave switch occurs, even if the master database is not the latest slave in the cluster check_repl_delay=0 #By default, if a slave lags behind the master by more than 100M of relay logs, MHA will not select the slave as a new master, because the recovery of the slave takes a long time; By setting check_repl_delay=0, MHA triggers the switch. When selecting a new master, the replication delay will be ignored. This parameter is set to candidate_ The host with master = 1 is very useful because the candidate master must be a new master in the process of switching [server3] hostname=192.168.126.14 port=3306

9 for the first configuration, you need to manually turn on the virtual IP on the Master node

ifconfig ens33:1 192.168.126.200/24 ifconfig #Check it out

10. Test SSH password less authentication on the Manager node. If it is normal, it will output successfully

masterha_check_ssh -conf=/etc/masterha/app1.cnf

11. Test the master-slave connection of MySQL on the Manager node, and the words "MySQL Replication Health is OK" appear at the end, indicating that it is normal

masterha_check_repl -conf=/etc/masterha/app1.cnf

12 start MHA on the Manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 & #--remove_dead_master_conf: this parameter indicates that the ip address of the old master database will be removed from the configuration file after the master-slave switch #--manger_log: log storage location #--ignore_last_failover: by default, if MHA detects continuous downtime and the interval between two outages is less than 8 hours, it will not fail. The reason for this restriction is to avoid ping pong effect #This parameter means that the file generated by the last MHA triggered switch is ignored. By default, the MHA will log in the directory after the switch, that is, the log app1.failover.complete file set above. If the file is found in the directory during the next switch, the switch is not allowed to be triggered unless the file is deleted after the first switch

13. By viewing the MHA status, you can see that the current Master is a Master host

masterha_check_status --conf=/etc/masterha/app1.cnf

14. Check the MHA log to see that the current Master is the Master host

cat /var/log/masterha/app1/manager.log | grep "current master"

15 check whether the VIP address 192.168.126.200 of the Master exists. This VIP address will not disappear because the Manager node stops MHA service

ifconfig #To shut down the manager service, you can use the following command: masterha_stop --conf=/etc/masterha/app1.cnf Or it can be used directly kill process ID Close by

5. Fault simulation

Monitor observation logging on the Manager node

tail -f /var/log/masterha/app1/manager.log

Stop the MySQL service on the Master node Master

systemctl stop mysqld

After normal automatic switching once, MHA process will exit; HMA will automatically modify the content of app1.cnf file, delete the down mysql1 node, and check whether mysql2 takes over the VIP

ifconfig

6. Algorithm of failover alternative master database

Generally, the slave library is judged from the (position/GTID). The data is different. It is closest to the slave of the master and becomes an alternative master
If the data is consistent, select an alternative master database according to the order of the configuration file
Set a weight (candidate_master=1), and force the alternative master to be specified according to the weight
By default, if a slave lags behind the master's 100m relay logs, even if it has a weight, it will become invalid
If check_ repl_ If delay = 0, even if it lags behind many logs, it is forced to select it as the alternative primary log

7. Troubleshooting steps

1 repair MySQL

systemctl restart mysqld

2. Repair master-slave

View binary files and synchronization points on the current master database server Slave1(192.168.126.13)

show master status;

Perform synchronization on the original master database server (192.168.126.12)

change master to master_host='192.168.126.13',master_user='myslave',master_password='123123',master_log_file='master-bin.000001',master_log_pos=xxxx; start slave; show slave status\G | awk '/Running:/'

3. Modify the configuration file app1.cnf on the Manager node (add this record again, because it will disappear automatically when it detects failure)

vim /etc/masterha/app1.cnf ...... secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.126.12 -s 192.168.126.14 ...... [server1] hostname=192.168.126.13 port=3306 [server2] candidate_master=1 check_repl_delay=0 hostname=192.168.126.12 port=3306 [server3] hostname=192.168.126.14 port=3306

4 start MHA on the Manager node`

masterha_stop --conf=/etc/masterha/app1.cnf #Restart MHA nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 & masterha_check_status --conf=/etc/masterha/app1.cnf #Check the MHA status again. You can see that the current master is Slave1

MHA high availability cluster deployment and failover

1. Turn off the firewall

2. Modify the host names of Master, Slave1 and Slave2

3.3 modify the Master configuration file / etc/my.cnf of Master, Slave1 and Slave2

4. Configure MySQL with one master and two slaves

5 verify master-slave replication

6 installation of MHA

7 configure password less authentication on all servers

8 configure MHA on the Manager node

9 for the first configuration, you need to manually turn on the virtual IP on the Master node

10. Test SSH password less authentication on the Manager node. If it is normal, it will output successfully

11. Test the master-slave connection of MySQL on the Manager node, and the words "MySQL Replication Health is OK" appear at the end, indicating that it is normal

12 start MHA on the Manager node

13. By viewing the MHA status, you can see that the current Master is a Master host

14. Check the MHA log to see that the current Master is the Master host

15 check whether the VIP address 192.168.126.200 of the Master exists. This VIP address will not disappear because the Manager node stops MHA service

1 repair MySQL

2. Repair master-slave

3. Modify the configuration file app1.cnf on the Manager node (add this record again, because it will disappear automatically when it detects failure)

4 start MHA on the Manager node`

5 November 2021, 19:48 | Views: 2004

Add new comment

0 comments