Linux CentOS 7.5: building a highly available Hadoop distributed cluster environment

1. Linux environment preparation

1.1 turn off the firewall (execute on all three virtual machines)

firewall-cmd --state   #View firewall status
 
systemctl start firewalld.service   #Turn on the firewall
 
systemctl stop firewalld.service     #Turn off firewall
 
systemctl disable firewalld.service  #Disable the firewall at boot

1.2 configure a static IP address (execute on all three virtual machines)

Attached: configuring a static IP for a Linux virtual machine

[root@node01 ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33

Full content:

TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="a5a7540d-fafb-47c8-bd59-70f1f349462e"
DEVICE="ens33"
ONBOOT="yes"

IPADDR="192.168.24.137"
GATEWAY="192.168.24.2"
NETMASK="255.255.255.0"
DNS1="8.8.8.8"

Note:

Here, ONBOOT is set to yes and BOOTPROTO is changed to static (switching from automatic allocation to a static IP), and then the static IP, gateway, subnet mask, and DNS are configured. The other contents are the same on all three virtual machines; the IPADDR values are allocated as 137-139.

Question:

At the beginning, my gateway was set to 192.168.24.1. As a result, no matter how often I restarted the virtual machine and the network card, I still could not ping 8.8.8.8 or Baidu.

Solution:

In VMware, open Edit -> Virtual Network Editor. Select your own virtual network card and check whether its subnet address is on the same network segment as the IP you set, then click NAT Settings.

There you can see that the subnet mask is 255.255.255.0 and the gateway IP is 192.168.24.2, not the 192.168.24.1 I had assumed. So I edited /etc/sysconfig/network-scripts/ifcfg-ens33 again, changed the gateway address to 192.168.24.2, and restarted the network card.

[root@node01 yum.repos.d]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.24.2    0.0.0.0         UG        0 0          0 ens33
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 ens33
192.168.24.0    0.0.0.0         255.255.255.0   U         0 0          0 ens33
[root@node01 yum.repos.d]# vim /etc/sysconfig/network-scripts/ifcfg-ens33
[root@node01 yum.repos.d]# systemctl restart network
[root@node01 yum.repos.d]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=128 time=32.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=128 time=32.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=128 time=31.7 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=128 time=31.7 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=128 time=31.7 ms

1.3 modify the hostname (on all three virtual machines)

Note: editing with vim /etc/sysconfig/network and setting HOSTNAME=node01 does not seem to apply to Linux CentOS 7.5. So here I use the command hostnamectl set-hostname node01, or edit /etc/hostname directly with vim.

After the modification, a reboot is needed for it to take effect; you can use the reboot command.
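For reference, a minimal sketch of the commands (run the matching command on each node with its own name; node01 shown here):

# Set the hostname on this machine
hostnamectl set-hostname node01
# Verify it (fully takes effect after a reboot)
cat /etc/hostname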

1.4 set the IP and hostname mapping (modify /etc/hosts on all three virtual machines; new part)

192.168.24.137 node01 node01.hadoop.com
192.168.24.138 node02 node02.hadoop.com
192.168.24.139 node03 node03.hadoop.com
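For reference, a minimal sketch (using the three entries above) that appends the mapping to /etc/hosts and verifies name resolution; run it on each of the three machines:

# Append the IP/hostname mapping to /etc/hosts
cat >> /etc/hosts << 'EOF'
192.168.24.137 node01 node01.hadoop.com
192.168.24.138 node02 node02.hadoop.com
192.168.24.139 node03 node03.hadoop.com
EOF
# Verify that the hostnames now resolve
ping -c 1 node02
ping -c 1 node03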

1.5 passwordless login for the three machines (configure on all three virtual machines)

Why passwordless login is needed
  - A Hadoop cluster has many nodes, and the slave nodes are generally started from the master node, so the scripts on the master node must log in to the slave nodes automatically. Without passwordless login you would have to type the password every time, which is very troublesome.
 - How passwordless SSH login works
  1. First configure the public key of node A on node B
  2. Node A requests to log in to node B
  3. Node B encrypts a random text with node A's public key
  4. Node A decrypts it with its private key and sends it back to node B
  5. Node B verifies whether the returned text is correct

Step 1: the three machines generate their public and private keys

Execute the following command on all three machines to generate the key pair: ssh-keygen -t rsa

Step 2: copy the public keys to the same machine

All three machines copy their public key to the first machine. Execute the following command on all three machines:

ssh-copy-id node01

[root@node02 ~]# ssh-copy-id node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node01 (192.168.24.137)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node01's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node01'"
and check to make sure that only the key(s) you wanted were added.

Step 3: copy the authorized_keys file of the first machine to the other machines

Copy the authorized_keys file of the first machine (192.168.24.137) to the other machines (192.168.24.138 and 192.168.24.139), using the following commands on the first machine:

scp /root/.ssh/authorized_keys node02:/root/.ssh

scp /root/.ssh/authorized_keys node03:/root/.ssh

[root@node01 ~]# scp /root/.ssh/authorized_keys node02:/root/.ssh
The authenticity of host 'node02 (192.168.24.138)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node02,192.168.24.138' (ECDSA) to the list of known hosts.
root@node02's password:
authorized_keys                                                       100%  786   719.4KB/s   00:00
[root@node01 ~]# scp /root/.ssh/authorized_keys node03:/root/.ssh
The authenticity of host 'node03 (192.168.24.139)' can't be established.
ECDSA key fingerprint is SHA256:TyZdob+Hr1ZX7WRSeep1saPljafCrfto9UgRWNoN+20.
ECDSA key fingerprint is MD5:53:64:22:86:20:19:da:51:06:f9:a1:a9:a8:96:4f:af.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node03,192.168.24.139' (ECDSA) to the list of known hosts.
root@node03's password:
authorized_keys                                                       100%  786   692.6KB/s   00:00

You can use the following commands to ssh between the three virtual machines and check that no password is asked for:

[root@node02 hadoop-2.7.5]# cd ~/.ssh
[root@node02 .ssh]# ssh node01
Last login: Thu Jun 11 10:12:27 2020 from 192.168.24.1
[root@node01 ~]# ssh node02
Last login: Thu Jun 11 14:51:58 2020 from node03
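As a quick check, a minimal sketch (assuming the passwordless setup above) that logs in to every node from the current one and prints its hostname; none of the commands should ask for a password:

for h in node01 node02 node03; do
    ssh "$h" hostname
done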

1.6 clock synchronization for the three machines (execute on all three virtual machines)

Why time synchronization is needed

- Many distributed systems are stateful. For example, when a piece of data is stored, node A records time 1 while node B records time 2; inconsistent clocks like this cause problems.
## Install ntp
[root@node03 ~]# yum install -y ntp
## Edit the crontab to add a scheduled task
[root@node03 ~]# crontab -e
no crontab for root - using an empty one
crontab: installing new crontab
## Add the following line to the file:
*/1 * * * * /usr/sbin/ntpdate ntp4.aliyun.com;
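For reference, a minimal sketch (assuming the aliyun NTP server above is reachable) to sync the clock once by hand and compare the time across the nodes:

# One-off sync against the same NTP server used in the cron job
/usr/sbin/ntpdate ntp4.aliyun.com
# Compare the current time on all three nodes (requires passwordless ssh)
for h in node01 node02 node03; do
    ssh "$h" date
done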

Note: if you encounter the following error when using yum install:

/var/run/yum.pid is locked; another program with PID 5396 is running.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  70 M RSS (514 MB VSZ)
    Started: Thu Jun 11 10:02:10 2020 - 18:48 ago
    State  : Traced/Stopped, pid: 5396
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  70 M RSS (514 MB VSZ)
    Started: Thu Jun 11 10:02:10 2020 - 18:50 ago
    State  : Traced/Stopped, pid: 5396
^Z
[1]+  Stopped                 yum install -y ntp

You can use this command to resolve:

[root@node03 ~]# rm -f /var/run/yum.pid

If you want to replace the yum source of Linux CentOS with a Chinese mirror, you can use the following commands:

Attached: Chinese mirror addresses for the CentOS yum repo

Aliyun mirror:

#backup
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
#If your centos is 5
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-5.repo
#If your centos is 6
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-6.repo
#If your centos is 7 (the version used in this guide)
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache

163 mirror:

cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS5-Base-163.repo
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS6-Base-163.repo
yum clean all
yum makecache

2. Install the JDK

2.1 distribute the installation package to the other machines

The first machine (192.168.24.137) executes the following two commands:

[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz
[root@node01 software]# java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
[root@node01 software]# scp -r  /software/jdk1.8.0_241/ node02:/software/jdk1.8.0_241/
root@node02's password:
(ellipsis.....)
[root@node01 software]# scp -r  /software/jdk1.8.0_241/ node03:/software/jdk1.8.0_241/
root@node03's password:
(ellipsis.....)

PS: JDK 1.8 on my node01 node has already been installed and configured; please refer to: jdk installation

After execution, you can check on the node02 and node03 nodes and find that the /software/jdk1.8.0_241/ directory has been created automatically and the JDK files from node01 have been transferred to node02 and node03. Then configure the JDK on node02 and node03 with the following commands:

[root@node02 software]# vim /etc/profile
[root@node02 software]# source /etc/profile
[root@node02 software]# java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)

Additions to /etc/profile:

export JAVA_HOME=/software/jdk1.8.0_241
export CLASSPATH="$JAVA_HOME/lib"
export PATH="$JAVA_HOME/bin:$PATH"
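As a quick check, a minimal sketch (assuming passwordless ssh and the /etc/profile additions above on every node) that confirms the JDK is visible cluster-wide:

# Verify the Java version on all three nodes from one place
for h in node01 node02 node03; do
    echo "== $h =="
    ssh "$h" "source /etc/profile && java -version"
done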

3. Zookeeper cluster installation

Server IP         Host name   myid value
192.168.24.137    node01      1
192.168.24.138    node02      2
192.168.24.139    node03      3

3.1 download the zookeeper package

The download site is as follows: Zookeeper download address. I use zk version 3.4.9; you can download it using wget.

3.2 decompression

[root@node01 software]# tar -zxvf zookeeper-3.4.9.tar.gz
[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz

3.3 modify configuration file

The first machine (node01) modifies the configuration file:

cd /software/zookeeper-3.4.9/conf/

cp zoo_sample.cfg zoo.cfg

mkdir -p /software/zookeeper-3.4.9/zkdatas/

vim zoo.cfg : (new part)

dataDir=/software/zookeeper-3.4.9/zkdatas
# How many snapshots to keep
autopurge.snapRetainCount=3
# How often, in hours, old snapshots and logs are purged
autopurge.purgeInterval=1
# Server address in cluster
server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888

3.4 add the myid configuration

Create a file named myid under /software/zookeeper-3.4.9/zkdatas/ on the first machine (node01), with 1 as its content. Use the command:

echo 1 > /software/zookeeper-3.4.9/zkdatas/myid

3.5 distribute the installation package and modify the myid values

Distribute the installation package to the other machines.

The first machine (node01) executes the following two commands:

[root@node01 conf]# scp -r  /software/zookeeper-3.4.9/ node02:/software/zookeeper-3.4.9/
root@node02's password:
(ellipsis.....)
[root@node01 conf]# scp -r  /software/zookeeper-3.4.9/ node03:/software/zookeeper-3.4.9/
root@node03's password:
(ellipsis.....)

Change the value of myid to 2 on the second machine

echo 2 > /software/zookeeper-3.4.9/zkdatas/myid

Change the value of myid to 3 on the third machine

echo 3 > /software/zookeeper-3.4.9/zkdatas/myid

3.6 start the zookeeper service on the three machines (execute on all three virtual machines)

# Start
/software/zookeeper-3.4.9/bin/zkServer.sh start

# Check the startup status
/software/zookeeper-3.4.9/bin/zkServer.sh status
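As a quick check, a minimal sketch (assuming passwordless ssh between the nodes) that asks every node for its ZooKeeper state; one node should report Mode: leader and the other two Mode: follower:

for h in node01 node02 node03; do
    echo "== $h =="
    ssh "$h" /software/zookeeper-3.4.9/bin/zkServer.sh status
done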


4. Install and configure Hadoop

Fully distributed, with high availability for the NameNode and the ResourceManager. The role layout of the three machines is shown in the table below.

 

                192.168.24.137 (node01)   192.168.24.138 (node02)   192.168.24.139 (node03)
zookeeper       zk                        zk                        zk
HDFS            JournalNode               JournalNode               JournalNode
                NameNode                  NameNode
                ZKFC                      ZKFC
                DataNode                  DataNode                  DataNode
YARN                                      ResourceManager           ResourceManager
                NodeManager               NodeManager               NodeManager
MapReduce                                                           JobHistoryServer

4.1 compiling the Hadoop source code on Linux CentOS 7.5

Here I do not use the binary packages provided by Hadoop directly, but a package compiled by myself. Stop all services of any previous Hadoop cluster and delete the old Hadoop installation directory on all machines.

[root@localhost software]# cd /software/hadoop-2.7.5-src/hadoop-dist/target
[root@localhost target]# ls
antrun                    hadoop-2.7.5.tar.gz                 javadoc-bundle-options
classes                   hadoop-dist-2.7.5.jar               maven-archiver
dist-layout-stitching.sh  hadoop-dist-2.7.5-javadoc.jar       maven-shared-archive-resources
dist-tar-stitching.sh     hadoop-dist-2.7.5-sources.jar       test-classes
hadoop-2.7.5              hadoop-dist-2.7.5-test-sources.jar  test-dir
[root@localhost target]# cp -r hadoop-2.7.5 /software
[root@localhost target]# cd /software/
[root@localhost software]# ls
apache-maven-3.0.5             findbugs-1.3.9.tar.gz    jdk1.7.0_75                protobuf-2.5.0
apache-maven-3.0.5-bin.tar.gz  hadoop-2.7.5             jdk-7u75-linux-x64.tar.gz  protobuf-2.5.0.tar.gz
apache-tomcat-6.0.53.tar.gz    hadoop-2.7.5-src         mvnrepository              snappy-1.1.1
findbugs-1.3.9                 hadoop-2.7.5-src.tar.gz  mvnrepository.tar.gz       snappy-1.1.1.tar.gz
[root@localhost software]# cd hadoop-2.7.5
[root@localhost hadoop-2.7.5]# ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
[root@localhost hadoop-2.7.5]# cd etc
[root@localhost etc]# ls
hadoop
[root@localhost etc]# cd hadoop/
[root@localhost hadoop]# ls
capacity-scheduler.xml      hadoop-policy.xml        kms-log4j.properties        ssl-client.xml.example
configuration.xsl           hdfs-site.xml            kms-site.xml                ssl-server.xml.example
container-executor.cfg      httpfs-env.sh            log4j.properties            yarn-env.cmd
core-site.xml               httpfs-log4j.properties  mapred-env.cmd              yarn-env.sh
hadoop-env.cmd              httpfs-signature.secret  mapred-env.sh               yarn-site.xml
hadoop-env.sh               httpfs-site.xml          mapred-queues.xml.template
hadoop-metrics2.properties  kms-acls.xml             mapred-site.xml.template
hadoop-metrics.properties   kms-env.sh               slaves

Attachment: you can use the Notepad++ plugin NppFTP to edit files on the remote server:

Look for Show NppFTP Window among the small toolbar icons; at first it is not there.

Click Plugins -> Plugins Admin, search for nppftp, tick it, and install.

Reopen Notepad++ and the small icon appears: click connect, and you can now edit files on the remote server.

But here I do not use the Notepad++ NppFTP plugin; I use MobaXterm to edit the remote server files.

4.2 modify hadoop configuration file

4.2.1 modify core-site.xml

cd /software/hadoop-2.7.5/etc/hadoop
<configuration>
	<!-- ZooKeeper quorum address used for NameNode HA -->
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>node01:2181,node02:2181,node03:2181</value>
	</property>
	<!-- Default HDFS filesystem address (the HA nameservice) -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://ns</value>
	</property>
	<!-- Temporary file storage directory  -->
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/software/hadoop-2.7.5/data/tmp</value>
	</property>
	<!-- Enable the HDFS trash mechanism: files in the trash are permanently deleted after 7 days (the value is in minutes) -->
	<property>
		<name>fs.trash.interval</name>
		<value>10080</value>
	</property>
</configuration>
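As a quick sanity check after editing (a minimal sketch; run it from /software/hadoop-2.7.5), you can ask Hadoop which value it actually picked up:

# Print the effective value of fs.defaultFS from the configuration files
bin/hdfs getconf -confKey fs.defaultFS
# Expected output: hdfs://ns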

4.2.2 modify hdfs-site.xml :

<configuration>
	<!--  Specify namespace  -->
	<property>
		<name>dfs.nameservices</name>
		<value>ns</value>
	</property>
	<!--  Specify two machines under this namespace as our NameNode  -->
	<property>
		<name>dfs.ha.namenodes.ns</name>
		<value>nn1,nn2</value>
	</property>
	<!-- RPC address of the first NameNode (nn1) -->
	<property>
		<name>dfs.namenode.rpc-address.ns.nn1</name>
		<value>node01:8020</value>
	</property>
	<!-- RPC address of the second NameNode (nn2) -->
	<property>
		<name>dfs.namenode.rpc-address.ns.nn2</name>
		<value>node02:8020</value>
	</property>
	<!-- Service RPC address for nn1, used by DataNodes and other internal services -->
	<property>
		<name>dfs.namenode.servicerpc-address.ns.nn1</name>
		<value>node01:8022</value>
	</property>
	<!-- Service RPC address for nn2, used by DataNodes and other internal services -->
	<property>
		<name>dfs.namenode.servicerpc-address.ns.nn2</name>
		<value>node02:8022</value>
	</property>
	<!-- Web UI address of the first NameNode -->
	<property>
		<name>dfs.namenode.http-address.ns.nn1</name>
		<value>node01:50070</value>
	</property>
	<!-- Web UI address of the second NameNode -->
	<property>
		<name>dfs.namenode.http-address.ns.nn2</name>
		<value>node02:50070</value>
	</property>
	<!-- Shared edits directory on the JournalNodes; note that this address must be configured -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://node01:8485;node02:8485;node03:8485/ns1</value>
	</property>
	<!-- Java class used by clients to locate the active NameNode during failover -->
	<property>
		<name>dfs.client.failover.proxy.provider.ns</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fencing method used during failover -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<!-- Private key file used by the sshfence fencing method -->
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/root/.ssh/id_rsa</value>
	</property>
	<!-- JournalNode data storage directory -->
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/software/hadoop-2.7.5/data/dfs/jn</value>
	</property>
	<!-- Enable automatic failover -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<!-- NameNode metadata (fsimage) storage path -->
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/nn/name</value>
	</property>
	<!-- NameNode edit log storage path -->
	<property>
		<name>dfs.namenode.edits.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/nn/edits</value>
	</property>
	<!-- DataNode data storage path -->
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/dn</value>
	</property>
	<!-- Disable HDFS file permission checking -->
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
	<!-- HDFS block size (128 MB) -->
	<property>
		<name>dfs.blocksize</name>
		<value>134217728</value>
	</property>
</configuration>
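Before distributing the configuration, a minimal sketch for catching XML typos (assuming the xmllint tool from the libxml2 package is installed; install it with yum if it is missing):

# Check that the config files edited so far are well-formed XML
cd /software/hadoop-2.7.5/etc/hadoop
for f in core-site.xml hdfs-site.xml; do
    xmllint --noout "$f" && echo "$f OK"
done
# Add yarn-site.xml and mapred-site.xml to the list after editing them below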

4.2.3 modify yarn-site.xml

Note: the configuration of node03 and node02 is different

<configuration>
	<!-- Site specific YARN configuration properties -->
	<!-- Enable log aggregation: after an application finishes, the logs of each container are collected and moved to a file system such as HDFS. -->
	<!-- "yarn.nodemanager.remote-app-log-dir" and "yarn.nodemanager.remote-app-log-dir-suffix" determine where the logs are moved. -->
	<!-- Users can then browse the logs through the history server. -->
	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
	</property>
	<!-- Enable ResourceManager HA; default is false -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<!-- Cluster ID; ensures this RM does not become active for another cluster -->
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>mycluster</value>
	</property>
	<!-- Logical IDs of the ResourceManagers -->
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
	<!-- Hostname of rm1 -->
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>node03</value>
	</property>
	<!-- Hostname of rm2 -->
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>node02</value>
	</property>
	<!-- RPC addresses of rm1 -->
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>node03:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm1</name>
		<value>node03:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
		<value>node03:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm1</name>
		<value>node03:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>node03:8088</value>
	</property>
	<!-- RPC addresses of rm2 -->
	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>node02:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm2</name>
		<value>node02:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
		<value>node02:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm2</name>
		<value>node02:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>node02:8088</value>
	</property>
	<!-- Enable ResourceManager state recovery -->
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
	<!-- Configure rm1 on node03 and rm2 on node02. Note: the configuration files are usually copied to the other machines as-is, but this property must be changed on the other YARN ResourceManager machine, and it is not configured on the remaining machine. -->
	<property>
		<name>yarn.resourcemanager.ha.id</name>
		<value>rm1</value>
		<description>If we want to launch more than one RM in single node, we need this configuration</description>
	</property>
	<!-- Class used for persistent state storage (ZooKeeper based); should be enabled -->
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>node02:2181,node03:2181,node01:2181</value>
		<description>For multiple zk services, separate them with comma</description>
	</property>
	<!-- Enable automatic failover for the ResourceManager -->
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
		<value>true</value>
		<description>Enable automatic failover; By default, it is enabled only when HA is enabled.</description>
	</property>
	<property>
		<name>yarn.client.failover-proxy-provider</name>
		<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
	</property>
	<!-- Maximum number of CPU vcores a NodeManager can allocate; default is 8 -->
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>4</value>
	</property>
	<!-- Memory available per node, in MB -->
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>512</value>
	</property>
	<!-- Minimum memory allocation for a single container; default is 1024 MB -->
	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>512</value>
	</property>
	<!-- Maximum memory allocation for a single container; default is 8192 MB -->
	<property>
		<name>yarn.scheduler.maximum-allocation-mb</name>
		<value>512</value>
	</property>
	<!-- How long aggregated logs are kept before being deleted -->
	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>2592000</value>
		<!-- 30 days -->
	</property>
	<!-- How many seconds to keep container logs on the NodeManager; only applies when log aggregation is disabled -->
	<property>
		<name>yarn.nodemanager.log.retain-seconds</name>
		<value>604800</value>
		<!-- 7 days -->
	</property>
	<!-- Compression type used for the aggregated logs -->
	<property>
		<name>yarn.nodemanager.log-aggregation.compression-type</name>
		<value>gz</value>
	</property>
	<!-- nodemanager Local file storage directory-->
	<property>
		<name>yarn.nodemanager.local-dirs</name>
		<value>/software/hadoop-2.7.5/yarn/local</value>
	</property>
	<!-- Maximum number of completed applications the ResourceManager keeps -->
	<property>
		<name>yarn.resourcemanager.max-completed-applications</name>
		<value>1000</value>
	</property>
	<!-- Comma separated list of services. The list name should only contain a-zA-Z0-9_,Can't start with numbers-->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<!-- Retry interval (ms) for reconnecting to the RM after losing the connection -->
	<property>
		<name>yarn.resourcemanager.connect.retry-interval.ms</name>
		<value>2000</value>
	</property>
</configuration>

4.2.4 modify mapred-site.xml

<configuration>
	<!-- Run MapReduce on YARN -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<!-- MapReduce JobHistory Server IPC host:port -->
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>node03:10020</value>
	</property>
	<!-- MapReduce JobHistory Server Web UI host:port -->
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>node03:19888</value>
	</property>
	<!-- The directory where MapReduce stores control files.default ${hadoop.tmp.dir}/mapred/system -->
	<property>
		<name>mapreduce.jobtracker.system.dir</name>
		<value>/software/hadoop-2.7.5/data/system/jobtracker</value>
	</property>
	<!-- The amount of memory to request from the scheduler for each map task. Default 1024-->
	<property>
		<name>mapreduce.map.memory.mb</name>
		<value>1024</value>
	</property>
	<!-- <property>
                <name>mapreduce.map.java.opts</name>
                <value>-Xmx1024m</value>
        </property> -->
	<!-- The amount of memory to request from the scheduler for each reduce task. Default 1024-->
	<property>
		<name>mapreduce.reduce.memory.mb</name>
		<value>1024</value>
	</property>
	<!-- <property>
               <name>mapreduce.reduce.java.opts</name>
               <value>-Xmx2048m</value>
        </property> -->
	<!-- Total amount of buffer memory, in megabytes, used when sorting files. By default 1 MB is allocated to each merge stream; the number of merge streams should be minimized. Default 100 -->
	<property>
		<name>mapreduce.task.io.sort.mb</name>
		<value>100</value>
	</property>
	<!-- <property>
        <name>mapreduce.jobtracker.handler.count</name>
        <value>25</value>
        </property>-->
	<!-- The number of streams used for merging when defragmenting files. This determines the number of open file handles. Default 10-->
	<property>
		<name>mapreduce.task.io.sort.factor</name>
		<value>10</value>
	</property>
	<!-- Number of parallel copies run by reduce during the copy (shuffle) phase. Default 5 -->
	<property>
		<name>mapreduce.reduce.shuffle.parallelcopies</name>
		<value>25</value>
	</property>
	<property>
		<name>yarn.app.mapreduce.am.command-opts</name>
		<value>-Xmx1024m</value>
	</property>
	<!-- Total amount of memory required by the MR AppMaster. Default 1536 -->
	<property>
		<name>yarn.app.mapreduce.am.resource.mb</name>
		<value>1536</value>
	</property>
	<!-- MapReduce Local directory where intermediate data files are stored. Ignored if directory does not exist. Default ${hadoop.tmp.dir}/mapred/local-->
	<property>
		<name>mapreduce.cluster.local.dir</name>
		<value>/software/hadoop-2.7.5/data/system/local</value>
	</property>
</configuration>

4.2.5 modifying slaves

node01
node02
node03

4.2.6 modify hadoop-env.sh

export JAVA_HOME=/software/jdk1.8.0_241

4.2.7 send the Hadoop installation directory of the first machine (node01) to the other machines

[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz
[root@node01 software]# scp -r hadoop-2.7.5/ node02:$PWD
root@node02's password:
(ellipsis.....)
[root@node01 software]# scp -r hadoop-2.7.5/ node03:$PWD
root@node03's password:
(ellipsis.....)

4.2.8 create directories (create them on all three virtual machines)

mkdir -p /software/hadoop-2.7.5/data/dfs/nn/name
mkdir -p /software/hadoop-2.7.5/data/dfs/nn/edits
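Alternatively, a minimal sketch (assuming passwordless ssh from node01) that creates the same directories on all three nodes in one go:

for h in node01 node02 node03; do
    ssh "$h" "mkdir -p /software/hadoop-2.7.5/data/dfs/nn/name /software/hadoop-2.7.5/data/dfs/nn/edits"
done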


4.2.9 change yarn-site.xml on node02 and node03

node01: comment out yarn.resourcemanager.ha.id

<!-- 
<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm1</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
-->

node02:

<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm2</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>

node03:

<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm1</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>

5. Start Hadoop

5.1 start the HDFS processes

The node01 machine executes the following commands from the Hadoop home directory (/software/hadoop-2.7.5):

bin/hdfs zkfc -formatZK                           # format the HA state znode in ZooKeeper

sbin/hadoop-daemons.sh start journalnode          # start the JournalNodes on all nodes

bin/hdfs namenode -format                         # format the NameNode (first start only)

bin/hdfs namenode -initializeSharedEdits -force   # initialize the shared edits directory

sbin/start-dfs.sh                                 # start the HDFS daemons

If you run into problems like the following while executing the commands, passwordless login between the virtual machines is not configured, or is configured incorrectly:

[root@node01 hadoop-2.7.5]# sbin/hadoop-daemons.sh start journalnode
The authenticity of host 'node01 (192.168.24.137)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? root@node02's password: root@node03's password: Please type 'yes' or 'no':
node01: Warning: Permanently added 'node01' (ECDSA) to the list of known hosts.
root@node01's password:
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out


root@node03's password: node03: Permission denied, please try again.

root@node01's password: node01: Permission denied, please try again.

The node02 machine executes the following command:

[root@node02 software]# cd hadoop-2.7.5/
[root@node02 hadoop-2.7.5]# bin/hdfs namenode -bootstrapStandby
(ellipsis....)
[root@node02 hadoop-2.7.5]# sbin/hadoop-daemon.sh start namenode
(ellipsis....)
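Once both NameNodes are up, a minimal sketch to confirm which one is active (nn1 and nn2 are the IDs defined in hdfs-site.xml):

# One NameNode should report "active" and the other "standby"
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2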

5.2 start the yarn process

The node02 and node03 machines execute the following commands:

[root@node03 software]# cd hadoop-2.7.5/
[root@node03 hadoop-2.7.5]# sbin/start-yarn.sh
[root@node02 hadoop-2.7.5]# sbin/start-yarn.sh
starting yarn daemons
resourcemanager running as process 11740. Stop it first.
The authenticity of host 'node02 (192.168.24.138)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? node01: nodemanager running as process 15655. Stop it first.
node03: nodemanager running as process 13357. Stop it first.

During startup, if you encounter the above error messages, you can resolve them as follows:

Most blogs online suggest the following commands (not recommended; they seem to have problems and will prompt "This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh"):

#The processes are already running, so run stop-all.sh first and then start-all.sh
[root@node02 sbin]# pwd
/software/hadoop-2.7.5/sbin
[root@node02 sbin]# ./stop-all.sh
[root@node02 sbin]# ./start-all.sh

However, those scripts have been deprecated. Now use:

 ./stop-yarn.sh
 ./stop-dfs.sh

 ./start-yarn.sh
 ./start-dfs.sh
[root@node03 sbin]# ./start-dfs.sh
Starting namenodes on [node01 node02]
node02: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node02.out
node01: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node01.out
node02: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node02.out
node01: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node01.out
node03: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node03.out
Starting journal nodes [node01 node02 node03]
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out
node01: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node01.out
node03: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node03.out
Starting ZK Failover Controllers on NN hosts [node01 node02]
node01: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node01.out
node02: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node02.out
[root@node03 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-resourcemanager-node03.out
node01: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node01.out
node02: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node02.out
node03: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node03.out

Note: use the jps command to check the three virtual machines:

Problem: the DataNode process is missing on all three virtual machines; the DataNodes did not start.

node01: 

[root@node01 hadoop-2.7.5]# jps
8083 NodeManager
8531 DFSZKFailoverController
8404 JournalNode
9432 Jps
1467 QuorumPeerMain
8235 NameNode

node02: 

[root@node02 sbin]# jps
7024 NodeManager
7472 DFSZKFailoverController
7345 JournalNode
7176 NameNode
8216 ResourceManager
8793 Jps
1468 QuorumPeerMain

node03:

[root@node03 hadoop-2.7.5]# jps
5349 NodeManager
5238 ResourceManager
6487 JobHistoryServer
6647 Jps
5997 JournalNode

Solution:

(1) First use stop-dfs.sh and stop-yarn.sh to stop the services (execute on any one node):

[root@node03 hadoop-2.7.5]# ./sbin/stop-dfs.sh
Stopping namenodes on [node01 node02]
node02: no namenode to stop
node01: no namenode to stop
node02: no datanode to stop
node01: no datanode to stop
node03: no datanode to stop
Stopping journal nodes [node01 node02 node03]
node02: no journalnode to stop
node01: no journalnode to stop
node03: no journalnode to stop
Stopping ZK Failover Controllers on NN hosts [node01 node02]
node02: no zkfc to stop
node01: no zkfc to stop
[root@node03 hadoop-2.7.5]# ./sbin/stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
node01: stopping nodemanager
node02: stopping nodemanager
node03: stopping nodemanager
no proxyserver to stop

(2) Delete the files in the DataNode data storage path (delete them on all three virtual machines)

According to the configuration, that means deleting the directory /software/hadoop-2.7.5/data/dfs/dn:

(3) Use start-dfs.sh and start-yarn.sh to start the services again (on any one node)

[root@node01 hadoop-2.7.5]# rm -rf data/dfs/dn
[root@node01 hadoop-2.7.5]# sbin/start-dfs.sh
Starting namenodes on [node01 node02]
node02: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node02.out
node01: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node01.out
node02: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node02.out
node03: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node03.out
node01: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node01.out
Starting journal nodes [node01 node02 node03]
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out
node03: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node03.out
node01: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node01.out
Starting ZK Failover Controllers on NN hosts [node01 node02]
node02: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node02.out
node01: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node01.out
You have new mail in /var/spool/mail/root
[root@node01 hadoop-2.7.5]# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-resourcemanager-node01.out
node02: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node02.out
node03: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node03.out
node01: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node01.out

Use jps again to check the three virtual machines (now everything is up):

node01:

[root@node01 dfs]# jps
10561 NodeManager
9955 DataNode
10147 JournalNode
9849 NameNode
10762 Jps
1467 QuorumPeerMain
10319 DFSZKFailoverController

node02:

[root@node02 hadoop-2.7.5]# jps
9744 NodeManager
9618 DFSZKFailoverController
9988 Jps
9367 NameNode
8216 ResourceManager
9514 JournalNode
1468 QuorumPeerMain
9439 DataNode

node03:

[root@node03 hadoop-2.7.5]# jps
7953 Jps
7683 JournalNode
6487 JobHistoryServer
7591 DataNode
7784 NodeManager

5.3 view resource manager status

node03 execute:

[root@node03 hadoop-2.7.5]# bin/yarn rmadmin -getServiceState rm1
active

node02 execute:

[root@node02 hadoop-2.7.5]# bin/yarn rmadmin -getServiceState rm2
standby

5.4 start jobHistory

node03:

[root@node03 hadoop-2.7.5]# sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /software/hadoop-2.7.5/logs/mapred-root-historyserver-node03.out

5.5 HDFS status view

node01:

Browser access: http://192.168.24.137:50070/dfshealth.html#tab-overview

node02:

Browser access: http://192.168.24.138:50070/dfshealth.html#tab-overview

5.6 access to yarn cluster

Browser access: http://192.168.24.139:8088/cluster/nodes

5.7 historical task (JobHistory) browsing interface

Browser access: http://192.168.24.139:19888/jobhistory (the mapreduce.jobhistory.webapp.address configured on node03)

6. Hadoop command line

To delete a file:

[root@node01 bin]# ./hdfs dfs -rm /a.txt
20/06/12 14:33:30 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 10080 minutes, Emptier interval = 0 minutes.
20/06/12 14:33:30 INFO fs.TrashPolicyDefault: Moved: 'hdfs://ns/a.txt' to trash at: hdfs://ns/user/root/.Trash/Current/a.txt
Moved: 'hdfs://ns/a.txt' to trash at: hdfs://ns/user/root/.Trash/Current

Create folder:

[root@node01 bin]# ./hdfs dfs -mkdir /dir

Upload file:

[root@node01 bin]# ./hdfs dfs -put /software/a.txt /dir
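A few more everyday HDFS commands, as a minimal sketch (the /dir directory and a.txt file are the examples used above):

./hdfs dfs -ls /dir                 # list the directory
./hdfs dfs -cat /dir/a.txt          # print the file contents
./hdfs dfs -get /dir/a.txt /tmp/    # download the file to the local filesystem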


Note:

Clicking Download actually points to something like http://node02:50075... If the host machine's hosts file is not configured, that address cannot be opened; simply changing node02 to its IP works. Inside the virtual machines I use node01, node02, and node03, which can be accessed directly.

So I changed the hosts file of the host machine to be consistent with the hosts files of the virtual machines; clicking Download again then downloads directly.

At this point, the highly available Hadoop distributed environment has been built successfully.

 
