Hadoop fully distributed installation -- hadoop-2.7.3

Contents

1. Installation preparation

2. Installing hadoop on the master node

3. Installing hadoop on the slave nodes

4. Start hadoop

5. Verify installation

1. Installation preparation

1. Three virtual machines are required: the master node is hadoop001, and the slave nodes are hadoop002 and hadoop003;

hadoop001, hadoop002, and hadoop003 are the virtual machines' host names. Change a host name with:

hostnamectl --static set-hostname hadoop001
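For reference, the matching commands for the other two nodes (run each on its own VM):

hostnamectl --static set-hostname hadoop002
hostnamectl --static set-hostname hadoop003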

My virtual machine IP addresses are: hadoop001 (192.168.17.131), hadoop002 (192.168.17.132), and hadoop003 (192.168.17.133).

You can check a virtual machine's IP address with:

ip addr

2. Install the JDK on each virtual machine;

For the JDK installation steps, see: Linux CentOS7 installation jdk_A person's Niuniu blog - CSDN blog

3. Configure passwordless SSH login among all three virtual machines;

For the passwordless login steps, see: Linux configuration: password free login, stand-alone and full distribution_A person's Niuniu blog - CSDN blog
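In case that post is unavailable, a minimal sketch of passwordless login (assuming the root user and the host names above):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # generate a key pair; run on every node
ssh-copy-id root@hadoop001                 # push the public key to every node,
ssh-copy-id root@hadoop002                 # including the local one
ssh-copy-id root@hadoop003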

4. Disable the firewall on each virtual machine;

systemctl stop firewalld.service
systemctl disable firewalld.service
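Optionally, confirm that the firewall is stopped:

systemctl status firewalld.service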

5. Configure host name mapping on each virtual machine;

Edit the hosts file:

vi /etc/hosts

Add the following lines:

192.168.17.131 hadoop001
192.168.17.132 hadoop002
192.168.17.133 hadoop003

On Windows, open the hosts file with Notepad (location: C:\Windows\System32\drivers\etc\hosts) and add the same lines:

192.168.17.131 hadoop001
192.168.17.132 hadoop002
192.168.17.133 hadoop003
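Optionally, verify the mapping by pinging the hosts by name:

ping -c 1 hadoop002
ping -c 1 hadoop003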

2. Installing hadoop on the master node

1. Download hadoop-2.7.3.tar.gz;

Baidu online disk link:

Link: https://pan.baidu.com/s/1uQTVMzg8E5QULQTAoppdcQ
Extraction code: 58c5

2. Upload hadoop-2.7.3.tar.gz to hadoop001;

just drag hadoop-2.7.3.tar.gz into the file panel of MobaXterm_Portable.

For reference, see: Simple use of MobaXterm_Portable_A person's Niuniu blog - CSDN blog

3. Extract and install

tar -zvxf /tools/hadoop-2.7.3.tar.gz -C /training/
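This assumes the archive was uploaded to /tools and that /training exists; if either directory is missing, create it first:

mkdir -p /tools /training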

4. Configure environment variables (on all three virtual machines)

vi ~/.bash_profile

Add the following at the end:
#hadoop
export HADOOP_HOME=/training/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
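Reload the profile so the variables take effect, and optionally verify:

source ~/.bash_profile
hadoop version   # should report Hadoop 2.7.3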

5. Create the tmp directory

mkdir /training/hadoop-2.7.3/tmp

6. Modify the configuration files

Enter the configuration file directory:

cd /training/hadoop-2.7.3/etc/hadoop/

List the files with ls, then modify the following files:

1) hadoop-env.sh

vi hadoop-env.sh

Set JAVA_HOME to your JDK path. Mine is:

export JAVA_HOME=/training/jdk1.8.0_171
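If you are unsure of the JDK path, one way to find it (assuming java is on the PATH):

readlink -f $(which java) | sed 's:/bin/java::'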

2) hdfs-site.xml

vi hdfs-site.xml

Add the following between <configuration> and </configuration> (dfs.replication is 2 because there are two DataNodes; dfs.permissions=false disables HDFS permission checks, which is convenient in a test cluster):

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

3) core-site.xml

vi core-site.xml

Add the following between <configuration> and </configuration> (fs.defaultFS names the NameNode address; hadoop.tmp.dir points at the tmp directory created in step 5):

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop001:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/training/hadoop-2.7.3/tmp</value>
</property>

4) mapred-site.xml

In Hadoop 2.7.3 this file ships as a template, so copy it first, then edit it:

cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml

Add the following between <configuration> and </configuration>:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<!-- Job history server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop001:10020</value>
</property>
<!-- Job history server web UI address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop001:19888</value>
</property>

5) yarn-site.xml

vi yarn-site.xml

Add the following between <configuration> and </configuration>:

<!-- Site specific YARN configuration properties -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop001</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Keep aggregated logs for 7 days (604800 seconds) -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
<!-- Log server URL -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop001:19888/jobhistory/logs</value>
</property>

6) slaves

vi slaves

Add the slave host names, one per line:

hadoop002
hadoop003

7. Format the NameNode

hdfs namenode -format

If formatting succeeds, the log output includes a line like:

Storage directory /training/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.

3. Installing hadoop on the slave nodes

1. Copy the hadoop installation on hadoop001 to hadoop002 and hadoop003

scp -r /training/hadoop-2.7.3/ root@hadoop002:/training/
scp -r /training/hadoop-2.7.3/ root@hadoop003:/training/
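If you have not yet added the environment variables from step 4 of section 2 on the slave nodes, the profile can be copied over the same way (assuming the same root user on all nodes):

scp ~/.bash_profile root@hadoop002:~/
scp ~/.bash_profile root@hadoop003:~/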

4. Start hadoop

1. Execute on the master node hadoop001:

start-all.sh

Stop hadoop with

stop-all.sh
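Note that start-all.sh does not start the JobHistory Server configured in mapred-site.xml; to run it, start the daemon on hadoop001:

mr-jobhistory-daemon.sh start historyserver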

5. Verify installation

1. Check the running Java processes with jps. On the master node you should see:

NameNode ResourceManager SecondaryNameNode

On the slave nodes you should see:

DataNode NodeManager
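jps ships with the JDK; run it on every node:

jps   # lists the running Java processes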

 

2. Browser check (the host names resolve on Windows because of the hosts entries added in section 1)

HDFS web UI:

http://hadoop001:50070

YARN web UI:

http://hadoop001:8088
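You can also confirm from the command line that both DataNodes registered:

hdfs dfsadmin -report   # should list 2 live datanodes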

 
