Contents
1. Installation preparation
2. Installing Hadoop on the master node
3. Installing Hadoop on the slave nodes
4. Starting Hadoop
5. Verifying the installation
1. Installation preparation
1. Three virtual machines are required: the master node is hadoop001, and the slave nodes are hadoop002 and hadoop003.
hadoop001, hadoop002, and hadoop003 are the hostnames of the virtual machines; a hostname can be changed with
hostnamectl --static set-hostname hadoop001
My virtual machines' IP addresses are: hadoop001 (192.168.17.131), hadoop002 (192.168.17.132), and hadoop003 (192.168.17.133).
A virtual machine's IP address can be checked with
ip addr
2. A JDK is installed on each virtual machine.
JDK installation reference: Linux CentOS 7 JDK installation - A person's Niuniu blog - CSDN blog
3. Passwordless (SSH key) login is configured between all three virtual machines.
Passwordless login reference: Linux configuration: passwordless login, standalone and fully distributed - A person's Niuniu blog - CSDN blog
4. The firewall is disabled on each virtual machine.
systemctl stop firewalld.service
systemctl disable firewalld.service
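To double-check that the firewall is really off on a node, either of the following can be used (firewall-cmd ships with firewalld on CentOS 7):
systemctl status firewalld.service
firewall-cmd --state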
5. Hostname mapping is configured on each virtual machine.
Edit the hosts file
vi /etc/hosts
and add the following:
192.168.17.131 hadoop001
192.168.17.132 hadoop002
192.168.17.133 hadoop003
On Windows, open the hosts file with Notepad (location: C:\Windows\System32\drivers\etc\hosts) and add the same entries:
192.168.17.131 hadoop001
192.168.17.132 hadoop002
192.168.17.133 hadoop003
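Once the mappings are saved, they can be verified from any node, for example:
ping -c 3 hadoop002
ping -c 3 hadoop003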
2. Installing Hadoop on the master node
1. Download hadoop-2.7.3.tar.gz.
Baidu Netdisk link:
https://pan.baidu.com/s/1uQTVMzg8E5QULQTAoppdcQ (extraction code: 58c5)
2. Upload hadoop-2.7.3.tar.gz to hadoop001.
Simply drag hadoop-2.7.3.tar.gz into the file panel of MobaXterm_Portable.
Reference: Simple use of MobaXterm_Portable - A person's Niuniu blog - CSDN blog
3. Extract and install
tar -zvxf /tools/hadoop-2.7.3.tar.gz -C /training/
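This assumes the archive was uploaded to /tools and that the /training target directory already exists; if either is missing, create them first:
mkdir -p /tools /training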
4. Configure environment variables (all three virtual machines should be configured)
vi ~/.bash_profile
#hadoop
export HADOOP_HOME=/training/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
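After saving the profile, reload it so the variables take effect, and optionally confirm that the hadoop command resolves (on the slaves this check only works after the installation is copied over in section 3):
source ~/.bash_profile
hadoop version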
5. Create tmp directory
mkdir /training/hadoop-2.7.3/tmp
6. Modify the configuration files
Go to the configuration file directory
cd /training/hadoop-2.7.3/etc/hadoop/
and list the files with ls.
Modify the following configuration files:
1) hadoop-env.sh
vi hadoop-env.sh
Just set the JDK path; my path is:
export JAVA_HOME=/training/jdk1.8.0_171
2) hdfs-site.xml
vi hdfs-site.xml
Add the following between <configuration> and </configuration>:
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
Here dfs.replication is set to 2 to match the two DataNode (slave) machines, and dfs.permissions is set to false to disable HDFS permission checking for convenience.
3) core-site.xml
vi core-site.xml
Add the following between <configuration> and </configuration>:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop001:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/training/hadoop-2.7.3/tmp</value>
</property>
4) mapred-site.xml
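In Hadoop 2.7.3 this file does not exist by default; the distribution ships only mapred-site.xml.template, so copy it first:
cp mapred-site.xml.template mapred-site.xml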
vi mapred-site.xml
Add the following between <configuration> and </configuration>:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<!-- JobHistory server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop001:10020</value>
</property>
<!-- JobHistory server web UI address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop001:19888</value>
</property>
5) yarn-site.xml
vi yarn-site.xml
Add the following between <configuration> and </configuration>:
<!-- Site specific YARN configuration properties -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop001</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Retain aggregated logs for 7 days (604800 seconds) -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
<!-- Log server URL -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop001:19888/jobhistory/logs</value>
</property>
6) slaves
vi slaves
Replace the default contents with the slave hostnames, one per line:
hadoop002
hadoop003
7. Format the NameNode
hdfs namenode -format
If it succeeds, the log output will include a line like the following (the exact directory depends on hadoop.tmp.dir):
Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
3. Installing Hadoop on the slave nodes
1. Copy the Hadoop installation on hadoop001 to hadoop002 and hadoop003
scp -r /training/hadoop-2.7.3/ root@hadoop002:/training/
scp -r /training/hadoop-2.7.3/ root@hadoop003:/training/
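Step 4 of section 2 applies to the slaves as well; if the environment variables are not yet set there, one way (assuming the same root account and paths on every node) is to copy the profile over too:
scp ~/.bash_profile root@hadoop002:~/
scp ~/.bash_profile root@hadoop003:~/
Then run source ~/.bash_profile on each slave.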
4. Starting Hadoop
1. Execute on the master node hadoop001
start-all.sh
To stop Hadoop, use
stop-all.sh
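Note that start-all.sh does not start the JobHistory server configured in mapred-site.xml above; in Hadoop 2.7.3 it is started separately on hadoop001 with
mr-jobhistory-daemon.sh start historyserver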
5. Verifying the installation
1. The processes on the master node should include:
NameNode
ResourceManager
SecondaryNameNode
The processes on each slave node should be:
DataNode
NodeManager
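These process lists can be checked with the JDK's jps tool, run on each node:
jps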
2. Check the web UIs in a browser
HDFS:
http://hadoop001:50070
YARN:
http://hadoop001:8088
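As a final smoke test (my addition, not part of the original guide), the bundled MapReduce examples jar can be run from hadoop001; the path below matches the standard 2.7.3 layout:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10
If the cluster is healthy, the job appears in the YARN UI at http://hadoop001:8088 and prints an estimate of pi when it finishes.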