1. Hadoop distributed file system (HDFS): single data storage node
1. Install and deploy Hadoop. Specific steps: https://blog.csdn.net/Hannah_zh/article/details/81169416
2. Modify the configuration files
<1> Set the Namenode address

[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.25.51.1:9000</value>
    </property>
</configuration>
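Before moving on, you can confirm that the Namenode address was actually picked up from core-site.xml. A minimal check, assuming you are in /home/hadoop/hadoop, uses the getconf helper that ships with Hadoop:

[hadoop@server1 hadoop]$ bin/hdfs getconf -confKey fs.defaultFS    ##Should print hdfs://172.25.51.1:9000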
<2> Specify the Datanode address and the number of data replicas kept by HDFS

[hadoop@server1 hadoop]$ vim slaves
172.25.51.1
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>    ##HDFS keeps 1 replica of each block
    </property>
</configuration>

3. Set up passwordless SSH login (prerequisite: the ssh service is installed)
[hadoop@server1 ~]$ ssh-keygen
[hadoop@server1 ~]$ cd .ssh/
[hadoop@server1 .ssh]$ ls
id_rsa  id_rsa.pub
[hadoop@server1 .ssh]$ cp id_rsa.pub authorized_keys
Figure: verify passwordless login
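The figure itself is not reproduced here; a minimal sketch of the same check is to ssh to the node's own address and confirm that no password prompt appears:

[hadoop@server1 .ssh]$ ssh 172.25.51.1    ##Should log straight in with no password prompt
[hadoop@server1 ~]$ logout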
4. Format the Namenode

[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
[hadoop@server1 hadoop]$ ls /tmp/
hadoop-hadoop  hsperfdata_hadoop
Figure: files generated after formatting
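As a rough look at what the format step wrote, you can list the new name directory. This assumes the default hadoop.tmp.dir of /tmp/hadoop-${user.name}, which is what produced the /tmp/hadoop-hadoop directory above:

[hadoop@server1 hadoop]$ ls /tmp/hadoop-hadoop/dfs/name/current/
fsimage_0000000000000000000  fsimage_0000000000000000000.md5  seen_txid  VERSION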
5. Start dfs

[hadoop@server1 hadoop]$ sbin/start-dfs.sh

6. Configure environment variables
Note: after configuring the environment variables, you need to log in again for them to take effect
[hadoop@server1 ~]$ vim .bash_profile
PATH=$PATH:$HOME/bin:~/java/bin
[hadoop@server1 ~]$ logout
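For reference, the relevant lines of ~/.bash_profile might end up looking like the sketch below. The exact paths are assumptions based on this deployment (Hadoop unpacked to /home/hadoop/hadoop, a JDK at ~/java); adjust them to your layout:

export JAVA_HOME=$HOME/java                          ##Assumed JDK location
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HOME/hadoop/bin ##Adds the hadoop binaries as well
export PATH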
7. Use the jps command to view the Java processes

[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ jps
2082 DataNode               ##Data node
2239 SecondaryNameNode      ##Secondary metadata node
1989 NameNode               ##Metadata node
2941 Jps
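jps only shows that the JVMs are running; to confirm the Datanode has actually registered with the Namenode, you can also ask for a cluster report:

[hadoop@server1 hadoop]$ bin/hdfs dfsadmin -report    ##Should list one live datanode, 172.25.51.1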
[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir /user
[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server1 hadoop]$ bin/hdfs dfs -put input/    ##Upload the local input/ directory to HDFS
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount input output
Figure: the following results show that the operation is successful
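If the figure is unavailable, the same result can be read straight out of HDFS, since wordcount writes its part files under output/ in the user's HDFS home directory:

[hadoop@server1 hadoop]$ bin/hdfs dfs -cat output/*    ##Prints each word and its count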
2. Hadoop distributed file system (HDFS): multiple data storage nodes
Experimental environment: RedHat version 6.5
Namenode (1G memory): server1, 172.25.51.1
Datanode (1G memory): server2, 172.25.51.2
                      server3, 172.25.51.3
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ rm -fr input/ output/
[hadoop@server1 hadoop]$ bin/hdfs dfs -get output
[hadoop@server1 hadoop]$ rm -fr output/
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh

1. Namenode
[root@server1 ~]# yum install -y nfs-utils
[root@server1 ~]# /etc/init.d/rpcbind start    ##This service must be running before the nfs service is started
[root@server1 ~]# vim /etc/exports
/home/hadoop    *(rw,anonuid=800,anongid=800)
[root@server1 ~]# /etc/init.d/nfs start
[root@server1 ~]# exportfs -v
[root@server1 ~]# exportfs -rv
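Before moving to the Datanodes, it is worth confirming the export is visible. A quick check from the Namenode itself:

[root@server1 ~]# showmount -e 172.25.51.1    ##Should list /home/hadoop *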
2. Datanode (the operations on 172.25.51.2 and 172.25.51.3 are the same)

[root@server2 ~]# useradd -u 800 hadoop
[root@server2 ~]# id hadoop
uid=800(hadoop) gid=800(hadoop) groups=800(hadoop)
[root@server2 ~]# yum install -y nfs-utils
[root@server2 ~]# /etc/init.d/rpcbind start
[root@server2 ~]# showmount -e 172.25.51.1
[root@server2 ~]# mount 172.25.51.1:/home/hadoop/ /home/hadoop/
[root@server2 ~]# df
Filesystem                    1K-blocks    Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root   19134332  925180  17237172   6% /
tmpfs                            510188       0    510188   0% /dev/shm
/dev/vda1                        495844   33451    436793   8% /boot
172.25.51.1:/home/hadoop/      19134336 1962240  16200192  11% /home/hadoop
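Note that a mount done this way does not survive a reboot. If you want it to persist, a hypothetical /etc/fstab entry (same source and target as above) would look like:

172.25.51.1:/home/hadoop  /home/hadoop  nfs  defaults  0 0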
3. Configure hdfs

[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim slaves    ##Set the Datanodes
172.25.51.2
172.25.51.3
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>    ##HDFS keeps 2 replicas of each block
    </property>
</configuration>
[hadoop@server1 hadoop]$ cd /tmp/
[hadoop@server1 tmp]$ rm -fr *
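To double-check the new replication factor before reformatting, the getconf helper can be used again:

[hadoop@server1 hadoop]$ bin/hdfs getconf -confKey dfs.replication    ##Should print 2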
4. Test passwordless SSH login

[hadoop@server1 tmp]$ ssh server2
[hadoop@server2 ~]$ logout
[hadoop@server1 tmp]$ ssh server3
[hadoop@server3 ~]$ logout
[hadoop@server1 tmp]$ ssh 172.25.51.2
[hadoop@server2 ~]$ logout
[hadoop@server1 tmp]$ ssh 172.25.51.3
[hadoop@server3 ~]$ logout

5. Reformat
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
[hadoop@server1 hadoop]$ ls /tmp/
hadoop-hadoop  hsperfdata_hadoop

6. Start dfs
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Figure: viewing the Java processes with jps
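The figure is only illustrative; you can run jps on each node yourself. Roughly, the Namenode should show NameNode and SecondaryNameNode, and each Datanode a DataNode process:

[hadoop@server1 ~]$ jps    ##NameNode, SecondaryNameNode
[hadoop@server2 ~]$ jps    ##DataNode
[hadoop@server3 ~]$ jps    ##DataNode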