Experiment 2: Getting familiar with common HDFS operations

1, Experimental purpose

  1. Understand the role of HDFS in the Hadoop architecture;
  2. Become proficient with the common HDFS Shell commands;
  3. Become familiar with the Java APIs commonly used for HDFS operations.

2, Experimental platform

  1. Operating system: Linux (CentOS recommended);
  2. Hadoop version: 2.6.1;
  3. JDK version: 1.7 or above;
  4. Java IDE: Eclipse.

3, Experimental steps

(1) Implement the following functions programmatically, and complete the same tasks with the Shell commands provided by Hadoop:

Output the contents of a specified HDFS file to the terminal;

#!/bin/bash
# List the root directory so the user can pick a file
hadoop fs -ls /
read -r -p "please select the file you want to output: " filename
if hadoop fs -test -e "/$filename"
then
   hadoop fs -cat "/$filename"
else
   echo "the file does not exist, output failed"
fi

Java implementation:
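
A minimal sketch of this task with the HDFS Java API, assuming a Hadoop 2.x client on the classpath and a core-site.xml that sets fs.defaultFS to the cluster. The class name HdfsCat and its cat method are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical helper class, not part of Hadoop
class HdfsCat {
    /** Copies the file to out; returns false if the file does not exist. */
    static boolean cat(FileSystem fs, Path file, OutputStream out) throws IOException {
        if (!fs.exists(file)) {
            return false;
        }
        try (FSDataInputStream in = fs.open(file)) {
            // 4096 is the copy buffer size; 'false' leaves closing to the caller
            IOUtils.copyBytes(in, out, 4096, false);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml on the classpath points fs.defaultFS at HDFS
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            if (!cat(fs, new Path(args[0]), System.out)) {
                System.err.println("the file does not exist, output failed");
            }
        }
    }
}
```

Passing the FileSystem in as a parameter keeps the method usable against any Hadoop file system, including the local one for quick checks.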

Display the read/write permissions, size, creation time, path and other information of a specified HDFS file;

read -r -p "please select the file you want to check: " filename
if hadoop fs -test -e "/$filename"
then
   hadoop fs -ls -h "/$filename"
else
   echo "the file does not exist, output failed"
fi

Java implementation:
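
A minimal sketch using FileSystem.getFileStatus, assuming the same Hadoop 2.x client setup (fs.defaultFS read from core-site.xml). FileStatus exposes modification and access times rather than a creation time, so the sketch prints the modification time. HdfsStat and describe are hypothetical names.

```java
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsStat {
    /** Formats one entry roughly like a line of "hadoop fs -ls". */
    static String describe(FileStatus st) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return (st.isDirectory() ? "d" : "-") + st.getPermission()
                + " " + st.getOwner() + " " + st.getGroup()
                + " " + st.getLen()                                   // size in bytes
                + " " + fmt.format(new Date(st.getModificationTime())) // epoch millis
                + " " + st.getPath();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml on the classpath points fs.defaultFS at HDFS
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            Path file = new Path(args[0]);
            if (fs.exists(file)) {
                System.out.println(describe(fs.getFileStatus(file)));
            } else {
                System.err.println("the file does not exist");
            }
        }
    }
}
```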

Given a directory in HDFS, output the read/write permissions, size, creation time, path and other information of every file in it. If an entry is itself a directory, recursively output the same information for everything it contains;

read -r -p "enter a path: " path
hadoop fs -ls -R "$path"

Java implementation:
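
One way to sketch the recursion is to walk FileSystem.listStatus by hand, so that directories are reported as well as files (FileSystem.listFiles with recursive set to true would return files only). This assumes a Hadoop 2.x client; HdfsListRecursive is a hypothetical class name.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsListRecursive {
    /** Appends one line per entry under dir, descending into subdirectories. */
    static void list(FileSystem fs, Path dir, List<String> out) throws IOException {
        for (FileStatus st : fs.listStatus(dir)) {
            out.add(st.getPermission() + "\t" + st.getLen()
                    + "\t" + new Date(st.getModificationTime())
                    + "\t" + st.getPath());
            if (st.isDirectory()) {
                list(fs, st.getPath(), out); // recurse right after the directory's own line
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml on the classpath points fs.defaultFS at HDFS
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            List<String> lines = new ArrayList<String>();
            list(fs, new Path(args[0]), lines);
            for (String line : lines) {
                System.out.println(line);
            }
        }
    }
}
```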

Given the path of a file in HDFS, create or delete that file. If the directory that should contain the file does not exist, create it automatically;

#!/bin/bash
read -r -p "enter a directory path: " path
# Create the containing directory first if it is missing
if ! hadoop fs -test -d "$path"
then
   echo "path does not exist, creating it"
   hadoop fs -mkdir -p "$path"
fi
read -r -p "make a choice (delete or create): " choice
read -r -p "enter filename: " filename
if [ "$choice" = "delete" ]
then
   hadoop fs -rm "$path/$filename"
else
   hadoop fs -touchz "$path/$filename"
fi

Java implementation:
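
A minimal sketch with the HDFS Java API, assuming a Hadoop 2.x client. The parent directory is created explicitly with mkdirs to mirror the task statement. HdfsFileCreateDelete and its method names are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsFileCreateDelete {
    /** Creates an empty file, creating missing parent directories first. */
    static boolean createEmptyFile(FileSystem fs, Path file) throws IOException {
        Path parent = file.getParent();
        if (parent != null) {
            fs.mkdirs(parent); // succeeds if the directory already exists
        }
        return fs.createNewFile(file); // false when the file is already there
    }

    /** Deletes a single file; 'false' means no recursive delete. */
    static boolean deleteFile(FileSystem fs, Path file) throws IOException {
        return fs.delete(file, false);
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): HdfsFileCreateDelete create|delete <hdfs path>
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            Path file = new Path(args[1]);
            boolean ok = "delete".equals(args[0])
                    ? deleteFile(fs, file)
                    : createEmptyFile(fs, file);
            System.out.println(ok ? "done" : "nothing changed");
        }
    }
}
```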

Given the path of a directory in HDFS, create or delete that directory. When creating it, automatically create any missing parent directories; when deleting it, let the user decide whether to go ahead if the directory is not empty;

#!/bin/bash
read -r -p "enter the parent path: " path
if ! hadoop fs -test -d "$path"
then
   echo "path does not exist, creating it"
   hadoop fs -mkdir -p "$path"
fi
read -r -p "make a choice (delete or create): " choice
read -r -p "enter directory name: " name
if [ "$choice" = "delete" ]
then
   # -count prints DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME;
   # DIR_COUNT includes the directory itself, hence the "- 1"
   entries=$(hadoop fs -count "$path/$name" | awk '{print $1 + $2 - 1}')
   if [ "$entries" -eq 0 ]
   then
      hadoop fs -rm -r "$path/$name"
   else
      read -r -p "Not an empty directory, continue (yes or no): " choice2
      if [ "$choice2" = "yes" ]
      then
         hadoop fs -rm -r "$path/$name"
      fi
   fi
else
   hadoop fs -mkdir -p "$path/$name"
fi

Java implementation:
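
A minimal sketch of the directory variant, assuming a Hadoop 2.x client. The force flag stands in for the user's answer to the "not empty, continue?" question. HdfsDirCreateDelete and its method names are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsDirCreateDelete {
    /** Creates dir together with any missing parent directories. */
    static boolean createDir(FileSystem fs, Path dir) throws IOException {
        return fs.mkdirs(dir);
    }

    /** Deletes dir; a non-empty directory is removed only when force is true. */
    static boolean deleteDir(FileSystem fs, Path dir, boolean force) throws IOException {
        if (!fs.exists(dir)) {
            return false;
        }
        boolean empty = fs.listStatus(dir).length == 0;
        if (!empty && !force) {
            return false; // caller declined to delete a non-empty directory
        }
        return fs.delete(dir, true); // 'true' removes the contents as well
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): HdfsDirCreateDelete create|delete <hdfs dir> [force]
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            Path dir = new Path(args[1]);
            boolean force = args.length > 2 && "force".equals(args[2]);
            boolean ok = "delete".equals(args[0])
                    ? deleteDir(fs, dir, force)
                    : createDir(fs, dir);
            System.out.println(ok ? "done" : "nothing changed");
        }
    }
}
```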

Add content to a specified HDFS file; the user chooses whether the content is added to the beginning or the end of the original file;

#!/bin/bash
read -r -p "Enter the HDFS file path where you want to add content: " path
if ! hadoop fs -test -f "$path"
then
   echo "file does not exist, creating it"
   hadoop fs -mkdir -p "$(dirname "$path")"
   hadoop fs -touchz "$path"
fi
read -r -p "make a choice (head or end): " choice
read -r -p "enter the local file that holds the content: " localfile
if [ "$choice" = "head" ]
then
   # HDFS has no prepend: build "new content + old content" locally, then overwrite
   tmp=$(mktemp)
   cat "$localfile" > "$tmp"
   hadoop fs -cat "$path" >> "$tmp"
   hadoop fs -copyFromLocal -f "$tmp" "$path"
   rm -f "$tmp"
else
   hadoop fs -appendToFile "$localfile" "$path"
fi

Java implementation:
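
A minimal sketch of both cases, assuming a Hadoop 2.x client and a cluster on which append is enabled. Appending uses FileSystem.append. Because HDFS offers no prepend, the "beginning" case here reads the old bytes into memory and rewrites the file, which is only reasonable for small files. HdfsAppendOrPrepend and its method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical helper class, not part of Hadoop
class HdfsAppendOrPrepend {
    /** Appends content to the end of an existing file. */
    static void appendToEnd(FileSystem fs, Path file, byte[] content) throws IOException {
        try (FSDataOutputStream out = fs.append(file)) {
            out.write(content);
        }
    }

    /** Rewrites the file as content followed by its previous bytes. */
    static void addToBeginning(FileSystem fs, Path file, byte[] content) throws IOException {
        ByteArrayOutputStream old = new ByteArrayOutputStream();
        try (FSDataInputStream in = fs.open(file)) {
            IOUtils.copyBytes(in, old, 4096, false); // buffer the old content in memory
        }
        try (FSDataOutputStream out = fs.create(file, true)) { // 'true' overwrites
            out.write(content);
            out.write(old.toByteArray());
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): HdfsAppendOrPrepend head|end <hdfs file> <text>
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            Path file = new Path(args[1]);
            byte[] content = args[2].getBytes("UTF-8");
            if ("head".equals(args[0])) {
                addToBeginning(fs, file, content);
            } else {
                appendToEnd(fs, file, content);
            }
        }
    }
}
```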

Delete a specified file in HDFS;

read -r -p "enter the path you want to delete: " path
hadoop fs -rm -R "$path"

Java implementation:
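
A minimal sketch with FileSystem.delete, assuming a Hadoop 2.x client. The recursive flag plays the role of -R in the shell command. HdfsDelete is a hypothetical class name.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsDelete {
    /** Deletes target; returns false if it does not exist or could not be deleted. */
    static boolean delete(FileSystem fs, Path target, boolean recursive) throws IOException {
        if (!fs.exists(target)) {
            return false;
        }
        return fs.delete(target, recursive);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml on the classpath points fs.defaultFS at HDFS
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            System.out.println(delete(fs, new Path(args[0]), true) ? "deleted" : "not deleted");
        }
    }
}
```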

Move a file in HDFS from a source path to a destination path.

read -r -p "enter source path: " sourcepath
read -r -p "enter target path: " targetpath
hadoop fs -mv "$sourcepath" "$targetpath"

Java implementation:
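
In the Java API a move within one file system is a rename. A minimal sketch, assuming a Hadoop 2.x client; HdfsMove is a hypothetical class name.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop
class HdfsMove {
    /** Moves src to dst; returns false if src is missing or the rename fails. */
    static boolean move(FileSystem fs, Path src, Path dst) throws IOException {
        if (!fs.exists(src)) {
            return false;
        }
        return fs.rename(src, dst);
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): HdfsMove <source path> <target path>
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            boolean ok = move(fs, new Path(args[0]), new Path(args[1]));
            System.out.println(ok ? "moved" : "move failed");
        }
    }
}
```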

Optional:

(2) Implement a class "MyFSDataInputStream" that extends "org.apache.hadoop.fs.FSDataInputStream". Requirement: implement a method "readLine()" that reads a specified HDFS file line by line. At the end of the file it returns null; otherwise it returns the text of the next line.

(3) Consult the Java documentation or other materials, then use "java.net.URL" and "org.apache.hadoop.fs.FsUrlStreamHandlerFactory" to output the text of a specified HDFS file to the terminal.
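
A possible sketch of such a class, assuming a Hadoop 2.x client. FSDataInputStream extends java.io.DataInputStream, whose own readLine() is declared final, so the sketch cannot override it and uses the hypothetical name readTextLine instead. It reads one byte at a time and assumes UTF-8 text, which is simple rather than fast.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class MyFSDataInputStream extends FSDataInputStream {
    // 'in' must be seekable, e.g. the stream returned by FileSystem.open
    MyFSDataInputStream(InputStream in) throws IOException {
        super(in);
    }

    /** Returns the next line without its terminator, or null at end of file. */
    String readTextLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = read()) != -1) {
            sawAny = true;
            if (b == '\n') {
                break;
            }
            line.write(b);
        }
        if (!sawAny) {
            return null; // nothing left to read
        }
        String text = line.toString("UTF-8");
        // Drop a trailing carriage return from Windows-style line endings
        return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml on the classpath points fs.defaultFS at HDFS
        try (FileSystem fs = FileSystem.get(new Configuration());
             MyFSDataInputStream in = new MyFSDataInputStream(fs.open(new Path(args[0])))) {
            String line;
            while ((line = in.readTextLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```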

Java implementation:
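
A minimal sketch for task (3), assuming a Hadoop 2.x client on the classpath. Registering FsUrlStreamHandlerFactory teaches java.net.URL the hdfs scheme; the JVM allows setURLStreamHandlerFactory to be called only once per process, hence the static initializer. HdfsUrlCat is a hypothetical class name, and the host, port and path in the comment are placeholders.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

// Hypothetical helper class, not part of Hadoop
class HdfsUrlCat {
    static {
        // May be set at most once per JVM
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    /** Copies the content behind url to out. */
    static void cat(String url, OutputStream out) throws IOException {
        InputStream in = null;
        try {
            in = new URL(url).openStream();
            IOUtils.copyBytes(in, out, 4096, false); // 'false': we close 'in' ourselves
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Example argument (placeholder values): hdfs://localhost:9000/user/hadoop/test.txt
        cat(args[0], System.out);
    }
}
```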

4, Experimental summary and problems

1. What did you learn to use, and what is it for?

2. What problems did you encounter during the experiment? How did you solve them?

3. What problems remain unresolved? What may have caused them?

Tags: Eclipse Hadoop

Posted on Fri, 01 Oct 2021 17:23:39 -0400 by juxstapose