This post will help you learn important and useful Hadoop shell commands, which are very close to Unix shell commands; using these commands you can perform different operations on HDFS. People who are familiar with Unix shell commands will pick them up easily, and people who are new to Unix or Hadoop need not worry: just follow this article to learn the commands used on a day-to-day basis and practice them.
FS Shell
The FileSystem (FS) shell is invoked by bin/hadoop fs <args>. All the FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local filesystem the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child.
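For example, on a cluster whose default filesystem points at namenodehost (a placeholder for your NameNode's hostname), the following two commands refer to the same directory:
hadoop fs -ls hdfs://namenodehost/parent/child
hadoop fs -ls /parent/child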
Administrator Commands:
fsck
Runs the HDFS filesystem checking utility.
Example: hadoop fsck /
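fsck also accepts reporting options such as -files, -blocks and -locations when more detail is needed, for example:
hadoop fsck / -files -blocks -locations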
balancer
Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the re-balancing process.
Example: hadoop balancer
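The balancer also accepts a threshold in percent of disk-usage deviation; the value shown here is only illustrative:
hadoop balancer -threshold 5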
version
Prints the version of Hadoop installed on the machine.
Example: hadoop version
Hadoop fs commands:
ls
For a file, returns stat on the file with the following format:
filename <number of replicas> filesize modification_date modification_time permissions userid groupid
For a directory, it returns a list of its direct children as in Unix. A directory is listed as:
dirname <dir> modification_date modification_time permissions userid groupid
Usage: hadoop fs -ls <args>
Example: hadoop fs -ls
lsr
Recursive version of ls. Similar to Unix ls -R.
Usage: hadoop fs -lsr <args>
Example: hadoop fs -lsr
mkdir
Takes path URIs as arguments and creates directories. The behavior is much like Unix mkdir -p, creating parent directories along the path.
Usage: hadoop fs -mkdir <paths>
Example: hadoop fs -mkdir /user/hadoop/Aravindu
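Since mkdir accepts multiple paths, several directories can be created in a single call (the directory names below are only illustrative):
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2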
mv
Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. Moving files across file systems is not permitted.
Usage: hadoop fs -mv URI [URI …] <dest>
Example: hadoop fs -mv /user/hduser/Aravindu/Consolidated_Sheet.csv /user/hduser/sandela/
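With multiple sources, the last argument must be a directory (the file names below are only illustrative):
hadoop fs -mv /user/hduser/file1 /user/hduser/file2 /user/hduser/sandela/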
put
Copies a single src, or multiple srcs, from the local file system to the destination filesystem. Also reads input from stdin and writes to the destination filesystem.
Usage: hadoop fs -put <localsrc> … <dst>
Examples:
hadoop fs -put localfile /user/hadoop/hadoopfile
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
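Because put can read from stdin, a dash can be used as the source to write whatever is piped in (destination path reused from above):
hadoop fs -put - /user/hadoop/hadoopfile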
rm
Delete files specified as args. This does not remove non-empty directories; refer to rmr for recursive deletes.
Usage: hadoop fs -rm URI [URI …]
Example: hadoop fs -rm /user/hduser/sandela/Consolidated_Sheet.csv
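Where the installation supports it, the -skipTrash option deletes the file immediately instead of moving it to the trash:
hadoop fs -rm -skipTrash /user/hduser/sandela/Consolidated_Sheet.csv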
rmr
Recursive version of rm; deletes a directory and its contents. Similar to Unix rm -r.
Usage: hadoop fs -rmr URI [URI …]
Example: hadoop fs -rmr /user/hduser/sandela
cat
The cat command concatenates and displays files; it works like the Unix cat command.
Usage: hadoop fs -cat URI [URI …]
Example: hadoop fs -cat /user/hduser/Aravindu/Consolidated_sheet.csv
chgrp
Changes the group association of files. With -R, make the change recursively through the directory structure. The user must be the owner of the file, or else a super-user.
Usage: hadoop fs -chgrp [-R] GROUP URI [URI …]
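Example (group name and path are only illustrative):
hadoop fs -chgrp -R hadoop /user/hduser/Aravindu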
chmod
Change the permissions of files. With -R, make the change recursively through the directory structure. The user must be the owner of the file, or else a super-user.
Usage: hadoop fs -chmod [-R] <MODE[,MODE]… | OCTALMODE> URI [URI …]
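Example (mode and path are only illustrative):
hadoop fs -chmod -R 755 /user/hduser/Aravindu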
chown
Change the owner of files. With -R, make the change recursively through the directory structure. The user must be a super-user.
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI …]
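Example (owner, group and path are only illustrative):
hadoop fs -chown -R hduser:hadoop /user/hduser/Aravindu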
copyFromLocal
Copies a file from the local machine into the given HDFS directory.
Usage: hadoop fs -copyFromLocal <localsrc> URI
Example: hadoop fs -copyFromLocal /home/hduser/contact_details/output/job_titles/11-26-2013part-r-00000 /user/hduser/finaloutput_112613/
Similar to the put command.
copyToLocal
Copies a file from an HDFS directory to the local directory.
Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Example: hadoop fs -copyToLocal /user/hduser/output/Expecting_result_set_112613/part-r-00000 /home/hduser/contact_wordcount/11-26-2013
Similar to the get command.
cp
Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.
Usage: hadoop fs -cp URI [URI …] <dest>
Examples:
hadoop fs -cp /user/hduser/input/Consolidated_Sheet.csv /user/hduser/Aravindu
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
du
Displays aggregate length of files contained in the directory or the length of a file in case it’s just a file.
Usage: hadoop fs -du URI [URI …]
Example: hadoop fs -du /user/hduser/Aravindu/Consolidated_Sheet.csv
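Run against a directory (the same one used above), du lists the length of each entry it contains:
hadoop fs -du /user/hduser/Aravindu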
dus
Displays a summary of file lengths.
Usage: hadoop fs -dus <args>
Example: hadoop fs -dus /user/hduser/Aravindu/Consolidated_Sheet.csv
count
Counts the number of directories, files and bytes under the paths that match the specified file pattern.
Usage: hadoop fs -count [-q] <paths>
Example: hadoop fs -count hdfs:/
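The -q option additionally reports quota information for the same paths:
hadoop fs -count -q hdfs:/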
expunge
Empty the Trash.
Usage: hadoop fs -expunge
Example: hadoop fs -expunge
setrep
Changes the replication factor of a file. The -R option recursively changes the replication factor of files within a directory, and the -w flag asks the command to wait for the replication to complete.
Usage: hadoop fs -setrep [-R] [-w] <rep> <path>
Example: hadoop fs -setrep -w 3 -R /user/hadoop/Aravindu
stat
Returns the stat information on the path.
Usage: hadoop fs -stat URI [URI …]
Example: hadoop fs -stat /user/hduser/Aravindu
tail
Displays last kilobyte of the file to stdout. -f option can be used as in Unix.
Usage: hadoop fs -tail [-f] URI
Example: hadoop fs -tail /user/hduser/Aravindu/Info
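With -f, tail keeps the file open and prints new data as it is appended, just like Unix tail -f:
hadoop fs -tail -f /user/hduser/Aravindu/Info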
text
Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
Usage: hadoop fs -text <src>
Example: hadoop fs -text /user/hduser/Aravindu/Info
touchz
Create a file of zero length.
Usage: hadoop fs -touchz URI [URI …]
Example: hadoop fs -touchz /user/hduser/Aravind/emptyfile