Hadoop Command-line: Quick reference for Common Commands

These are a few commonly used Hadoop commands.

These steps work on Ubuntu with a Hadoop installation similar to the one suggested by this Hadoop installation tutorial.

Start Hadoop and YARN

$ start-all.sh

Stop Hadoop and YARN

$ stop-all.sh
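At least on Hadoop 2.x releases, start-all.sh and stop-all.sh print a deprecation notice that points to the per-service scripts. A minimal sketch of starting and stopping HDFS and YARN separately follows. It assumes the Hadoop sbin directory and the JDK bin directory (for jps) are on your PATH; the function names start_hadoop and stop_hadoop are hypothetical helpers, not part of Hadoop.

```shell
# Start HDFS first, then YARN, then list the running Java daemons with jps.
start_hadoop() {
    start-dfs.sh && start-yarn.sh && jps
}

# Stop in reverse order: YARN first, then HDFS.
stop_hadoop() {
    stop-yarn.sh && stop-dfs.sh
}
```

After start_hadoop, the jps listing should include processes such as NameNode, DataNode, ResourceManager and NodeManager if everything came up; the exact set depends on your configuration.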

Compile a Java source file

For example, to compile a file called WordCount.java:

$ hadoop com.sun.tools.javac.Main WordCount.java
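The command above runs the JDK compiler through the hadoop launcher, which only works when the JDK's tools.jar is on Hadoop's classpath; the Apache MapReduce tutorial arranges this with export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar. As an alternative sketch, you can call javac directly and hand it the classpath reported by hadoop classpath. This assumes javac and hadoop are both on your PATH; the function name compile_job is a hypothetical helper.

```shell
# Compile a MapReduce source file with plain javac, using the
# classpath that the hadoop launcher reports for this installation.
compile_job() {
    javac -classpath "$(hadoop classpath)" "$1"
}
```

For example, compile_job WordCount.java should leave the WordCount*.class files next to the source file, ready for the jar step.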

Create a jar file

Package the class files generated by the compile step above:

$ jar -cf MyFile.jar WordCount*.class

Send a jar file to be executed on Hadoop

The general form is: hadoop jar <jarfile> <mainclass> <parameter1> <parameter2> ...

$ hadoop jar MyFile.jar WordCount /wordcount/input /wordcount/output

Note that if the main class is already specified in the jar's manifest file, you must omit it from the command line; otherwise the class name is passed to your program as its first argument.
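To illustrate the manifest case, here is a minimal sketch that records WordCount as the jar's entry point and then runs the job without naming the class. It assumes the JDK jar tool and hadoop are on your PATH and that the compiled WordCount*.class files are in the current directory; package_job and run_job are hypothetical helper names.

```shell
# Package the classes and record WordCount as the manifest's Main-Class
# (the "e" option of the JDK jar tool sets the entry point).
package_job() {
    jar cfe MyFile.jar WordCount WordCount*.class
}

# Run the jar. No main class is given, because the manifest names it;
# every argument is passed straight through to the program.
run_job() {
    hadoop jar MyFile.jar "$@"
}
```

For example, package_job followed by run_job /wordcount/input /wordcount/output submits the same job as the command above. The job typically fails if the output directory already exists, so remove it (hadoop fs -rm -r /wordcount/output) before rerunning.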

Send a file to HDFS

First, create a directory:

$ hadoop fs -mkdir /mydirectory

Then put a file in that directory:

$ hadoop fs -put path/to/your/file1.txt /mydirectory/

You can verify that your file is now on HDFS (your output may differ slightly):

$ hadoop fs -ls /mydirectory
Found 1 items
-rw-r--r-- 1 vagrant vagrant 289 2015-09-07 21:39 /mydirectory/file1.txt
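The three steps above can be sketched as one small helper. This assumes hadoop is on your PATH and that your release supports the -p option of -mkdir (Hadoop 2.x and later do); the function name hdfs_upload is hypothetical.

```shell
# Upload one local file into an HDFS directory and list the result.
# Usage: hdfs_upload <local-file> <hdfs-directory>
hdfs_upload() {
    hadoop fs -mkdir -p "$2" &&   # -p: no error if the directory already exists
    hadoop fs -put "$1" "$2/" &&  # copy the local file into that directory
    hadoop fs -ls "$2"            # show what is there now
}
```

For example, hdfs_upload path/to/your/file1.txt /mydirectory repeats the steps shown above. To read the file back, hadoop fs -cat /mydirectory/file1.txt prints it to the terminal, and hadoop fs -get /mydirectory/file1.txt . copies it to the current local directory.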