Upgrade Hadoop

This section explains how to upgrade Hadoop from one version to another on a cluster running in distributed mode, without loss of data. Before starting the upgrade procedure, make sure that no jobs are running.

COMMAND DESCRIPTION
bin/stop-mapred.sh Stop the map-reduce cluster(s) and all client applications running on the DFS cluster
bin/stop-dfs.sh Stop DFS using the shutdown command.
Install the new version of Hadoop (on all the nodes in the cluster)

bin/start-dfs.sh -upgrade Start DFS cluster with -upgrade option
bin/start-mapred.sh Start map-reduce cluster
bin/hadoop dfsadmin -finalizeUpgrade Verify that the components run properly, then finalize the upgrade once you are satisfied
If you get any errors, see Hadoop Troubleshooting.
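
The upgrade sequence above can be sketched as a small shell script. This is only an illustration, assuming it is run from HADOOP_HOME on the master node; the function names and the DRY_RUN variable are hypothetical helpers, not part of Hadoop. With DRY_RUN=1 (the default here) each command is printed instead of executed, so the order can be reviewed first.

```shell
#!/bin/sh
# Sketch of the upgrade sequence. DRY_RUN and the function names below are
# illustrative assumptions; only the bin/ commands come from the tutorial.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: stop map-reduce first, then DFS.
stop_old_cluster() {
    run bin/stop-mapred.sh
    run bin/stop-dfs.sh
}

# Step 2 (manual): install the new Hadoop version on every node.

# Step 3: start DFS with -upgrade, then map-reduce.
start_upgraded_cluster() {
    run bin/start-dfs.sh -upgrade
    run bin/start-mapred.sh
}

# Step 4: finalize only after verifying that everything runs properly.
finalize_upgrade() {
    run bin/hadoop dfsadmin -finalizeUpgrade
}
```

A typical order would be stop_old_cluster, the manual installation on all nodes, start_upgraded_cluster, and finalize_upgrade once the cluster has been checked. Set DRY_RUN=0 to actually execute the commands.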

Running Hadoop in Pseudo Distributed Mode

This section is a quick-start tutorial for installing Hadoop on Ubuntu in pseudo-distributed mode (a single-node cluster). It lists every command required, with a description of each.


COMMAND DESCRIPTION
sudo apt-get install sun-java6-jdk Install Java
If you don't have the Hadoop bundle, download Hadoop first
sudo tar xzf file_name.tar.gz Extract the Hadoop bundle
Go to your Hadoop installation directory (HADOOP_HOME)
vi conf/hadoop-env.sh Edit configuration file hadoop-env.sh and set JAVA_HOME:
export JAVA_HOME=path, where path is the root of your Java installation (e.g. /usr/lib/jvm/java-6-sun)
vi conf/core-site.xml Edit configuration file core-site.xml and type:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
vi conf/hdfs-site.xml Edit configuration file hdfs-site.xml and type:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
vi conf/mapred-site.xml Edit configuration file mapred-site.xml and type:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
sudo apt-get install openssh-server openssh-client Install ssh
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
Set up passwordless ssh
bin/hadoop namenode -format Format the new distributed filesystem
During this operation:
The name node is started
The name node is formatted
The name node is stopped
bin/start-all.sh Start the hadoop daemons
jps It should give output like this:
14799 NameNode
14977 SecondaryNameNode
15183 DataNode
15596 JobTracker
15897 TaskTracker
Congratulations, the Hadoop setup is complete
http://localhost:50070/ Web-based interface for the name node
http://localhost:50030/ Web-based interface for the job tracker
Now let's run some examples
bin/hadoop jar hadoop-*-examples.jar pi 10 100 run pi example
bin/hadoop dfs -mkdir input
bin/hadoop dfs -put conf input
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
bin/hadoop dfs -cat output/*
run grep example
bin/hadoop dfs -mkdir inputwords
bin/hadoop dfs -put conf inputwords
bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords
bin/hadoop dfs -cat outputwords/*
run wordcount example
bin/stop-all.sh Stop the hadoop daemons
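
The jps check above can also be scripted, so that a missing daemon is reported by name rather than spotted by eye. The following is a minimal sketch; check_daemons is a hypothetical helper, and it assumes jps prints one "pid name" pair per line, as in the sample output shown in the table.

```shell
# check_daemons: reads `jps` output on stdin and prints one "missing: NAME"
# line for each of the five expected daemons that does not appear.
# Returns 0 when all daemons are present, 1 otherwise.
# (Hypothetical helper, not part of Hadoop.)
check_daemons() {
    jps_output="$(cat)"
    missing=0
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # Compare against the whole second field, so that
        # "SecondaryNameNode" is not mistaken for "NameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}
```

After bin/start-all.sh, it could be used as: jps | check_daemons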

Running Hadoop in Standalone Mode

This section is a quick-start tutorial for installing Hadoop on Ubuntu in standalone mode (a single node running Hadoop as a single Java process). It lists every command required, with a description of each.


COMMAND DESCRIPTION
sudo apt-get install sun-java6-jdk Install Java
If you don't have the Hadoop bundle, download Hadoop first
sudo tar xzf file_name.tar.gz Extract the Hadoop bundle
vi conf/hadoop-env.sh Edit configuration file hadoop-env.sh and set JAVA_HOME:
export JAVA_HOME=path, where path is the root of your Java installation (e.g. /usr/lib/jvm/java-6-sun)
Go to your Hadoop installation directory (HADOOP_HOME) and type:
bin/hadoop
This will display the usage documentation for the hadoop
Congratulations, your Hadoop setup is complete. Now let's run some examples
bin/hadoop jar hadoop-*-examples.jar pi 10 100 Run pi example
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
cat output/*
Run grep example
mkdir inputwords
cp conf/*.xml inputwords
bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords
cat outputwords/*
Run wordcount example
If you get any errors while running the examples, see Hadoop Troubleshooting.
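
One common error when re-running the examples is that a Hadoop job refuses to start if its output directory already exists. The sketch below prepares the local directories for a standalone-mode example: it copies the XML files into the input directory and removes any stale output directory. prepare_example is a hypothetical helper, and the directory names are whatever you pass in.

```shell
# prepare_example INPUT_DIR OUTPUT_DIR [CONF_DIR]
# Creates INPUT_DIR if needed, copies the XML files from CONF_DIR
# (default: conf) into it, and deletes OUTPUT_DIR so that the next
# job can create it. (Hypothetical helper, not part of Hadoop.)
prepare_example() {
    if [ "$#" -lt 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: prepare_example INPUT_DIR OUTPUT_DIR [CONF_DIR]" >&2
        return 2
    fi
    input_dir="$1"
    output_dir="$2"
    conf_dir="${3:-conf}"

    # Remove the output of a previous run, if any.
    rm -rf "$output_dir"

    mkdir -p "$input_dir" || return 1
    cp "$conf_dir"/*.xml "$input_dir"/
}
```

For the grep example this could be called as: prepare_example input output, followed by the bin/hadoop jar command from the table. Note that it deletes the output directory without asking, so it should only be pointed at directories that hold example results.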