Tech Orbit: Running Hadoop in Pseudo Distributed Mode

This section explains how to install Hadoop on Ubuntu. It is a quick-start tutorial: it lists every command needed to install Hadoop in pseudo-distributed mode (a single-node cluster), each with a short description.

COMMAND	DESCRIPTION
sudo apt-get install sun-java6-jdk	Install java
	If you don't have the Hadoop bundle, download it first
sudo tar xzf file_name.tar.gz	Extract hadoop bundle
Go to your Hadoop installation directory (HADOOP_HOME)
vi conf/hadoop-env.sh	Edit configuration file hadoop-env.sh and set JAVA_HOME to the root of your Java installation, e.g. export JAVA_HOME=/usr/lib/jvm/java-6-sun
vi conf/core-site.xml then type: <configuration> <property> <name>fs.default.name</name> <value>hdfs://localhost:9000</value> </property> </configuration>	Edit configuration file core-site.xml
vi conf/hdfs-site.xml then type: <configuration> <property> <name>dfs.replication</name> <value>1</value> </property> </configuration>	Edit configuration file hdfs-site.xml
vi conf/mapred-site.xml then type: <configuration> <property> <name>mapred.job.tracker</name> <value>localhost:9001</value> </property> </configuration>	Edit configuration file mapred-site.xml
sudo apt-get install openssh-server openssh-client	Install SSH
ssh-keygen -t rsa -P ""	Set up passwordless SSH: generate an RSA key pair with an empty passphrase
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys	Authorize that key for logins to this machine
ssh localhost	Check that passwordless SSH works
bin/hadoop namenode -format	Format the new distributed filesystem. During this operation the NameNode starts, is formatted, and then stops
bin/start-all.sh	Start the hadoop daemons
jps	It should give output like this: 14799 NameNode, 14977 SecondaryNameNode, 15183 DataNode, 15596 JobTracker, 15897 TaskTracker
Congratulations, Hadoop setup is complete
http://localhost:50070/	web based interface for name node
http://localhost:50030/	web based interface for job tracker
Now let's run some examples
bin/hadoop jar hadoop-*-examples.jar pi 10 100	Run the pi example
bin/hadoop dfs -mkdir input	Grep example: create an input directory in HDFS
bin/hadoop dfs -put conf input	Copy the conf directory into it
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'	Run the grep job
bin/hadoop dfs -cat output/*	Print the grep results
bin/hadoop dfs -mkdir inputwords	Wordcount example: create an input directory in HDFS
bin/hadoop dfs -put conf inputwords	Copy the conf directory into it
bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords	Run the wordcount job
bin/hadoop dfs -cat outputwords/*	Print the wordcount results

bin/stop-all.sh	Stop the hadoop daemons
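
The three configuration edits in the table can also be scripted instead of typed into vi. The sketch below is not part of the original tutorial: it assumes a Hadoop 0.20-style layout with a conf directory, and write_pseudo_conf is a made-up helper name, not a Hadoop command. It overwrites existing files of the same name, so back them up first.

```shell
#!/bin/sh
# Sketch only: writes the three config files with the values from the table above.
# write_pseudo_conf is a hypothetical helper name, not part of Hadoop.
write_pseudo_conf() {
  conf_dir="$1"
  mkdir -p "$conf_dir"

  # core-site.xml: where clients find the HDFS NameNode
  cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

  # hdfs-site.xml: one replica per block, since there is only one DataNode
  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

  # mapred-site.xml: where clients find the JobTracker
  cat > "$conf_dir/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Assumption: run from HADOOP_HOME, or set HADOOP_CONF_DIR to point elsewhere.
write_pseudo_conf "${HADOOP_CONF_DIR:-./conf}"
```

Keeping each file in its own heredoc keeps the XML readable and makes it harder to mistype a file name such as mapred-site.xml.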

89 comments:

Sayali, February 15, 2011 at 8:27 PM
Hi,
I am trying to run Hadoop in pseudo-distributed mode using the Cloudera VM.
I have copied the files into HDFS via Hue, and I am trying to run them as a job using the job browser.
But it dies with a "permission denied" error.
Could you help me with this?
Thanks,
Sayali

Rahul, February 15, 2011 at 9:13 PM
Hi Sayali,
When jobs run, some files are created.
I think you have not given permission to create files at that location (hadoop.tmp.dir).
Please give the current user full permissions on it,
or you can install as the root user.
For Cloudera you can refer to:
http://cloudera-tutorial.blogspot.com/
For Hue you can refer to:
http://hivetutorial.wordpress.com/

Sreeharsha, February 20, 2011 at 9:26 PM
Hi, I am getting the following errors. Can you please help?

11/02/21 00:23:23 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 0 time(s).
11/02/21 00:23:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 1 time(s).
11/02/21 00:23:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 2 time(s).
11/02/21 00:23:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 3 time(s).

Rahul, February 20, 2011 at 10:10 PM
Hi,
Please run the jps command to make sure all the daemons are running.
Also, if you run a job right after bin/start-all.sh, you may get this error because the NameNode is still in safe mode, so wait 10 to 30 seconds and try the same job again.
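
The two checks in this reply can be scripted. This is a sketch under assumptions: missing_daemons is a made-up helper name, the five daemon names are the ones shown in the jps row of the tutorial, jps is assumed to be on the PATH, and the script is assumed to run from HADOOP_HOME. The dfsadmin -safemode wait command blocks until the NameNode reports that safe mode is off, which avoids guessing at a sleep interval.

```shell
#!/bin/sh
# Sketch: report which of the five expected daemons are absent from jps output.
# missing_daemons is a hypothetical helper; it reads jps output on stdin.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
    # jps prints "<pid> <ClassName>"; compare the class name as a whole field
    # so that NameNode does not match SecondaryNameNode.
    printf '%s\n' "$jps_output" \
      | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$daemon"
  done
}

# Only touch the cluster when the tools are actually available.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
if [ -x bin/hadoop ]; then
  # Blocks until the NameNode has left safe mode.
  bin/hadoop dfsadmin -safemode wait
fi
```

If the first step prints nothing, all five daemons are up; any name it prints is a daemon whose log file is worth reading before retrying the job.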

Pramathesh, April 6, 2011 at 12:08 AM
We are trying to set up a database in Hive. Every time we create tables, insert data, and end our session by stopping all Hadoop daemons, we lose the data we inserted, because we have to run hadoop namenode -format each time and that seems to wipe everything. Is there any way to retain the data created in Hive?

Rahul, April 6, 2011 at 12:15 AM
You do not need to run "hadoop namenode -format" every time you start the cluster.
Just run start-all.sh to start the Hadoop daemons.
Formatting the NameNode erases all the data.

Pramathesh, April 6, 2011 at 1:35 AM
Thanks, Rahul, for your prompt reply. What you said holds only while we keep Ubuntu running. When we restart the machine, we have to format the NameNode, otherwise http://localhost:50070/ won't work. So maybe we need to start the NameNode in some other way, right?

Rahul, April 6, 2011 at 2:19 AM
When you restart your machine, just restart the Hadoop daemons by running "start-all.sh".
Once the daemons are up, http://localhost:50070/ will work.
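
A possible cause of the problem Pramathesh describes, offered as a guess rather than something confirmed in this thread: hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, the NameNode keeps its metadata under that directory unless told otherwise, and Ubuntu normally empties /tmp at boot. If that is what happens here, pointing hadoop.tmp.dir at a persistent directory in conf/core-site.xml should let the data survive a restart. The sketch below only prints the property to add; the helper name print_tmp_dir_property and the default path $HOME/hadoop-data are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical persistent location: any directory outside /tmp that the
# Hadoop user can write to would do.
HADOOP_DATA_DIR="${HADOOP_DATA_DIR:-$HOME/hadoop-data}"

# print_tmp_dir_property is a made-up helper. It prints the <property> element
# to paste inside <configuration> in conf/core-site.xml.
print_tmp_dir_property() {
  cat <<EOF
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$1</value>
  </property>
EOF
}

print_tmp_dir_property "$HADOOP_DATA_DIR"
```

After such a change the NameNode has to be formatted once more, because the new directory starts out empty; from then on start-all.sh alone should be enough after a reboot.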

vedab, April 24, 2011 at 11:15 PM
Hi, can you tell me how to install Hadoop and run it in pseudo-distributed mode on Windows 7?

Rahul, April 25, 2011 at 7:16 AM
Hi,
I would suggest installing Hadoop on Linux.
If you want to install it on Windows:
first install Cygwin from http://www.cygwin.com/ , which provides a Linux-like environment;
then you can follow the tutorial above to install Hadoop.

Anonymous, November 9, 2011 at 1:39 AM
Hi, I get the following errors when I try to run the wordcount example and many of the other examples provided with Hadoop. Can you help me?

11/02/21 00:23:23 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 0 time(s).
11/02/21 00:23:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 1 time(s).
11/02/21 00:23:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 2 time(s).
11/02/21 00:23:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 3 time(s).

Rahul, November 9, 2011 at 2:39 AM
Please make sure all the Hadoop daemons are running by running the jps command.
Also make sure that the NameNode is not in safe mode.

Please check the logs in case of any error.

waqas, November 17, 2011 at 1:48 AM
Hi, when I run the command bin/start-all.sh I get an error. Can you please help me sort it out? Here is the error message:

localhost: starting secondarynamenode, logging to /data/hadoop/hadoop-0.20.2/bin/../logs/hadoop-waqas-secondarynamenode-trinity.out
localhost: Exception in thread "main" java.lang.NumberFormatException: For input string: "localhost:9000"
localhost: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
localhost: at java.lang.Integer.parseInt(Integer.java:492)
localhost: at java.lang.Integer.parseInt(Integer.java:527)
localhost: at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:146)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:131)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:115)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)

Rahul, November 17, 2011 at 2:58 AM
There might be a mistake in your configuration files,
such as core-site.xml, hdfs-site.xml, or mapred-site.xml.
Please post the contents of your configuration files.

waqas, November 17, 2011 at 3:06 AM
I edited my configuration files exactly as you suggested in this blog post.

Rahul, November 17, 2011 at 3:08 AM
Please post your configuration files so I can help you with this.

waqas, November 17, 2011 at 3:13 AM
Oh, yes, I rechecked and there was an error in my configuration files. Thanks for your time and suggestion.

Bhavesh, November 30, 2011 at 3:35 AM
Hi,
I have configured Hadoop and Hive on Windows through Cygwin, but I am facing a problem: when I enter a query at the Hive CLI prompt (hive>), the query does not execute and the terminal stays busy.

If I enter the query like this: bin/hive -e 'LOAD DATA INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;'

the output is:
Hive history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah_201111301549_1377455380.txt
FAILED: Parse Error: line 1:17 mismatched input 'kv1' expecting StringLiteral near 'INPATH' in load statement

What could be the problem? Please advise.

Rahul, November 30, 2011 at 3:44 AM
You need to create the file "kv1.txt" and provide the path to it.

If the error persists, please post the contents of the log file (history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah......)

Bhavesh, November 30, 2011 at 5:08 AM
I have already put that file in the same directory; that is why I wrote kv1.txt.
It is the StringLiteral error that is actually causing the problem, and I don't know why.

One more thing: I cannot find that directory, i.e. /tmp/Bhavesh.Shah/...

What should I do now? :(

Bhavesh, November 30, 2011 at 5:17 AM
Hi,
I just found the error log file.
Its content is:
---------
SessionStart SESSION_ID="Bhavesh.Shah_201111301549" TIME="1322648344557"

Sorry for the multiple posts.

Rahul, December 2, 2011 at 12:24 AM
Change the outer single quotes to double quotes,
and also add the LOCAL keyword.
The correct query would be:

bin/hive -e "LOAD DATA LOCAL INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;"

The following link may be useful:
http://hivebasic.blogspot.com/
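
Why the outer quotes matter can be seen without Hive at all, because the shell processes the quotes before the query is handed over. The sketch below uses a made-up helper, show_arg, that simply prints the argument it receives; it stands in for bin/hive -e and is not part of Hive.

```shell
#!/bin/sh
# show_arg is a hypothetical stand-in for "bin/hive -e": it prints its first
# argument exactly as the shell passes it on.
show_arg() {
  printf '%s\n' "$1"
}

# Nested single quotes: the first inner quote closes the string, so the shell
# joins three pieces and the quotes around kv1.txt disappear.
show_arg 'LOAD DATA INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;'

# Double quotes outside: the inner single quotes survive, so the query still
# contains a quoted string literal.
show_arg "LOAD DATA LOCAL INPATH 'kv1.txt' OVERWRITE INTO TABLE pokes;"
```

The first call prints the query with a bare kv1.txt, which matches the parse error above (mismatched input 'kv1' expecting StringLiteral at column 17); the second prints it with the quotes intact.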

Bhavesh, December 2, 2011 at 12:55 AM
I have one more question.
When I run this query through the Hive CLI, I get an error:

$ bin/hive -e "insert overwrite table pokes select a.* from invites a where a.ds='2008-08-15';"
bin/hive -e "insert overwrite table pokes select a.* from invites a where a.ds='2008-08-15';"
Hive history file=/tmp/Bhavesh.Shah/hive_job_log_Bhavesh.Shah_201112021007_2120318983.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201112011620_0004, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201112011620_0004
Kill Command = C:\cygwin\home\Bhavesh.Shah\hadoop-0.20.2\/bin/hadoop job -Dmapred.job.tracker=localhost:9101 -kill job_201112011620_0004
2011-12-02 10:07:30,777 Stage-1 map = 0%, reduce = 0%
2011-12-02 10:07:57,796 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201112011620_0004 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

I think the map-reduce job is not starting and hence is not executed.
What could the solution be?
Thanks.

Rahul, December 2, 2011 at 1:14 AM
Please go through your error logs and post the detailed exception.

Bhavesh, December 2, 2011 at 1:19 AM
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2011-12-02 12:29:19,275 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2011-12-02 12:29:23,011 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2011-12-02 12:29:58,749 ERROR exec.MapRedTask (SessionState.java:printError(343)) - Ended Job = job_201112011620_0006 with errors
2011-12-02 12:29:58,858 ERROR ql.Driver (SessionState.java:printError(343)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

Bhavesh, December 5, 2011 at 1:50 AM
Hello Rahul,
This time I have configured Hadoop on Linux (Ubuntu) and am trying to set up Hive. But there is one problem while configuring Hive:

After successfully building the package with ant, when I try to launch the Hive CLI from the Hive directory, I get this error:
"Missing Hive Builtins Jar: /home/hadoop/hive-0.7.1/hive/lib/hive-builtins-*.jar"

What could be wrong in the configuration? Please advise as soon as possible.

Rahul, December 5, 2011 at 2:50 AM
You don't need to build Hive to configure it.
I didn't build it.

You can follow this approach:
1. Install Hadoop
2. Set HADOOP_HOME
3. Untar Hive*.tar.gz
4. Go to HIVE_HOME and type bin/hive
The Hive shell should open.

Kiran Zende, April 16, 2012 at 4:45 AM
I am getting this error:

hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info is missing!
...my NameNode is running, but the JobTracker is not.

Kiran Zende, April 16, 2012 at 4:48 AM
When I searched in the /tmp folder, there was no directory called Mapred/system/jobtracker.info

Kiran Zende, April 16, 2012 at 6:21 AM
INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = DEFTeam-N5-PC/192.168.2.104
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9101
INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
INFO org.mortbay.log: jetty-6.1.14
INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50030
INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9101
INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager. The Recovery manager failed to access the system files in the system dir (hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system).
WARN org.apache.hadoop.mapred.JobTracker: It might be because the JobTracker failed to read/write system files (hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info / hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info.recover) or the system file hdfs://127.0.0.1:9100/tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info is missing!
WARN org.apache.hadoop.mapred.JobTracker: Bailing out...
WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: org.apache.hadoop.ipc.RemoteException: java.io.IOException: failed to create file /tmp/hadoop-DEFTeam-N5/mapred/system/jobtracker.info on client 127.0.0.1.
Requested replication 0 is less than the required minimum 1

at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:238)
at org.apache.hadoop.mapred.JobTracker$RecoveryManager.updateRestartCount(JobTracker.java:1168)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1657)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)

2012-04-16 18:48:39,832 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /127.0.0.1:9101 : Address already in use: bind

Nathan, April 18, 2012 at 1:01 AM
Hi,

When I run the jps command in Cygwin, it says "command not found". Any idea how to run jps in Cygwin?

Thanks,
Nathan

Kiran Zende, April 19, 2012 at 10:24 PM
Hi,
A few days back I installed Hadoop 0.20.2 on Windows 7 through Cygwin, and it is working fine. I ran the wordcount example from the command prompt without problems.

I would like to know about Cloudera. I am going through the Cloudera videos: is there any difference between a Cloudera installation and installing Hadoop through Cygwin? If I want to learn Cloudera Hadoop in detail, should I set up Hadoop through Cloudera? Can it be installed on Windows? Please help.

Kiran Zende, April 20, 2012 at 12:00 AM
Thanks for the information.
I am going to use the Pentaho BI tool with Apache Hadoop. I found a Pentaho document that says to create a virtual machine by installing VMware Player and then Ubuntu for the Hadoop installation. Will that be useful?

Elie, May 7, 2012 at 1:12 PM
Hello. I am a newcomer to Hadoop. I followed the instructions at http://alans.se/blog/2010/hadoop-hbase-cygwin-windows-7-x64/. When I run the test
"bin/hadoop jar hadoop-*examples*.jar grep input output 'dfs[a-z.]+'"
I get an exception following a set of errors
"Retrying connect to server: /127.0.0.1:9101. Already tried"

I checked that Safe Mode is off. Any thoughts on what I can try to make sure I can get this working?

Kiran Zende, May 11, 2012 at 2:33 AM
Hi Rahul, can we have common storage for both HBase and Hive? I want to retrieve data from both, so I would like to know whether I can set up common storage for data coming from both HBase and Hive. Please help.

waqas, May 14, 2012 at 5:26 AM
Hi Rahul,
I am trying to configure Hadoop 1.0. When I run the command for the pi example, I get the following error:

Number of Maps = 10
Samples per Map = 100
java.lang.RuntimeException: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:265)
at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
at org.apache.hadoop.ipc.Client.call(Client.java:1071)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
... 17 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)

What might the problem be? Thanks

waqas

waqas, May 14, 2012 at 7:26 AM
I got rid of that problem, and now I am getting this error in the pi example.

Error message, part 1 (due to the limit of 4096 words):

Number of Maps = 10
Samples per Map = 100
12/05/14 16:21:47 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/waqas/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1556)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

at org.apache.hadoop.ipc.Client.call(Client.java:1066)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3507)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3370)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2586)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2826)

Kiran Zende, May 22, 2012 at 10:58 PM
Hi Rahul, I am using Hadoop 0.20.2 and it works fine, but sometimes the NameNode stops working. To resolve this I have to delete the Hadoop image stored in the tmp directory and run the namenode format step. That fixes the issue, but all the data is lost. Is there any other way to resolve it?

Kiran Zende, May 25, 2012 at 7:41 AM
There is no error log. I reformatted and it is working fine now. Thanks.

Kiran Zende, May 25, 2012 at 7:41 AM
I have one more question:

I want to retrieve data from HBase and Hive. My dimension tables are stored in HBase and my fact tables in Hive, so how can I use Hive to integrate and retrieve data from HBase? I am using Hive 0.9.0 and HBase 0.92.0. I heard that from Hive 0.9.0 onwards we can retrieve existing data from HBase, but I don't know how. Please help.

Kiran Zende, May 28, 2012 at 1:19 AM
My issue is this. Taking a simple example, if the dimension table is in HBase, e.g. (prodid, prodname, date), and the fact table is in Hive, e.g. (prodid, sales), I would like to know how to integrate them if I want to print the output (prodname, sales, date).

The link provided, "https://cwiki.apache.org/Hive/hbaseintegration", says that the HBase table is to be created through Hive. However, in my case the HBase table was not created through Hive. I am using Hive 0.9 and HBase 0.92.1. Please help.

Kiran Zende, May 29, 2012 at 9:20 PM
Thanks for the help.

new, July 12, 2012 at 4:04 AM
Hi Rahul, I have installed hadoop-0.20 using Cygwin on Windows 7, and the NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker are working fine.
I have set up the configuration in the Eclipse IDE and run the wordcount example; that also works fine.
The problem is that if I stop all the daemons with stop-all.sh and run my wordcount example,
it still runs without any error and produces the output file.
I don't know how it is working. Any ideas, please?
Thanks,
Nitai

Unknown, September 17, 2012 at 10:31 PM
Hi Rahul, I am new to Hadoop and have set up my first multi-node cluster. Everything works except the jps command, which shows the error "-bash: jps: command not found". Please tell me where I am going wrong. I am using CentOS 6.

Unknown, November 22, 2012 at 8:15 PM
Hi, I have changed JAVA_HOME in the hadoop-env.sh file to usr/lib/jvm/java-6-openjdk,
but the terminal shows the error "JAVA_HOME is not set".
What should I do?

Unknown, November 23, 2012 at 3:03 AM
Hi,
I have downloaded Hadoop 1.0.4 and extracted it into a folder.
How do I install it?

Unknown, November 23, 2012 at 3:04 AM
How do I go to the Hadoop installation directory (HADOOP_HOME)?

Unknown, November 24, 2012 at 3:43 AM
Hi,
when I run bin/hadoop jar hadoop-*-examples.jar pi 10 100 it gives an error:
cannot unzip the zip file. Please help.

maverick, December 4, 2012 at 2:22 AM
When executing start-all, some of the daemons are not started.
For the DataNode I see this backtrace:

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [ld-linux-x86-64.so.2+0x868f] double+0xcf
C [ld-linux-x86-64.so.2+0xa028] _dl_relocate_object+0x588
C [ld-linux-x86-64.so.2+0x102d5] double+0x3d5
C [ld-linux-x86-64.so.2+0xc1f6] _dl_catch_error+0x66
C [libdl.so.2+0x11fa] double+0x6a

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
j java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+300
j java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+347
j java.lang.Runtime.loadLibrary0(Ljava/lang/Class;Ljava/lang/String;)V+54
j java.lang.System.loadLibrary(Ljava/lang/String;)V+7
j org.apache.hadoop.util.NativeCodeLoader.<clinit>()V+25
v ~StubRoutines::call_stub
j org.apache.hadoop.io.nativeio.NativeIO.<clinit>()V+13
v ~StubRoutines::call_stub
j org.apache.hadoop.fs.FileUtil.setPermission(Ljava/io/File;Lorg/apache/hadoop/fs/permission/FsPermission;)V+22
j org.apache.hadoop.fs.RawLocalFileSystem.setPermission(Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+6
j org.apache.hadoop.fs.FilterFileSystem.setPermission(Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+6
j org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(Lorg/apache/hadoop/fs/LocalFileSystem;Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)Z+40
j org.apache.hadoop.util.DiskChecker.checkDir(Lorg/apache/hadoop/fs/LocalFileSystem;Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/fs/permission/FsPermission;)V+3
j org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+74
j org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+99
j org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode([Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter$SecureResources;)Lorg/apache/hadoop/hdfs/server/datanode/DataNode;+3

Any idea on this ??

Sagar, December 11, 2012 at 10:25 AM
I am trying to connect from a client machine to a Hive server. I installed Hive and Hadoop on the client, and I am able to run Hive. I have copied hive-site.xml from the server. But whenever I run any query, it gives me this error:

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGFPE (0x8) at pc=0x00002aaaaaab368f, pid=25697, tid=1076017472
#
# JRE version: 6.0_31-b04
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [ld-linux-x86-64.so.2+0x868f] double+0xcf
#
# An error report file with more information is saved as:
# /usr/local/hcat/hs_err_pid25697.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

buffalo37, January 7, 2013 at 9:55 PM
I have a problem when I run the wordcount example. I have checked /etc/hosts and the config files (core-site.xml, hdfs-site.xml and mapred-site.xml). Could you please check it for me?

[hadoop@Hadoop hadoop]$ bin/hadoop jar hadoop-examples-1.1.1.jar wordcount input output
Warning: $HADOOP_HOME is deprecated.

13/01/06 13:27:18 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:19 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 1 time(s); retry policy is
13/01/06 13:27:22 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:23 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:24 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:25 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:26 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:27 INFO ipc.Client: Retrying connect to server: Hadoop/10.57.250.186:6868. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/01/06 13:27:27 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.net.ConnectException: Call to Hadoop/10.57.250.186:6868 failed on connection exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Call to Hadoop/10.57.250.186:6868 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
at org.apache.hadoop.ipc.Client.call(Client.java:1112)
...