<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Malgun Gothic";
        panose-1:0 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"\@Malgun Gothic";}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page Section1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.Section1
        {page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal>Hi list,<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>We have a cluster with 44 cores on which we are evaluating OpenNebula.
All nodes run 64-bit Ubuntu Server. I&#8217;m trying to get Hadoop 0.20.1 to
work, so I&#8217;m following the cluster setup steps on hadoop.apache.org.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>As a start I created three VMs running Ubuntu 9.10 32-bit
desktop edition. After installing Sun&#8217;s 1.6 JRE I put Hadoop into my
homedir. I configured the three Hadoop installations as follows:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>== conf/hadoop-env.sh ==<o:p></o:p></p>
<p class=MsoNormal>Set JAVA_HOME to the appropriate directory<o:p></o:p></p>
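<p class=MsoNormal>Concretely, that is a one-line change; the path below is an assumption on my part, adjust it to wherever the Sun JRE actually lives on your nodes:<o:p></o:p></p>

```shell
# conf/hadoop-env.sh -- the JRE path is an assumption; point it at the
# actual Sun Java 6 install location on each node.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```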
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>== conf/core-site.xml ==<o:p></o:p></p>
<p class=MsoNormal>Set fs.default.name to the IP address of the designated
namenode, on port 9000:<o:p></o:p></p>
<p class=MsoNormal><property><o:p></o:p></p>
<p class=MsoNormal> <name>fs.default.name</name><o:p></o:p></p>
<p class=MsoNormal> <value>hdfs://XXX.XXX.X.XXX:9000/</value><o:p></o:p></p>
<p class=MsoNormal> </property><o:p></o:p></p>
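<p class=MsoNormal>(For completeness: each of these property snippets sits inside the usual configuration root element, so the complete file looks like the sketch below; the address is masked, as above. The same shape applies to hdfs-site.xml and mapred-site.xml.)<o:p></o:p></p>

```
<?xml version="1.0"?>
<!-- conf/core-site.xml: minimal sketch; namenode address masked -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://XXX.XXX.X.XXX:9000/</value>
  </property>
</configuration>
```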
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>== conf/hdfs-site.xml ==<o:p></o:p></p>
<p class=MsoNormal>Set dfs.name.dir to a directory in my homedir:<o:p></o:p></p>
<p class=MsoNormal><property><o:p></o:p></p>
<p class=MsoNormal> <name>dfs.name.dir</name><o:p></o:p></p>
<p class=MsoNormal> <value>/home/cloud/var/log/hadoop/</value><o:p></o:p></p>
<p class=MsoNormal></property><o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>== conf/mapred-site.xml ==<o:p></o:p></p>
<p class=MsoNormal>Set mapred.job.tracker to the IP address of the designated
jobtracker, on port 9001:<o:p></o:p></p>
<p class=MsoNormal><property><o:p></o:p></p>
<p class=MsoNormal> <name>mapred.job.tracker</name><o:p></o:p></p>
<p class=MsoNormal> <value>XXX.XXX.X.XXX:9001</value><o:p></o:p></p>
<p class=MsoNormal></property><o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Apart from the Hadoop configuration I manually set the hostnames of
the namenode, jobtracker and slave (datanode &amp; tasktracker) to
hadoop-namenode, hadoop-jobtracker and hadoop-slave01 respectively.<o:p></o:p></p>
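<p class=MsoNormal>I assume each VM also needs to be able to resolve the other two names; something along these lines (addresses masked, as above), with the slave name listed in conf/slaves on the masters:<o:p></o:p></p>

```
# /etc/hosts on every VM (addresses masked)
XXX.XXX.X.XXX   hadoop-namenode
XXX.XXX.X.XXX   hadoop-jobtracker
XXX.XXX.X.XXX   hadoop-slave01

# conf/slaves on the namenode and jobtracker
hadoop-slave01
```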
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>I&#8217;m able to start HDFS with bin/start-dfs.sh and
MapReduce with bin/start-mapred.sh without any exceptions. However, when I then
try to copy files onto HDFS, I get the following exception:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>$ bin/hadoop fs -put conf input<o:p></o:p></p>
<p class=MsoNormal>09/11/11 05:24:47 WARN hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/user/cloud/input/capacity-scheduler.xml could only be replicated to 0 nodes,
instead of 1<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)<o:p></o:p></p>
<p class=MsoNormal> at
sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)<o:p></o:p></p>
<p class=MsoNormal> at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)<o:p></o:p></p>
<p class=MsoNormal> at java.lang.reflect.Method.invoke(Method.java:597)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)<o:p></o:p></p>
<p class=MsoNormal> at java.security.AccessController.doPrivileged(Native
Method)<o:p></o:p></p>
<p class=MsoNormal> at
javax.security.auth.Subject.doAs(Subject.java:396)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.Client.call(Client.java:739)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)<o:p></o:p></p>
<p class=MsoNormal> at $Proxy0.addBlock(Unknown Source)<o:p></o:p></p>
<p class=MsoNormal> at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)<o:p></o:p></p>
<p class=MsoNormal> at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)<o:p></o:p></p>
<p class=MsoNormal> at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)<o:p></o:p></p>
<p class=MsoNormal> at
java.lang.reflect.Method.invoke(Method.java:597)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)<o:p></o:p></p>
<p class=MsoNormal> at $Proxy0.addBlock(Unknown Source)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)<o:p></o:p></p>
<p class=MsoNormal> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)<o:p></o:p></p>
<p class=MsoNormal> at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>09/11/11 05:24:47 WARN hdfs.DFSClient: Error Recovery for
block null bad datanode[0] nodes == null<o:p></o:p></p>
<p class=MsoNormal>09/11/11 05:24:47 WARN hdfs.DFSClient: Could not get block
locations. Source file "/user/cloud/input/capacity-scheduler.xml" -
Aborting...<o:p></o:p></p>
<p class=MsoNormal>put: java.io.IOException: File
/user/cloud/input/capacity-scheduler.xml could only be replicated to 0 nodes,
instead of 1<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Can anybody shed light on this? I&#8217;m guessing it&#8217;s
a configuration issue, so that&#8217;s the direction I&#8217;m looking in.<o:p></o:p></p>
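<p class=MsoNormal>For instance, the first thing I plan to check is the datanode log on hadoop-slave01. The log line below is a made-up example of the symptom I&#8217;d grep for, since a datanode that cannot reach the namenode is a common cause of &#8220;replicated to 0 nodes&#8221;:<o:p></o:p></p>

```shell
# Made-up example of a datanode log line; 'Retrying connect to server'
# would mean the datanode cannot reach the namenode in fs.default.name.
sample_line='INFO ipc.Client: Retrying connect to server: hadoop-namenode/XXX.XXX.X.XXX:9000. Already tried 9 time(s).'
# In practice: grep 'Retrying connect' logs/hadoop-*-datanode-*.log
if echo "$sample_line" | grep -q 'Retrying connect'; then
  echo 'datanode cannot reach the namenode'
fi
```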
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Another question I have is more general, about getting Hadoop
to work on a cloud. The issue I foresee concerns the IP addresses of the
masters and slaves. How do I dynamically configure the Hadoop instances during
start-up of the images so that I end up with a namenode, a jobtracker and a number of
slaves? I&#8217;ll need the IP addresses of all machines, and every machine needs
a unique hostname&#8230; Does anybody have any experience with this?<o:p></o:p></p>
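<p class=MsoNormal>To make the question concrete, this is the kind of boot-time sketch I have in mind; every name and path here is a placeholder, and I&#8217;d expect the namenode address to come in via OpenNebula&#8217;s contextualization:<o:p></o:p></p>

```shell
# Sketch: generate core-site.xml at boot from a contextualization
# parameter. NAMENODE_IP and HADOOP_CONF are placeholders; the default
# IP below is only so the sketch runs standalone.
NAMENODE_IP=${NAMENODE_IP:-127.0.0.1}
HADOOP_CONF=${HADOOP_CONF:-$(mktemp -d)}
cat > "$HADOOP_CONF/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$NAMENODE_IP:9000/</value>
  </property>
</configuration>
EOF
echo "wrote $HADOOP_CONF/core-site.xml"
```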
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Thanks in advance!<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>Evert
Lammerts<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>Advisor<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>SARA
Computing &amp; Network Services <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>High
Performance Computing &amp; Visualization<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>eScience
Support Group<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>Phone:
+31 20 888 4101<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New"'>Email:
evert.lammerts@sara.nl<o:p></o:p></span></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>