<html><body><div style="color:#000; background-color:#fff; font-family:arial, helvetica, sans-serif;font-size:12pt"><div><span><br></span></div>Hello Shank,<br><br>Thank you very much for your description.<br><br>However, we are using virtual machines, so any changes and configuration made on a virtual node are lost when it terminates. (1)<br>Each time we launch a cluster, we may change the number of slave nodes, depending on the problem size. (2)<br><br>So could you kindly help me clarify one small thing once more, as I am not yet familiar with Cloudera?<br><br>For example, we do not know in advance how many slave nodes will be launched (because of (2)) or what their IP addresses will be, so the configuration of /etc/hosts has to be left until the cluster is launched. If the cluster is large, this would be difficult. Do you configure this manually? Or would an OpenNebula context script help? Or does Cloudera handle it?<br><br>Thank you again.<br><br>Quynh<br><br><br>Anyway, 30 minutes for a 50-VM cluster is impressive. Which propagation method did you use? How big is the VM image? I ask because a typical LAN gives only about 12 MB/s (100 Mbps) when copying with scp. (I have not tried NFS yet.)<br><br><div style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"><div id="yiv593088101"><div><div><span></span></div></div><div>Well, you can do the following:<div><br></div><div><ul><li>Create
a master node template and a slave node template, and configure them
so that they keep that relationship (one-way SSH key, etc.).</li><li>Deploy the master node and configure the software on it, then deploy as many slave nodes as needed to connect back to the master node.</li><li>In
our configuration, we deploy M identical nodes, then we pick one of the
nodes as the master and install the master node software (Cloudera Manager
in our case).</li><li>Then we use Cloudera Manager to deploy the rest of the nodes. In our
case this includes one HDFS name node, one job tracker, (M-3) HDFS data
nodes and (M-3) MapReduce task trackers.</li><li>We have deployed around 50 VMs within a 30-minute period using this configuration.</li></ul></div><div><br></div></div><div>Shank</div><div><br></div><div><div class="yiv593088101gmail_quote">On Fri, Jul 13, 2012 at 2:58 AM, Quynh Le <span dir="ltr">&lt;<a rel="nofollow" ymailto="mailto:lhnquynh@yahoo.com" target="_blank" href="mailto:lhnquynh@yahoo.com">lhnquynh@yahoo.com</a>&gt;</span> wrote:<br>
<blockquote class="yiv593088101gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div style="font-family:arial, helvetica, sans-serif;font-size:12pt;">Hello Shankhadeep,<br><br>Thank you for your information. I have been able to set up such a virtual cluster using another cloud middleware, OpenNebula, so I can understand the situation. What I would like to clarify is:<br>
- This is a kind of master/slave cluster: 1 head node and N worker nodes.<br>- We can launch a group of VMs to make N+1 VMs for the cluster. <br>- Then, do you have to set up the Hadoop master node and worker nodes manually, or are the VMs automatically configured as "1 master + N workers"?<br>
- In this case, how many VM images do you use? 1 VM image for the master node and 1 for the worker nodes, or 1 for all?<br><br>I look forward to hearing about your experience.<br><br>Cheers,<br>Quynh</div><br> </blockquote></div></div></div></div></div><br></div></body></html>