Hello Community,

I want to build a cloud to automate the deployment of an application we use in the company. The application is hard-disk intensive, so I want to use the LVM driver in order to spread the load. Building a central NFS (shared) server that can deliver the needed performance is not possible right now; I may do just that in the future.
I started playing with OpenNebula a few weeks ago and configured the infrastructure to use shared storage. Today I updated to 3.0RC1 and reconfigured ONE to use LVM: I removed the hosts from ONE, commented out the tm_shared section and uncommented the LVM Transfer Manager Driver Configuration.
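For reference, the relevant part of my oned.conf now looks roughly like this (paraphrased from memory, so treat it as approximate rather than a verbatim copy):

#TM_MAD = [
#    name       = "tm_shared",
#    executable = "one_tm",
#    arguments  = "tm_shared/tm_shared.conf" ]

TM_MAD = [
    name       = "tm_lvm",
    executable = "one_tm",
    arguments  = "tm_lvm/tm_lvm.conf" ]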
ONE is installed on Ubuntu 10.04, on the frontend and the nodes alike.

The documentation [1] states: "The Transfer Manager makes the assumption that the block devices defined in the VM template are available in all the nodes". I have looked through the scripts, and the assumption is both true and false at the same time. If the source defined in the template points to a "/dev" LV, then yes, the assumption holds; but if you want to store the images in a central location, for example the frontend (my case), you can do just that, and the Transfer Manager driver first creates the LV and then dumps the image onto it through SSH.
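In essence, for a file-backed source the clone step reduces to something like this (my paraphrase of tm_clone.sh with placeholder variables, not the literal script; it matches what the VM log shows further down):

ssh $DST_HOST mkdir -p $DST_DIR
ssh $DST_HOST sudo lvcreate -L$SIZE -n $LV_NAME $VG_NAME
ssh $DST_HOST ln -s /dev/$VG_NAME/$LV_NAME $DST_DIR/disk.0
cat $SRC_PATH | ssh $DST_HOST sudo dd of=/dev/$VG_NAME/$LV_NAME bs=64k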
I have one frontend (front01) and two nodes (node01 and node02). All three are added as hosts in ONE, and all three have a VG called vg0. I configured ONE to let it know about this by setting VG_NAME="vg0" in /etc/one/tm_lvm/tm_lvmrc. I then tried to start a VM without any further configuration, and sudo started complaining. It didn't complain for nothing: oneadmin didn't have the rights to create LVs, so I configured sudo to allow oneadmin to create them.
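The sudo rule I added on the nodes looks roughly like this (lvcreate lives in /sbin on Ubuntu 10.04; I also allowed lvremove since deleting the VM will presumably need it):

# /etc/sudoers.d/oneadmin (or add the line via visudo)
oneadmin ALL=(root) NOPASSWD: /sbin/lvcreate, /sbin/lvremove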
The VM I want to deploy has the following template:

(front01)$ onetemplate show 0
TEMPLATE 0 INFORMATION
ID            : 0
NAME          : debian-squezee
USER          : oneadmin
GROUP         : oneadmin
REGISTER TIME : 09/15 15:16:18
PUBLIC        : No

TEMPLATE CONTENTS
CPU=1
DISK=[
  BUS=ide,
  DRIVER=qcow2,
  IMAGE_ID=0 ]
DISK=[
  READONLY=no,
  SIZE=1024,
  TYPE=swap ]
GRAPHICS=[
  LISTEN=0.0.0.0,
  TYPE=vnc ]
MEMORY=1024
NAME=debian-squezee
NIC=[
  IP=192.168.150.11,
  NETWORK_ID=0 ]
OS=[
  ARCH=x86_64 ]
TEMPLATE_ID=0

As can be seen from the output above, the first disk points to IMAGE_ID=0. The image is configured as follows:
(front01)$ oneimage show 0
IMAGE 0 INFORMATION
ID            : 0
NAME          : debian-squeeze
USER          : oneadmin
GROUP         : oneadmin
TYPE          : OS
REGISTER TIME : 09/15 15:13:44
PUBLIC        : Yes
PERSISTENT    : No
SOURCE        : /var/lib/one/images/5c55b64e99e37b5b8b0d7dde918a91f1
SIZE          : 612
STATE         : rdy
RUNNING_VMS   : 0

IMAGE TEMPLATE
DESCRIPTION="Debian Squeeze."
DEV_PREFIX=hd
NAME=debian-squeeze
PATH=/home/images/debian-squeeze.img
TYPE=OS

/home/images/debian-squeeze.img is a qcow2-formatted image with a fresh Debian install, created from scratch with kvm.
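For completeness, the image was created more or less like this (the exact size and ISO name are from memory, so take them as approximate):

(front01)$ qemu-img create -f qcow2 /home/images/debian-squeeze.img 5G
(front01)$ kvm -m 1024 -cdrom debian-squeeze-netinst.iso -drive file=/home/images/debian-squeeze.img,if=ide -boot d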
When I instantiate the template, the following happens. On the host where the VM is deployed (node02), the LV is created on vg0:

(node02)# lvdisplay
  --- Logical volume ---
  LV Name                /dev/vg0/lv-one-19-0
  VG Name                vg0
  LV UUID                q2QSE5-6Uuw-dnB5-LW46-y3Me-nHR9-5Q4hp5
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                1.00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           251:6

And /var/log/one/$VM_ID.log (the VM_ID is 19 in this case) on the frontend (front01):

Tue Sep 27 18:00:58 2011 [DiM][I]: New VM state is ACTIVE.
Tue Sep 27 18:00:58 2011 [LCM][I]: New VM state is PROLOG.
Tue Sep 27 18:00:58 2011 [VM][I]: Virtual Machine has no context
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: front01:/var/lib/one/images/b92349a9f4ad479facf95f6706777d11 node02:/var/lib/one//19/images/disk.0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: DST: /var/lib/one//19/images/disk.0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Creating directory /var/lib/one//19/images
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 mkdir -p /var/lib/one//19/images".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Creating LV lv-one-19-0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 sudo lvcreate -L1G -n lv-one-19-0 vg0".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 ln -s /dev/vg0/lv-one-19-0 /var/lib/one//19/images/disk.0".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Dumping Image
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "eval cat /var/lib/one/images/b92349a9f4ad479facf95f6706777d11 | ssh node02 sudo dd of=/dev/vg0/lv-one-19-0 bs=64k".
Tue Sep 27 18:02:06 2011 [TM][I]: ExitCode: 0
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Creating 1024Mb image in /var/lib/one//19/images/disk.1
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 mkdir -p /var/lib/one//19/images".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 dd if=/dev/zero of=/var/lib/one//19/images/disk.1 bs=1 count=1 seek=1024M".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Initializing swap space
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 mkswap /var/lib/one//19/images/disk.1".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 chmod a+w /var/lib/one//19/images/disk.1".
Tue Sep 27 18:02:08 2011 [TM][I]: ExitCode: 0
Tue Sep 27 18:02:08 2011 [LCM][I]: New VM state is BOOT
Tue Sep 27 18:02:08 2011 [VMM][I]: Generating deployment file: /var/lib/one/19/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: Command execution fail: 'if [ -x "/var/tmp/one/vmm/kvm/deploy" ]; then /var/tmp/one/vmm/kvm/deploy /var/lib/one//19/images/deployment.0 node02 19 node02; else exit 42; fi'
Tue Sep 27 18:02:39 2011 [VMM][I]: error: Failed to create domain from /var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: error: monitor socket did not show up.: Connection refused
Tue Sep 27 18:02:39 2011 [VMM][E]: Could not create domain from /var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: ExitCode: 255
Tue Sep 27 18:02:39 2011 [VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [DiM][I]: New VM state is FAILED

The next thing I did was check the qemu log (/var/log/libvirt/qemu/one-19.log) on node02:
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 1024 -smp 1 -name one-19 -uuid 4ed49c9f-b05b-b961-7fb2-4d24e06d0146 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/one-19.monitor,server,nowait -monitor chardev:monitor -boot c -drive file=/var/lib/one//19/images/disk.0,if=ide,index=0,boot=on,format=qcow2 -drive file=/var/lib/one//19/images/disk.1,if=ide,index=3,format=raw -net nic,macaddr=02:00:c0:a8:96:0c,vlan=0,name=nic.0 -net tap,fd=40,vlan=0,name=tap.0 -serial none -parallel none -usb -vnc 0.0.0.0:19 -vga cirrus
qemu: could not open disk image /var/lib/one//19/images/disk.0: Operation not permitted

I noticed that /var/lib/one/19/images/disk.0 is a symlink to /dev/vg0/lv-one-19-0, which in turn is a symlink to /dev/mapper/vg0-lv--one--19--0, and the latter has the following permissions:
brw-rw---- 1 root disk 251, 6 2011-09-27 18:01 /dev/mapper/vg0-lv--one--19--0

Since I use Ubuntu 10.04 on the hosts, the first thing that came to mind was that AppArmor doesn't allow oneadmin to access the LV, so I stopped it, unloaded the AppArmor profiles from the kernel and tried again: same result. I even thought (and I guess this is wrong, because kvm and qemu run as root) that the oneadmin user needs to be in the disk group to have access to the LV. I added oneadmin to the disk group and tried again: exactly the same behavior, only the VM_ID changes in the logs.
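For the record, these are roughly the commands I used for those two attempts (Ubuntu 10.04 init scripts; paths may differ elsewhere):

# stop AppArmor and unload its profiles from the kernel
(node02)# /etc/init.d/apparmor stop
(node02)# /etc/init.d/apparmor teardown

# add oneadmin to the disk group
(node02)# usermod -a -G disk oneadmin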
So what should I do next? Taking into account that I have only two weeks of experience with ONE, I am out of options. Please shed some light on how to debug this further.

[1] http://www.opennebula.org/documentation:rel3.0:lvm
Thank you and have a great day,
v
--
network warrior