[one-users] OpenNebula 3.0RC1 with LVM

Valentin Bud valentin.bud at gmail.com
Tue Sep 27 08:24:57 PDT 2011


Hello Community,

I want to build a cloud to automate the deployment of an application we use
in the company. The application is hard-disk intensive, so I want to use the
LVM driver to spread the I/O load across the nodes. It is not possible right
now to build a central NFS (shared) server that can deliver the needed
performance, although I am considering doing just that in the future.
I started playing with OpenNebula a few weeks ago and configured the
infrastructure to use shared storage. Today I updated to 3.0RC1 and
reconfigured ONE to use LVM. For that I removed the hosts from ONE,
commented out the tm_shared section in oned.conf and uncommented the LVM
Transfer Manager Driver Configuration section.
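
For reference, the LVM section of oned.conf now looks roughly like this
(quoted from memory, so treat it as a sketch rather than an exact copy of
the shipped file):

    TM_MAD = [
        name       = "tm_lvm",
        executable = "one_tm",
        arguments  = "tm_lvm/tm_lvm.conf" ]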

First of all, ONE is installed on Ubuntu 10.04, on the frontend as well as
on the nodes.

The documentation [1] states: "The Transfer Manager makes the assumption
that the block devices defined in the VM template are available in all the
nodes". I have looked through the scripts, and the assumption is true and
false at the same time. If the source defined in the template points to a
"/dev" LV, then yes, the assumption holds. But if you want to store the
images in a central location, for example on the frontend (my case), you
can do just that: the Transfer Manager first creates the LV on the node and
then dumps the image onto it through SSH.
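
In other words, as far as I can tell the clone logic boils down to
something like this (my own paraphrase of tm_clone.sh, not the actual
script; the variable names are made up):

    case $SRC_PATH in
    /dev/*)
        # source is already a block device on the node: just link it
        ssh $DST_HOST "ln -s $SRC_PATH $DST_PATH"
        ;;
    *)
        # source is a file on the frontend: create an LV, link it into
        # the VM directory, then stream the image into it over SSH
        ssh $DST_HOST "sudo lvcreate -L$SIZE -n $LV_NAME $VG_NAME"
        ssh $DST_HOST "ln -s /dev/$VG_NAME/$LV_NAME $DST_PATH"
        cat $SRC_PATH | ssh $DST_HOST "sudo dd of=/dev/$VG_NAME/$LV_NAME bs=64k"
        ;;
    esac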

I have one frontend (front01) and two nodes (node01 and node02). All three
of them are added as hosts in ONE, and all three have a VG called vg0. I
configured ONE accordingly by setting VG_NAME="vg0" in
/etc/one/tm_lvm/tm_lvmrc. I then tried to start a VM without any further
configuration, and sudo started complaining. And it did not complain for
nothing: oneadmin did not have the rights to create LVs, so I configured
sudo to allow oneadmin to do that.
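
The sudoers entry I added looks roughly like this (written from memory, and
the binary paths may differ on other distributions):

    oneadmin ALL=(root) NOPASSWD: /sbin/lvcreate, /sbin/lvremove, /bin/dd

This matches what the TM scripts execute over SSH (sudo lvcreate, sudo dd;
see the logs further down).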

The VM I want to deploy has the following template:
(front01)$ onetemplate show 0
TEMPLATE 0 INFORMATION

ID             : 0
NAME           : debian-squezee
USER           : oneadmin
GROUP          : oneadmin
REGISTER TIME  : 09/15 15:16:18
PUBLIC         : No

TEMPLATE CONTENTS

CPU=1
DISK=[
  BUS=ide,
  DRIVER=qcow2,
  IMAGE_ID=0 ]
DISK=[
  READONLY=no,
  SIZE=1024,
  TYPE=swap ]
GRAPHICS=[
  LISTEN=0.0.0.0,
  TYPE=vnc ]
MEMORY=1024
NAME=debian-squezee
NIC=[
  IP=192.168.150.11,
  NETWORK_ID=0 ]
OS=[
  ARCH=x86_64 ]
TEMPLATE_ID=0
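
I instantiate this template with the standard CLI, something like:

    (front01)$ onetemplate instantiate 0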

As can be seen from the template contents above, the first disk points to
IMAGE_ID=0. That image is configured as follows:
(front01)$ oneimage show 0
IMAGE 0 INFORMATION

ID             : 0
NAME           : debian-squeeze
USER           : oneadmin
GROUP          : oneadmin
TYPE           : OS
REGISTER TIME  : 09/15 15:13:44
PUBLIC         : Yes
PERSISTENT     : No
SOURCE         : /var/lib/one/images/5c55b64e99e37b5b8b0d7dde918a91f1
SIZE           : 612
STATE          : rdy
RUNNING_VMS    : 0

IMAGE TEMPLATE

DESCRIPTION="Debian Squeeze."
DEV_PREFIX=hd
NAME=debian-squeeze
PATH=/home/images/debian-squeeze.img
TYPE=OS

The file /home/images/debian-squeeze.img is a qcow2-formatted image with a
fresh Debian install, created from scratch with KVM.
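
For the record, the image was built along these lines (disk size and ISO
name quoted from memory, so this is only a sketch):

    (front01)$ qemu-img create -f qcow2 /home/images/debian-squeeze.img 8G
    (front01)$ kvm -m 1024 -hda /home/images/debian-squeeze.img \
        -cdrom debian-6.0.2.1-amd64-netinst.iso -boot d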

When I instantiate the template with ID 0, the following things happen. On
the host where the VM is deployed (node02), the LV is created on vg0:
(node02)# lvdisplay
 --- Logical volume ---
  LV Name                /dev/vg0/lv-one-19-0
  VG Name                vg0
  LV UUID                q2QSE5-6Uuw-dnB5-LW46-y3Me-nHR9-5Q4hp5
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                1.00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           251:6

On the frontend (front01), /var/log/one/$VM_ID.log (VM_ID is 19 in this
case) shows:

Tue Sep 27 18:00:58 2011 [DiM][I]: New VM state is ACTIVE.
Tue Sep 27 18:00:58 2011 [LCM][I]: New VM state is PROLOG.
Tue Sep 27 18:00:58 2011 [VM][I]: Virtual Machine has no context
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh:
front01:/var/lib/one/images/b92349a9f4ad479facf95f6706777d11
node02:/var/lib/one//19/images/disk.0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: DST:
/var/lib/one//19/images/disk.0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Creating directory
/var/lib/one//19/images
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 mkdir -p
/var/lib/one//19/images".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Creating LV lv-one-19-0
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 sudo
lvcreate -L1G -n lv-one-19-0 vg0".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "ssh node02 ln -s
/dev/vg0/lv-one-19-0 /var/lib/one//19/images/disk.0".
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Dumping Image
Tue Sep 27 18:02:06 2011 [TM][I]: tm_clone.sh: Executed "eval cat
/var/lib/one/images/b92349a9f4ad479facf95f6706777d11 | ssh node02 sudo dd
of=/dev/vg0/lv-one-19-0 bs=64k".
Tue Sep 27 18:02:06 2011 [TM][I]: ExitCode: 0
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Creating 1024Mb image in
/var/lib/one//19/images/disk.1
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 mkdir
-p /var/lib/one//19/images".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 dd
if=/dev/zero of=/var/lib/one//19/images/disk.1 bs=1 count=1 seek=1024M".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Initializing swap space
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 mkswap
/var/lib/one//19/images/disk.1".
Tue Sep 27 18:02:08 2011 [TM][I]: tm_mkswap.sh: Executed "ssh node02 chmod
a+w /var/lib/one//19/images/disk.1".
Tue Sep 27 18:02:08 2011 [TM][I]: ExitCode: 0
Tue Sep 27 18:02:08 2011 [LCM][I]: New VM state is BOOT
Tue Sep 27 18:02:08 2011 [VMM][I]: Generating deployment file:
/var/lib/one/19/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: Command execution fail: 'if [ -x
"/var/tmp/one/vmm/kvm/deploy" ]; then /var/tmp/one/vmm/kvm/deploy
/var/lib/one//19/images/deployment.0 node02 19 node02; else
             exit 42; fi'
Tue Sep 27 18:02:39 2011 [VMM][I]: error: Failed to create domain from
/var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: error: monitor socket did not show up.:
Connection refused
Tue Sep 27 18:02:39 2011 [VMM][E]: Could not create domain from
/var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [VMM][I]: ExitCode: 255
Tue Sep 27 18:02:39 2011 [VMM][E]: Error deploying virtual machine: Could
not create domain from /var/lib/one//19/images/deployment.0
Tue Sep 27 18:02:39 2011 [DiM][I]: New VM state is FAILED

The next thing I did was check the qemu log on node02
(/var/log/libvirt/qemu/one-19.log). Here it is:

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 1024 -smp 1
-name one-19 -uuid 4ed49c9f-b05b-b961-7fb2-4d24e06d0146 -chardev
socket,id=monitor,path=/var/lib/libvirt/qemu/one-19.monitor,server,nowait
-monitor chardev:monitor -boot c -drive
file=/var/lib/one//19/images/disk.0,if=ide,index=0,boot=on,format=qcow2
-drive file=/var/lib/one//19/images/disk.1,if=ide,index=3,format=raw -net
nic,macaddr=02:00:c0:a8:96:0c,vlan=0,name=nic.0 -net
tap,fd=40,vlan=0,name=tap.0 -serial none -parallel none -usb -vnc
0.0.0.0:19 -vga cirrus
qemu: could not open disk image /var/lib/one//19/images/disk.0: Operation
not permitted

I have noticed that /var/lib/one/19/images/disk.0 is a symlink to
/dev/vg0/lv-one-19-0, which in turn is a symlink to
/dev/mapper/vg0-lv--one--19--0, and the latter has the following
permissions:

brw-rw---- 1 root disk 251, 6 2011-09-27 18:01
/dev/mapper/vg0-lv--one--19--0
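
One more thing I can try is a direct read test of the device as oneadmin,
to separate device permissions from everything else (assuming sudo on the
node lets me switch to that user):

    (node02)$ sudo -u oneadmin dd if=/dev/mapper/vg0-lv--one--19--0 \
        of=/dev/null bs=64k count=16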

I use Ubuntu 10.04 on the hosts, so the first thing that came to mind was
that AppArmor does not allow oneadmin to access the LV. I shut it down,
unloaded the AppArmor profiles from the kernel and tried again: same
result. I even thought, although I guess this is wrong because kvm and qemu
run as root, that the oneadmin user needs to be in the disk group to have
access to the LV. I added oneadmin to the disk group and tried again:
exactly the same behavior, only the VM_ID changes in the logs.
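
Concretely, what I ran on the node was roughly this (commands from memory):

    (node02)# /etc/init.d/apparmor stop
    (node02)# /etc/init.d/apparmor teardown
    (node02)# usermod -a -G disk oneadmin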

So, what should I try next? Taking into account that I have only two weeks
of experience with ONE, I am out of options. Please shed some light on how
to debug this further.

[1] http://www.opennebula.org/documentation:rel3.0:lvm

Thank you and have a great day,
v
-- 
network warrior