[one-users] Transfer manager

Marcos Dias de Assuncao marcosd at csse.unimelb.edu.au
Fri Nov 21 17:33:55 PST 2008


Hi all,

We would like to thank the members of this list for the prompt help
we have received. We fully understand that moving to a new release
with new features takes a lot of effort before the final version is
fully functional. That said, we are happy to test OpenNebula and will
certainly report any problems we find. We do not mean to be a
nuisance; we simply hope the comments are helpful.

We have noticed a few things with the svn version of OpenNebula. We
were testing the transfer manager last Friday and had to reboot the
computers. I ran some tests this Saturday and found some intriguing
things. After the reboot, the hosts' free memory as reported by the
xen probe looked a little strange. We then started OpenNebula with no
hosts and added the two hosts again (gieseking and billabong), each
with 2 GB of RAM and a few MB reserved for dom0 in xend-config.sxp.
The command 'onehost list' reported:

  HID NAME                      RVM   TCPU   FCPU   ACPU    TMEM    FMEM STAT
    0 gieseking                   0    200    196    200 2094080  129024   on
    1 billabong                   0    200    200    200 2094080  129024   on

We submitted a request whose template is:

NAME     = susebox
CPU      = 1
MEMORY   = 256
OS       = [kernel="/boot/vmlinuz-2.6.25.18-0.2-xen",initrd="/boot/initrd-2.6.25.18-0.2-xen",kernel_cmd="rw",root="hda1"]
DISK     = [source="/home/oneadmin/vm/domains/opensuse11/OpenSuse11.img",target="hda1",readonly="no", clone="no"]
NIC      = [mac="00:16:3e:01:01:01"]
#GRAPHICS = [type="vnc",listen="127.0.0.1",port="5900"]

and of course, 'onevm list' reported the VM as pending because the  
hosts did not have enough memory:

   ID     NAME STAT CPU     MEM        HOSTNAME        TIME
    6  susebox pend   0       0                 00 00:02:19
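
The pending state follows from simple arithmetic on the figures
above. A rough sketch, assuming the scheduler only places a VM on a
host whose reported FMEM (in KB) covers the requested MEMORY (in MB);
the actual scheduler logic is not shown in this report, and the
variable names are ours:

```shell
#!/bin/bash
# Values copied from the template and the 'onehost list' output above.
requested_mb=256      # MEMORY in the VM template (MB)
fmem_kb=129024        # FMEM reported for each host (KB)

free_mb=$((fmem_kb / 1024))

# Assumed rule (not taken from the scheduler source): the host must have
# at least as much free memory as the VM requests.
if [ "$free_mb" -ge "$requested_mb" ]; then
    verdict="fits"
else
    verdict="pending"
fi
echo "host free: ${free_mb} MB, requested: ${requested_mb} MB -> $verdict"
```

With these numbers each host offers 126 MB against a request of 256 MB.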

We found that the xen probe (ONE_LOCATION/lib/im_probes/xen.rb) uses  
'sudo xm info' to obtain information about the host. The command on  
billabong reveals:

total_memory           : 2045
free_memory            : 126
max_free_memory        : 1678
max_para_memory        : 1674
max_hvm_memory         : 1662

If I start the same VM directly on the host, xm will create the  
domain. The command 'xm info' shows the following during the VM's  
execution:

free_memory            : 1
max_free_memory        : 1422
max_para_memory        : 1418
max_hvm_memory         : 1406

A new run of 'sudo xm info' on billabong after the VM is shut down  
shows:

total_memory           : 2045
free_memory            : 257
max_free_memory        : 1678
max_para_memory        : 1674
max_hvm_memory         : 1662

And 'onehost list' now shows the updated information:

  HID NAME                      RVM   TCPU   FCPU   ACPU    TMEM    FMEM STAT
    0 gieseking                   0    200    191    200 2094080  129024   on
    1 billabong                   0    200    200    200 2094080  263168   on

The reason for the change in free_memory is that dom0 balloons out
when needed to free memory for domUs. We believe the probe should
report max_free_memory instead, so we changed the following lines of
ONE_LOCATION/lib/im_probes/xen.rb:

when 'free_memory'
         memory_info[:free]=columns[1].to_i*1024

to:

when 'max_free_memory'
         memory_info[:free]=columns[1].to_i*1024
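
To illustrate the difference between the two fields, here is a small
shell sketch (not the probe itself, which is Ruby) that parses the
'xm info' lines quoted above and converts them from MB to KB the way
the probe does. The helper name xm_field_kb is hypothetical, and the
sample text stands in for the output of 'sudo xm info':

```shell
#!/bin/bash
# Sample 'xm info' lines copied from the billabong output above (MB).
xm_info_sample='total_memory           : 2045
free_memory            : 126
max_free_memory        : 1678'

# Hypothetical helper: read 'xm info' text on stdin and print the value
# of the requested field converted from MB to KB.
xm_field_kb() {
    local field=$1
    awk -F':' -v key="$field" '
        { gsub(/[ \t]/, "", $1) }        # strip padding around the field name
        $1 == key { print $2 * 1024 }    # MB -> KB, same factor as the probe
    '
}

free_kb=$(printf '%s\n' "$xm_info_sample" | xm_field_kb free_memory)
max_free_kb=$(printf '%s\n' "$xm_info_sample" | xm_field_kb max_free_memory)

echo "free_memory:     $free_kb KB"
echo "max_free_memory: $max_free_kb KB"
```

With the sample values, free_memory gives 129024 KB (the FMEM that
onehost showed), while max_free_memory gives 1718272 KB, which is
well above the 256 MB requested by the VM.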

With this change, onehost reports the maximum free memory and the VMs
are scheduled. However, since we use NFS and do not clone the virtual
machine images, we hit another problem: when we submitted a VM, it
failed and vm.log showed:

Sat Nov 22 11:39:04 2008 [DiM][I]: New VM state is ACTIVE.
Sat Nov 22 11:39:04 2008 [LCM][I]: New VM state is PROLOG.
Sat Nov 22 11:39:04 2008 [TM][I]: tm_ln.sh: Link /home/oneadmin/vm/domains/opensuse11/OpenSuse11.img
Sat Nov 22 11:39:04 2008 [TM][I]: tm_ln.sh: ERROR: Command "ln -s /home/oneadmin/vm/domains/opensuse11/OpenSuse11.img /home/oneadmin/OpenNebula/var/7/images/disk.0" failed.
Sat Nov 22 11:39:04 2008 [TM][I]: tm_ln.sh: ERROR: ln: creating symbolic link `/home/oneadmin/OpenNebula/var/7/images/disk.0': No such file or directory
Sat Nov 22 11:39:04 2008 [TM][E]: Error excuting image transfer script: ln: creating symbolic link `/home/oneadmin/OpenNebula/var/7/images/disk.0': No such file or directory
Sat Nov 22 11:39:05 2008 [DiM][I]: New VM state is FAILED

We solved the issue by modifying
ONE_LOCATION/lib/tm_commands/nfs/tm_ln.sh from:

#!/bin/bash

SRC=$1
DST=$2

. $ONE_LOCATION/libexec/tm_common.sh

SRC_PATH=`arg_path $SRC`
DST_PATH=`arg_path $DST`

log "Link $SRC_PATH"
exec_and_log "ln -s $SRC_PATH $DST_PATH"

to:

#!/bin/bash

SRC=$1
DST=$2

. $ONE_LOCATION/libexec/tm_common.sh

SRC_PATH=`arg_path $SRC`
DST_PATH=`arg_path $DST`

DST_DIR=`dirname $DST_PATH`

log "Creating directory $DST_DIR"
exec_and_log "mkdir -p $DST_DIR"
exec_and_log "chmod a+w $DST_DIR"

case $SRC in
http://*)
     log "Cannot link website $SRC"
     exit -1
     ;;

*)
     log "Link $SRC_PATH"
     exec_and_log "ln -s $SRC_PATH $DST_PATH"
     ;;
esac
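
The effect of the added 'mkdir -p' can be reproduced outside
OpenNebula. The following is only a sketch under our own assumptions:
it uses a temporary directory and made-up file names in place of the
real image and VM directories, and calls ln directly instead of going
through exec_and_log:

```shell
#!/bin/bash
# Scratch area standing in for the NFS-shared directories (hypothetical paths).
work=$(mktemp -d)
src="$work/OpenSuse11.img"
dst="$work/var/7/images/disk.0"
touch "$src"

# First attempt: the parent directory of the link does not exist yet,
# so ln -s fails, as in the vm.log excerpt above.
if ln -s "$src" "$dst" 2>/dev/null; then
    first_try=ok
else
    first_try=failed
fi

# Second attempt: create the destination directory first, as the
# modified tm_ln.sh does, then create the link.
mkdir -p "$(dirname "$dst")"
ln -s "$src" "$dst"
if [ -L "$dst" ]; then
    second_try=ok
else
    second_try=failed
fi

echo "before mkdir -p: $first_try"
echo "after mkdir -p:  $second_try"
rm -rf "$work"
```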


After this last change, our VM worked like a charm and vm.log showed:

Sat Nov 22 11:50:03 2008 [DiM][I]: New VM state is ACTIVE.
Sat Nov 22 11:50:03 2008 [LCM][I]: New VM state is PROLOG.
Sat Nov 22 11:50:03 2008 [TM][I]: tm_ln.sh: Creating directory /home/oneadmin/OpenNebula/var/9/images
Sat Nov 22 11:50:03 2008 [TM][I]: tm_ln.sh: Executed "mkdir -p /home/oneadmin/OpenNebula/var/9/images".
Sat Nov 22 11:50:03 2008 [TM][I]: tm_ln.sh: Executed "chmod a+w /home/oneadmin/OpenNebula/var/9/images".
Sat Nov 22 11:50:03 2008 [TM][I]: tm_ln.sh: Link /home/oneadmin/vm/domains/opensuse11/OpenSuse11.img
Sat Nov 22 11:50:03 2008 [TM][I]: tm_ln.sh: Executed "ln -s /home/oneadmin/vm/domains/opensuse11/OpenSuse11.img /home/oneadmin/OpenNebula/var/9/images/disk.0".
Sat Nov 22 11:50:03 2008 [LCM][I]: New VM state is BOOT
Sat Nov 22 11:50:03 2008 [VMM][I]: Generating deployment file: /home/oneadmin/OpenNebula/var/9/deployment.0
Sat Nov 22 11:50:03 2008 [VMM][I]: Command: scp /home/oneadmin/OpenNebula/var/9/deployment.0 billabong:/home/oneadmin/OpenNebula/var/9/images/deployment.0
Sat Nov 22 11:50:03 2008 [VMM][I]: Copy success
Sat Nov 22 11:50:03 2008 [VMM][I]: Setting credits for the VM
Sat Nov 22 11:50:03 2008 [VMM][I]: Command: sudo /usr/sbin/xm create /home/oneadmin/OpenNebula/var/9/images/deployment.0 \&\& sudo /usr/sbin/xm sched-cred -d one-9 -w 256
Sat Nov 22 11:50:08 2008 [LCM][I]: New VM state is RUNNING
Sat Nov 22 11:50:35 2008 [VMM][I]: Monitor Information:
         CPU   : 0
         Memory: 262144
         Net_TX: 3
         Net_RX: 3


Another small thing we noticed is that 'onevm submit' no longer
returns the VM ID.

Again, we are very grateful for your help.

Regards,

Marcos


Marcos Dias de Assuncao
Grid Computing and Distributed Systems (GRIDS) Laboratory
Department of Computer Science and Software Engineering
The University of Melbourne, Australia
Email: marcosd at csse.unimelb.edu.au




