[one-users] VM life cycle - error handling

Carlos Martín Sánchez cmartin at opennebula.org
Wed Mar 28 02:44:06 PDT 2012


Hi Danny,


On Thu, Mar 22, 2012 at 6:48 PM, Danny Sternkopf <danny.sternkopf at csc.fi>
 wrote:
>
> 1) onevm shutdown fails:
> [...]
>
> However ONE already released the VMs IP and assigned it to another VM which
> of course cause a clash. I wonder if this is intended to work like this?
> Obviously ONE knows that the VM is still running so it should keep the
> associated IP allocated.
>

The network leases and the disk images are released only once the VM reaches
the DONE state. If the shutdown timed out and the VM returned to RUNNING,
this should not happen. Are you sure the OpenNebula VM is in the RUNNING
state, or did I misunderstand you?
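
To make the expected behaviour concrete, here is a toy model of that rule. It is only a sketch, not OpenNebula code: the class, the method names and the IP address are made up for illustration, and the real life cycle has more states than the three shown here.

```python
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    SHUTDOWN = "SHUTDOWN"
    DONE = "DONE"


class ToyVm:
    """Toy model of the lease rule; names are hypothetical, not OpenNebula's."""

    def __init__(self, ip):
        self.state = State.RUNNING
        self.ip = ip
        self.lease_held = True

    def request_shutdown(self):
        # A graceful shutdown was requested; the lease is still held.
        self.state = State.SHUTDOWN

    def shutdown_result(self, succeeded):
        if succeeded:
            # Only here, on reaching DONE, is the network lease released.
            self.state = State.DONE
            self.lease_held = False
        else:
            # Timeout: the VM goes back to RUNNING and keeps its lease.
            self.state = State.RUNNING


vm = ToyVm("192.0.2.10")  # address from the documentation range
vm.request_shutdown()
vm.shutdown_result(succeeded=False)
print(vm.state.value, vm.lease_held)
```

If the VM in your case really is back in RUNNING, the lease should still be held, as in the failed-shutdown path above.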


> 2) onevm delete fails:
> It is similar to 1). virsh destroy gives an error (ExitCode: 42), but the
> transfer manager is wiping the disks even though the VM is still running.
> (but might be not fully functional anymore.) I also wonder if this makes
> any sense? In this case neither the user nor the administrator realize that
> the VM is still running unless you check the physical host locally or you
> take a look at the VM's log file.
>

Yes, in this case OpenNebula assumes that the destroy action always
succeeds. Unlike the graceful shutdown action, the VM is not monitored
after the delete action.
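
Since the VM is not monitored after the delete, one option is to check the domain state on the host yourself. The following is a minimal sketch that assumes virsh is available on the host and that the connection URI is qemu:///system; the function names are hypothetical, and the list of states is the one documented for `virsh domstate`.

```python
import subprocess

# States printed by `virsh domstate` in which the guest still exists on the host.
ACTIVE_STATES = {"running", "idle", "paused", "in shutdown", "pmsuspended"}


def is_domain_active(domstate_output):
    """Return True if the text printed by `virsh domstate` names an active state."""
    return domstate_output.strip().lower() in ACTIVE_STATES


def query_domstate(domain, uri="qemu:///system"):
    """Run `virsh domstate` for a domain; return '' if virsh reports an error.

    The URI default is an assumption; adjust it to your libvirt setup.
    """
    result = subprocess.run(
        ["virsh", "-c", uri, "domstate", domain],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else ""
```

Run on the physical host, `is_domain_active(query_domstate("one-20740"))` would tell you whether the deleted VM's domain is still alive.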


As a workaround for these erratic virsh failures, you can set a number of
retries for the IM and VMM drivers in oned.conf, using the -r option [1]:

IM_MAD = [
    name       = "im_kvm",
    executable = "one_im_ssh",
    arguments  = "-r 3 -t 15 kvm" ]

VM_MAD = [
    name       = "vmm_kvm",
    executable = "one_vmm_exec",
    arguments  = "-t 15 -r 3 kvm",
    default    = "vmm_exec/vmm_exec_kvm.conf",
    type       = "kvm" ]
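
The idea behind a retry setting can be sketched as follows. This is not the drivers' actual implementation: the function name is hypothetical, and I am assuming here that the retry count means extra attempts after the first one.

```python
import time


def run_with_retries(action, retries=3, delay=0.0):
    """Call action() until it returns exit code 0 or the retries run out.

    Assumption: `retries` counts extra attempts after the first one.
    Returns (last_exit_code, attempts_made).
    """
    total = 1 + retries
    exit_code = None
    for attempt in range(1, total + 1):
        exit_code = action()
        if exit_code == 0:
            return exit_code, attempt
        if attempt < total:
            time.sleep(delay)  # wait before the next attempt
    return exit_code, total


# Simulated flaky command: fails twice with exit code 42, then succeeds.
outcomes = iter([42, 42, 0])
print(run_with_retries(lambda: next(outcomes)))
```

A transient failure like the ExitCode 42 you saw would be absorbed as long as one of the attempts succeeds; a persistent failure is still reported after the last attempt.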

Regards

[1] http://opennebula.org/documentation:documentation:devel-vmm
--
Carlos Martín, MSc
Project Engineer
OpenNebula - The Open-source Solution for Data Center Virtualization
www.OpenNebula.org | cmartin at opennebula.org |
@OpenNebula <http://twitter.com/opennebula>



On Thu, Mar 22, 2012 at 6:48 PM, Danny Sternkopf <danny.sternkopf at csc.fi> wrote:

> Hi,
>
> I do encounter (very rarely as it seems) problems where VMs are not
> properly deleted or shut off by onevm commands. I use ONE 3.0, hosts
> running Fedora15 and KVM and libvirt.
>
> 1) onevm shutdown fails:
> I can see in the VM log file that the shutdown operation timed out and the
> VM is still running. Unfortunately I don't see the reason why 'virsh
> shutdown' failed. There is nothing in the system or libvirt logs. It looks
> for me that virsh can't properly communicate to the libvirtd. That is still
> harmless. However ONE already released the VMs IP and assigned it to
> another VM which of course cause a clash. I wonder if this is intended to
> work like this? Obviously ONE knows that the VM is still running so it
> should keep the associated IP allocated.
>
> 2) onevm delete fails:
> It is similar to 1). virsh destroy gives an error (ExitCode: 42), but the
> transfer manager is wiping the disks even though the VM is still running.
> (but might be not fully functional anymore.) I also wonder if this makes
> any sense? In this case neither the user nor the administrator realize that
> the VM is still running unless you check the physical host locally or you
> take a look at the VM's log file.
>
> I was not able to find out why virsh failed and could not reproduce it.
> The hosts are healthy, but might have a strange problem at the very moment
> when users requested a VM shutdown or deletion.
>
> For example I could manually run the same command ONE was executing later
> on and it worked. (/var/tmp/one/vmm/kvm/cancel one-20740 n020504 20740
> n020504)
>
> Any hints to libvirt issue?
>
> Regards,
>
> Danny