[one-users] VM life cycle - error handling

Danny Sternkopf danny.sternkopf at csc.fi
Thu Mar 22 10:48:17 PDT 2012


Hi,

I occasionally (very rarely, it seems) run into problems where VMs are 
not properly deleted or shut down by onevm commands. I use ONE 3.0 with 
hosts running Fedora 15, KVM and libvirt.

1) onevm shutdown fails:
I can see in the VM log file that the shutdown operation timed out and 
the VM is still running. Unfortunately I can't see why 'virsh shutdown' 
failed; there is nothing in the system or libvirt logs. It looks to me 
as if virsh can't communicate properly with libvirtd. That alone would 
be harmless. However, ONE has already released the VM's IP and assigned 
it to another VM, which of course causes a clash. Is it intended to 
work like this? ONE evidently knows that the VM is still running, so it 
should keep the associated IP allocated.
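
As an illustration of the check described in 1), here is a minimal Python 
sketch, not ONE's actual code, that asks libvirt for the domain state 
before the IP is treated as free. The function names and the release 
policy are hypothetical; only the 'virsh domstate' command and the state 
strings it prints come from libvirt.

```python
import subprocess

# States printed by `virsh domstate` for a domain that still holds resources.
ACTIVE_STATES = {"running", "idle", "paused", "in shutdown", "pmsuspended"}


def is_active(domstate_output):
    """Return True if the `virsh domstate` output describes a live domain."""
    return domstate_output.strip().lower() in ACTIVE_STATES


def domain_state(domain, uri="qemu:///system"):
    """Ask libvirt for the domain state; return None if virsh itself fails."""
    try:
        result = subprocess.run(
            ["virsh", "--connect", uri, "domstate", domain],
            capture_output=True, text=True,
        )
    except OSError:
        # virsh is missing or could not be started.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def ip_can_be_released(state):
    """Hypothetical policy: free the IP only when the domain is known to be gone.

    An unknown state (None) is treated conservatively as "still running".
    """
    if state is None:
        return False
    return not is_active(state)


# Usage (assumes virsh is installed on the host):
#   ip_can_be_released(domain_state("one-20740"))
```

Note that this sketch also returns None when virsh reports that the 
domain does not exist, so it errs on the side of keeping the IP allocated.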

2) onevm delete fails:
This is similar to 1). 'virsh destroy' returns an error (ExitCode: 42), 
but the transfer manager wipes the disks even though the VM is still 
running (though it may no longer be fully functional). Does this make 
sense? In this case neither the user nor the administrator realizes 
that the VM is still running unless someone checks the physical host 
locally or looks at the VM's log file.
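
One way to spot such leftover VMs without reading each log file is to 
compare what libvirt reports on a host with the VM IDs ONE still tracks 
there. The Python sketch below is only an illustration: it parses the 
table printed by 'virsh list --all', the helper names are hypothetical, 
the sample output is made up, and the "one-<id>" naming is inferred from 
the domain name one-20740 mentioned in this mail.

```python
def parse_virsh_list(output):
    """Parse the table printed by `virsh list --all` into {name: state}."""
    domains = {}
    for line in output.splitlines():
        parts = line.strip().split(None, 2)
        # Skip blank lines, the dashed rule and the "Id Name State" header.
        if len(parts) != 3 or parts[0] == "Id":
            continue
        _dom_id, name, state = parts
        domains[name] = state
    return domains


def find_leftovers(output, expected_vm_ids, prefix="one-"):
    """Return domains that are not shut off but that ONE no longer expects here."""
    expected = {prefix + str(vm_id) for vm_id in expected_vm_ids}
    return sorted(
        name
        for name, state in parse_virsh_list(output).items()
        if name.startswith(prefix)
        and state != "shut off"
        and name not in expected
    )


# Made-up sample output, for illustration only.
SAMPLE = """\
 Id    Name                           State
----------------------------------------------------
 3     one-20740                      running
 7     one-20755                      running
 -     one-20001                      shut off
"""

# ONE is assumed to still track only VM 20755 on this host.
print(find_leftovers(SAMPLE, [20755]))  # ['one-20740']
```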

I was not able to find out why virsh failed and could not reproduce it. 
The hosts are healthy, but may have had a transient problem at the very 
moment when users requested a VM shutdown or deletion.

For example, I later ran by hand the same command that ONE had 
executed, and it worked. (/var/tmp/one/vmm/kvm/cancel one-20740 n020504 
20740 n020504)
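
If the failure really is transient, retrying the command a few times 
before giving up might work around it. The following Python sketch is an 
assumption about how such a wrapper could look; it is not part of ONE, 
and the attempt count and delay are arbitrary.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, delay=5.0, runner=subprocess.call):
    """Run cmd until it exits with 0 or the attempts are used up.

    Returns the exit code of the last run. `runner` can be swapped out
    for testing; by default the command is really executed.
    """
    code = None
    for attempt in range(1, attempts + 1):
        code = runner(cmd)
        if code == 0:
            break
        if attempt < attempts:
            # Give the host a moment to recover before the next try.
            time.sleep(delay)
    return code


# Usage with the command quoted above (not executed here):
#   run_with_retry(["/var/tmp/one/vmm/kvm/cancel",
#                   "one-20740", "n020504", "20740", "n020504"])
```

A non-zero return value after the last attempt would still need to be 
treated as "VM possibly running", so this does not replace the checks 
asked for in 1) and 2).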

Any hints regarding a possible libvirt issue?

Regards,

Danny


