[one-users] "onevm delete" deletes extra VM

Tino Vazquez tinova at fdi.ucm.es
Wed Sep 30 09:30:35 PDT 2009


Hi Shi,

It seems that VM 77 failed due to some KVM- or virsh-related issue
when you deleted VM 76, since OpenNebula can no longer find the
one-77 domain.
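
To see at which monitoring cycle a domain stopped being reported as active, the POLL SUCCESS lines in oned.log can be scanned. What follows is only a minimal sketch, not part of OpenNebula: the function name vms_not_active and the sample text are made up for illustration, and it assumes each log entry sits on a single line (as in the real file, not as wrapped by the mail client) with the "POLL SUCCESS <id> STATE=<letter>" shape seen in your oned.log.

```python
import re

# Matches driver replies such as:
#   "... [VMM][D]: Message received: POLL SUCCESS 77 STATE=d"
# The pattern is an assumption based on the log lines quoted in this thread.
POLL_RE = re.compile(r"POLL SUCCESS (\d+)\s+STATE=(\w)")


def vms_not_active(log_text):
    """Return {vm_id: state} for VMs whose most recent poll state is not 'a'."""
    last_state = {}
    for match in POLL_RE.finditer(log_text):
        vm_id, state = match.groups()
        # Later polls overwrite earlier ones, so only the latest state is kept.
        last_state[int(vm_id)] = state
    return {vm: state for vm, state in last_state.items() if state != "a"}


# Hypothetical sample, unwrapped from the oned.log excerpt in this thread.
sample = """\
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 72 STATE=a USEDMEMORY=1048576
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 77 STATE=d
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 73 STATE=a USEDMEMORY=1048576
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 78 STATE=a USEDMEMORY=1048576
"""

print(vms_not_active(sample))
```

On the sample above this reports only VM 77. Running it over the full oned.log and comparing the result with the timestamps in the host's syslog may help show whether the domain vanished at the time of the delete or afterwards.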

Regards,

-Tino

--
Constantino Vázquez, Grid Technology Engineer/Researcher:
http://www.dsa-research.org/tinova
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org



On Wed, Sep 30, 2009 at 6:09 PM, Shi Jin <jinzishuai at gmail.com> wrote:
> Here is the vm.log for 76:
> Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
>        CPU   : -1
>        Memory: 1048576
>        Net_TX: -1
>        Net_RX: -1
> Wed Sep 30 08:57:56 2009 [DiM][I]: New VM state is DONE
> Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: LOG - 76 Driver command for 76
> cancelled
>
> Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: CANCEL SUCCESS 76 -
>
> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Deleting
> /srv/cloud/one/var/76/images
>
> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Executed
> "rm -rf /srv/cloud/one/var/76/images".
>
> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: TRANSFER SUCCESS 76 -
>
> And vm.log for 77:
> Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
>        CPU   : -1
>        Memory: 1048576
>        Net_TX: -1
>        Net_RX: -1
> Wed Sep 30 08:58:50 2009 [VMM][I]: Command execution fail: virsh dominfo
> one-77
> Wed Sep 30 08:58:50 2009 [VMM][I]: STDERR follows.
> Wed Sep 30 08:58:50 2009 [VMM][I]: Connecting to uri: qemu:///system
> Wed Sep 30 08:58:50 2009 [VMM][I]: error: failed to get domain 'one-77'
> Wed Sep 30 08:58:50 2009 [VMM][I]: error: Domain not found
> Wed Sep 30 08:58:50 2009 [VMM][I]: ExitCode: 1
> Wed Sep 30 08:58:50 2009 [VMM][I]: VM running but it was not found. Restart
> and delete actions available or try to recover it manually
> Wed Sep 30 08:58:50 2009 [LCM][I]: New VM state is UNKNOWN
> Wed Sep 30 08:59:50 2009 [VMM][I]: Command execution fail: virsh dominfo
> one-77
> Wed Sep 30 08:59:50 2009 [VMM][I]: STDERR follows.
> Wed Sep 30 08:59:50 2009 [VMM][I]: Connecting to uri: qemu:///system
> Wed Sep 30 08:59:50 2009 [VMM][I]: error: failed to get domain 'one-77'
> Wed Sep 30 08:59:50 2009 [VMM][I]: error: Domain not found
> Wed Sep 30 08:59:50 2009 [VMM][I]: ExitCode: 1
> Wed Sep 30 08:59:50 2009 [VMM][I]: VM running but it was not found. Restart
> and delete actions available or try to recover it manually
>
> And oned.log:
> Wed Sep 30 08:57:55 2009 [ReM][D]: VirtualMachineAction invoked
> Wed Sep 30 08:57:55 2009 [DiM][D]: Finalizing VM 76
> Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: LOG - 76 Driver command
> for 76 cancelled
> Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: CANCEL SUCCESS 76 -
> Wed Sep 30 08:57:58 2009 [ReM][D]: VirtualMachinePoolInfo method invoked
> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh:
> Deleting /srv/cloud/one/var/76/images
> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh:
> Executed "rm -rf /srv/cloud/one/var/76/images".
> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: TRANSFER SUCCESS 76 -
> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 72.
> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 73.
> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 77.
> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 78.
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 72
>  STATE=a USEDMEMORY=1048576
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Command
> execution fail: virsh dominfo one-77
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 STDERR
> follows.
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Connecting to
> uri: qemu:///system
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error: failed
> to get domain 'one-77'
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error: Domain
> not found
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 ExitCode: 1
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 77 STATE=d
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 73
>  STATE=a USEDMEMORY=1048576
> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 78
>  STATE=a USEDMEMORY=1048576
> Wed Sep 30 08:59:02 2009 [InM][I]: Monitoring host node1 (4)
> Wed Sep 30 08:59:06 2009 [InM][D]: Host 4 successfully monitored.
> In my opinion the above log files don't tell us much beyond the fact
> that VM 77 is no longer available.
> Shi
> On Wed, Sep 30, 2009 at 10:02 AM, Ruben S. Montero <rubensm at dacya.ucm.es>
> wrote:
>> Hi,
>>
>> Does the 77 vm.log say something relevant? Can you send us also the
>> vm.log file for 76? OpenNebula should never try to delete VM 77...
>>
>> Thanks
>>
>> Ruben
>>
>> On Wed, Sep 30, 2009 at 5:28 PM, Shi Jin <jinzishuai at gmail.com> wrote:
>>> Hi there,
>>>
>>> I am running one-1.4beta with KVM hypervisor.
>>> Recently I found a very bizarre behavior of "onevm delete".
>>> For example, initially there were 5 VMs running: 72, 73, 76, 77, 78.
>>> Then I ran "onevm delete 76" and found that only 3 VMs were left: VM 77
>>> had also been deleted.
>>> "virsh list" on the host node shows VM 77 is gone and "onevm list"
>>> shows its status to be unknown.
>>> I think this is a serious bug and requires a quick fix.
>>> I tried to dig into the OpenNebula log files but found nothing other
>>> than " Command execution fail: virsh dominfo one-77" after the "onevm
>>> delete 76" command.
>>>
>>> On the host node, the syslog shows something interesting:
>>> Sep 30 08:58:59 node1 kernel: [238101.304619] br1: port 5(vnet4)
>>> entering disabled state
>>> Sep 30 08:58:59 node1 kernel: [238101.342851] device vnet4 left
>>> promiscuous mode
>>> Sep 30 08:58:59 node1 kernel: [238101.342854] br1: port 5(vnet4)
>>> entering disabled state
>>> Sep 30 08:58:59 node1 kernel: [238101.424851] br1: port 4(vnet3)
>>> entering disabled state
>>> Sep 30 08:59:00 node1 kernel: [238101.463031] device vnet3 left
>>> promiscuous mode
>>> Sep 30 08:59:00 node1 kernel: [238101.463035] br1: port 4(vnet3)
>>> entering disabled state
>>> Sep 30 08:59:53 node1 libvirtd: 08:59:53.480: error : Domain not found
>>>
>>> Please note that VMs 76, 77 and 78 all use the same vnet (bridged on
>>> br1) and their corresponding ports are vnet3, vnet4 and vnet5. I am
>>> not exactly sure what the above messages imply, but I have a feeling
>>> that something went wrong with the network. In that case, this
>>> problem may not be an OpenNebula bug; it is possibly a KVM bug.
>>>
>>> Actually, this kind of problem has happened to us many times. This is
>>> the first time I have carefully recorded what happened.
>>> I am wondering whether anyone has encountered this kind of problem
>>> before and whether there is a fix.
>>>
>>> Thanks a lot.
>>>
>>> Shi
>>> --
>>> Shi Jin, Ph.D.
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.opennebula.org
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>>  Dr. Ruben Santiago Montero
>>  Associate Professor
>>  Distributed System Architecture Group (http://dsa-research.org)
>>
>>  URL:    http://dsa-research.org/doku.php?id=people:ruben
>>  Weblog: http://blog.dsa-research.org/?author=7
>>
>>  GridWay, http://www.gridway.org
>>  OpenNebula, http://www.opennebula.org
>> +---------------------------------------------------------------+
>>
>
>
>
> --
> Shi Jin, Ph.D.


