[one-users] "onevm delete" deletes extra VM

Shi Jin jinzishuai at gmail.com
Wed Sep 30 09:09:08 PDT 2009


*Here is the vm.log for 76:*
Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
       CPU   : -1
       Memory: 1048576
       Net_TX: -1
       Net_RX: -1
Wed Sep 30 08:57:56 2009 [DiM][I]: New VM state is DONE
Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: LOG - 76 Driver command for 76 cancelled

Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: CANCEL SUCCESS 76 -

Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Deleting /srv/cloud/one/var/76/images

Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var/76/images".

Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: TRANSFER SUCCESS 76 -

*And vm.log for 77:*
Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
       CPU   : -1
       Memory: 1048576
       Net_TX: -1
       Net_RX: -1
Wed Sep 30 08:58:50 2009 [VMM][I]: Command execution fail: virsh dominfo one-77
Wed Sep 30 08:58:50 2009 [VMM][I]: STDERR follows.
Wed Sep 30 08:58:50 2009 [VMM][I]: Connecting to uri: qemu:///system
Wed Sep 30 08:58:50 2009 [VMM][I]: error: failed to get domain 'one-77'
Wed Sep 30 08:58:50 2009 [VMM][I]: error: Domain not found
Wed Sep 30 08:58:50 2009 [VMM][I]: ExitCode: 1
Wed Sep 30 08:58:50 2009 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
Wed Sep 30 08:58:50 2009 [LCM][I]: New VM state is UNKNOWN
Wed Sep 30 08:59:50 2009 [VMM][I]: Command execution fail: virsh dominfo one-77
Wed Sep 30 08:59:50 2009 [VMM][I]: STDERR follows.
Wed Sep 30 08:59:50 2009 [VMM][I]: Connecting to uri: qemu:///system
Wed Sep 30 08:59:50 2009 [VMM][I]: error: failed to get domain 'one-77'
Wed Sep 30 08:59:50 2009 [VMM][I]: error: Domain not found
Wed Sep 30 08:59:50 2009 [VMM][I]: ExitCode: 1
Wed Sep 30 08:59:50 2009 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually

*And oned.log:*
Wed Sep 30 08:57:55 2009 [ReM][D]: VirtualMachineAction invoked
Wed Sep 30 08:57:55 2009 [DiM][D]: Finalizing VM 76
Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: LOG - 76 Driver command for 76 cancelled

Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: CANCEL SUCCESS 76 -

Wed Sep 30 08:57:58 2009 [ReM][D]: VirtualMachinePoolInfo method invoked
Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh: Deleting /srv/cloud/one/var/76/images

Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var/76/images".

Wed Sep 30 08:58:00 2009 [TM][D]: Message received: TRANSFER SUCCESS 76 -

Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 72.
Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 73.
Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 77.
Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 78.
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 72
 STATE=a USEDMEMORY=1048576

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Command execution fail: virsh dominfo one-77

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 STDERR follows.

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Connecting to uri: qemu:///system

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error: failed to get domain 'one-77'

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error: Domain not found

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 ExitCode: 1

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 77 STATE=d
Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 73
 STATE=a USEDMEMORY=1048576

Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 78
 STATE=a USEDMEMORY=1048576

Wed Sep 30 08:59:02 2009 [InM][I]: Monitoring host node1 (4)
Wed Sep 30 08:59:06 2009 [InM][D]: Host 4 successfully monitored.

*In my opinion, the log files above don't tell us much beyond the fact that VM
77 is no longer available.*

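For anyone who wants to check a longer oned.log for the same thing, a small script along the following lines could group the "Message received" entries by VM ID, so it is easy to see whether any CANCEL or TRANSFER message ever names a VM other than the one being deleted. This is only a rough sketch: the regular expression is an assumption inferred from the excerpt above, not from any documented OpenNebula log format, and the function name `messages_by_vm` is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the oned.log excerpt in this message:
#   "<timestamp> [MODULE][LEVEL]: Message received: <ACTION> <RESULT> <vm id> ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) "
    r"\[(?P<module>\w+)\]\[(?P<level>\w)\]: "
    r"Message received: (?P<action>[A-Z]+) (?P<result>\S+) (?P<vmid>\d+)\b"
)


def messages_by_vm(lines):
    """Group driver messages by the VM ID they name (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # lines that are not driver messages are skipped
            grouped[int(m["vmid"])].append(
                (m["ts"], m["module"], m["action"], m["result"])
            )
    return dict(grouped)


# A few lines copied from the oned.log excerpt above.
sample = [
    "Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: CANCEL SUCCESS 76 -",
    "Wed Sep 30 08:58:00 2009 [TM][D]: Message received: TRANSFER SUCCESS 76 -",
    "Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 72.",
    "Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 77 STATE=d",
]

result = messages_by_vm(sample)
for vmid, events in sorted(result.items()):
    print(vmid, events)
```

In the excerpt above, the only driver messages that name VM 77 are LOG and POLL; the CANCEL and TRANSFER messages all name VM 76.
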
*Shi*

On Wed, Sep 30, 2009 at 10:02 AM, Ruben S. Montero <rubensm at dacya.ucm.es>
wrote:
> Hi,
>
> Does the 77 vm.log say something relevant? Can you send us also the
> vm.log file for 76? OpenNebula should never try to delete VM 77...
>
> Thanks
>
> Ruben
>
> On Wed, Sep 30, 2009 at 5:28 PM, Shi Jin <jinzishuai at gmail.com> wrote:
>> Hi there,
>>
>> I am running one-1.4beta with the KVM hypervisor.
>> Recently I found some very bizarre behavior in "onevm delete".
>> For example, initially there were 5 VMs running: 72, 73, 76, 77, 78.
>> Then I ran "onevm delete 76" and found that only 3 VMs were left: VM 77
>> had also been deleted.
>> "virsh list" on the host node shows that VM 77 is gone, and "onevm list"
>> shows its status as unknown.
>> I think this is a serious bug and requires a quick fix.
>> I tried to dig into the OpenNebula log files but found nothing other
>> than " Command execution fail: virsh dominfo one-77" after the "onevm
>> delete 76" command.
>>
>> On the host node, the syslog shows something interesting:
>> Sep 30 08:58:59 node1 kernel: [238101.304619] br1: port 5(vnet4) entering disabled state
>> Sep 30 08:58:59 node1 kernel: [238101.342851] device vnet4 left promiscuous mode
>> Sep 30 08:58:59 node1 kernel: [238101.342854] br1: port 5(vnet4) entering disabled state
>> Sep 30 08:58:59 node1 kernel: [238101.424851] br1: port 4(vnet3) entering disabled state
>> Sep 30 08:59:00 node1 kernel: [238101.463031] device vnet3 left promiscuous mode
>> Sep 30 08:59:00 node1 kernel: [238101.463035] br1: port 4(vnet3) entering disabled state
>> Sep 30 08:59:53 node1 libvirtd: 08:59:53.480: error : Domain not found
>>
>> Please note that VMs 76, 77 and 78 all use the same vnet (bridged on
>> br1), and their corresponding ports are vnet3, vnet4 and vnet5. I am
>> not exactly sure what the above messages imply, but I have a feeling
>> that something went wrong with the network. If so, this problem may
>> not be an OpenNebula bug; it could possibly be a KVM bug.
>>
>> Actually, this kind of problem has happened to us many times. This is
>> the first time I have carefully recorded what happened.
>> I am wondering whether anyone has encountered this kind of problem
>> before, and whether there is a fix.
>>
>> Thanks a lot.
>>
>> Shi
>> --
>> Shi Jin, Ph.D.
>
>
>
> --
> +---------------------------------------------------------------+
>  Dr. Ruben Santiago Montero
>  Associate Professor
>  Distributed System Architecture Group (http://dsa-research.org)
>
>  URL:    http://dsa-research.org/doku.php?id=people:ruben
>  Weblog: http://blog.dsa-research.org/?author=7
>
>  GridWay, http://www.gridway.org
>  OpenNebula, http://www.opennebula.org
> +---------------------------------------------------------------+
>



-- 
Shi Jin, Ph.D.