[one-users] "onevm delete" deletes extra VM

Shi Jin jinzishuai at gmail.com
Wed Sep 30 12:14:22 PDT 2009


OK. I just upgraded my ubuntu 9.04 server to 9.10 alpha 6 so that the
libvirt is upgraded from 0.6.1 to 0.7.0 and the problem is gone. Thanks.

Shi

On Wed, Sep 30, 2009 at 10:27 AM, Shi Jin <jinzishuai at gmail.com> wrote:

> OK. With another designed experiment I am quite sure that the problem is
> within KVM/libvirtd rather OpenNebula. Please see the following, which has
> nothing to do with OpenNebula.root at node1:/srv/cloud/ImgRep/WinXP# virsh
> list
> Connecting to uri: qemu:///system
>  Id Name                 State
> ----------------------------------
>  27 one-80               running
>  28 one-81               running
>  29 one-83               running
>  30 one-82               running
>  31 one-79               running
>
> root at node1:/srv/cloud/ImgRep/WinXP# virsh destroy 29
> Connecting to uri: qemu:///system
> Domain 29 destroyed
>
> root at node1:/srv/cloud/ImgRep/WinXP# virsh list
> Connecting to uri: qemu:///system
>  Id Name                 State
> ----------------------------------
>  27 one-80               running
>  28 one-81               running
>  31 one-79               running
>
> The /var/log/message shows
> Sep 30 08:58:59 node1 kernel: [238101.304619] br1: port 5(vnet4) entering
> disabled state
> Sep 30 08:58:59 node1 kernel: [238101.342851] device vnet4 left promiscuous
> mode
> Sep 30 08:58:59 node1 kernel: [238101.342854] br1: port 5(vnet4) entering
> disabled state
> Sep 30 08:58:59 node1 kernel: [238101.424851] br1: port 4(vnet3) entering
> disabled state
> Sep 30 08:59:00 node1 kernel: [238101.463031] device vnet3 left promiscuous
> mode
> Sep 30 08:59:00 node1 kernel: [238101.463035] br1: port 4(vnet3) entering
> disabled state
> Sep 30 09:10:53 node1 -- MARK --
> Sep 30 09:30:53 node1 -- MARK --
> Sep 30 09:50:53 node1 -- MARK --
> Sep 30 09:51:31 node1 kernel: [241253.266955] device vnet3 entered
> promiscuous mode
> Sep 30 09:51:31 node1 kernel: [241253.268857] br1: port 4(vnet3) entering
> learning state
> Sep 30 09:51:40 node1 kernel: [241262.260147] br1: topology change
> detected, propagating
> Sep 30 09:51:40 node1 kernel: [241262.260152] br1: port 4(vnet3) entering
> forwarding state
> Sep 30 09:53:59 node1 kernel: [241401.204570] br1: port 4(vnet3) entering
> disabled state
> Sep 30 09:53:59 node1 kernel: [241401.232395] device vnet3 left promiscuous
> mode
> Sep 30 09:53:59 node1 kernel: [241401.232399] br1: port 4(vnet3) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.694803] br1: port 2(vnet0) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.733095] device vnet0 left promiscuous
> mode
> Sep 30 09:54:20 node1 kernel: [241421.733099] br1: port 2(vnet0) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.815209] br2: port 2(vnet1) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.853728] device vnet1 left promiscuous
> mode
> Sep 30 09:54:20 node1 kernel: [241421.853732] br2: port 2(vnet1) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.935254] br1: port 3(vnet2) entering
> disabled state
> Sep 30 09:54:20 node1 kernel: [241421.963527] device vnet2 left promiscuous
> mode
> Sep 30 09:54:20 node1 kernel: [241421.963531] br1: port 3(vnet2) entering
> disabled state
> Sep 30 09:55:29 node1 kernel: [241490.644204] br1: port 6(vnet5) entering
> disabled state
> Sep 30 09:55:29 node1 kernel: [241490.673195] device vnet5 left promiscuous
> mode
> Sep 30 09:55:29 node1 kernel: [241490.673199] br1: port 6(vnet5) entering
> disabled state
> Sep 30 10:10:53 node1 -- MARK --
> Sep 30 10:11:49 node1 kernel: [242470.955142] device vnet0 entered
> promiscuous mode
> Sep 30 10:11:49 node1 kernel: [242470.957051] br1: port 2(vnet0) entering
> learning state
> Sep 30 10:11:52 node1 kernel: [242474.180036] device vnet1 entered
> promiscuous mode
> Sep 30 10:11:52 node1 kernel: [242474.181943] br1: port 3(vnet1) entering
> learning state
> Sep 30 10:11:55 node1 kernel: [242477.431693] device vnet2 entered
> promiscuous mode
> Sep 30 10:11:55 node1 kernel: [242477.433558] br1: port 4(vnet2) entering
> learning state
> Sep 30 10:11:58 node1 kernel: [242479.950082] br1: topology change
> detected, propagating
> Sep 30 10:11:58 node1 kernel: [242479.950086] br1: port 2(vnet0) entering
> forwarding state
> Sep 30 10:12:00 node1 kernel: [242482.059146] device vnet3 entered
> promiscuous mode
> Sep 30 10:12:00 node1 kernel: [242482.060330] br1: port 5(vnet3) entering
> learning state
> Sep 30 10:12:01 node1 kernel: [242483.180170] br1: topology change
> detected, propagating
> Sep 30 10:12:01 node1 kernel: [242483.180173] br1: port 3(vnet1) entering
> forwarding state
> Sep 30 10:12:04 node1 kernel: [242485.710948] device vnet4 entered
> promiscuous mode
> Sep 30 10:12:04 node1 kernel: [242485.712109] br1: port 6(vnet4) entering
> learning state
> Sep 30 10:12:04 node1 kernel: [242486.430098] br1: topology change
> detected, propagating
> Sep 30 10:12:04 node1 kernel: [242486.430104] br1: port 4(vnet2) entering
> forwarding state
> Sep 30 10:12:09 node1 kernel: [242491.051857] br1: topology change
> detected, propagating
> Sep 30 10:12:09 node1 kernel: [242491.051862] br1: port 5(vnet3) entering
> forwarding state
> Sep 30 10:12:13 node1 kernel: [242494.711792] br1: topology change
> detected, propagating
> Sep 30 10:12:13 node1 kernel: [242494.711797] br1: port 6(vnet4) entering
> forwarding state
> Sep 30 10:22:00 node1 kernel: [243082.274008] br1: port 4(vnet2) entering
> disabled state
> Sep 30 10:22:00 node1 kernel: [243082.312108] device vnet2 left promiscuous
> mode
> Sep 30 10:22:00 node1 kernel: [243082.312112] br1: port 4(vnet2) entering
> disabled state
> Sep 30 10:22:00 node1 kernel: [243082.365331] br1: port 5(vnet3) entering
> disabled state
> Sep 30 10:22:00 node1 kernel: [243082.393430] device vnet3 left promiscuous
> mode
> Sep 30 10:22:00 node1 kernel: [243082.393433] br1: port 5(vnet3) entering
> disabled state
>
> I have 4 NICs and the ifconfig shows
> root at node1:/srv/cloud/ImgRep/WinXP# ifconfig
> bond0     Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:57
>           inet6 addr: fe80::224:e8ff:fe6b:d57/64 Scope:Link
>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>           RX packets:7799931 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6879357 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:4360166425 (4.3 GB)  TX bytes:4350243659 (4.3 GB)
>
> br0       Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:57
>           inet addr:192.168.1.128  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::224:e8ff:fe6b:d57/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:7641913 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6412260 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:4169876739 (4.1 GB)  TX bytes:4291182908 (4.2 GB)
>
> br1       Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:5b
>           inet6 addr: fe80::224:e8ff:fe6b:d5b/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:152606 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:11276026 (11.2 MB)  TX bytes:468 (468.0 B)
>
> br2       Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:5d
>           inet6 addr: fe80::224:e8ff:fe6b:d5d/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:152596 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:11286132 (11.2 MB)  TX bytes:468 (468.0 B)
>
> eth0      Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:57
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:3622205 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:3443103 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:1758147112 (1.7 GB)  TX bytes:2174389369 (2.1 GB)
>           Interrupt:36 Memory:d6000000-d6012700
>
> eth1      Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:57
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:4177726 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:3436254 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:2602019313 (2.6 GB)  TX bytes:2175854290 (2.1 GB)
>           Interrupt:48 Memory:d8000000-d8012700
>
> eth2      Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:5b
>           inet6 addr: fe80::224:e8ff:fe6b:d5b/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:256014 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:91549 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:20680139 (20.6 MB)  TX bytes:17317087 (17.3 MB)
>           Interrupt:32 Memory:da000000-da012700
>
> eth3      Link encap:Ethernet  HWaddr 00:24:e8:6b:0d:5d
>           inet6 addr: fe80::224:e8ff:fe6b:d5d/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:245189 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:32669 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:20866257 (20.8 MB)  TX bytes:8265329 (8.2 MB)
>           Interrupt:42 Memory:dc000000-dc012700
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:324787 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:324787 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:571707504 (571.7 MB)  TX bytes:571707504 (571.7 MB)
>
> vnet0     Link encap:Ethernet  HWaddr 52:c3:a3:d0:09:84
>           inet6 addr: fe80::50c3:a3ff:fed0:984/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:72 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:533 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:500
>           RX bytes:10410 (10.4 KB)  TX bytes:49306 (49.3 KB)
>
> vnet1     Link encap:Ethernet  HWaddr 02:0d:4c:1e:5c:82
>           inet6 addr: fe80::d:4cff:fe1e:5c82/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:36 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:563 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:500
>           RX bytes:6023 (6.0 KB)  TX bytes:53245 (53.2 KB)
>
> vnet4     Link encap:Ethernet  HWaddr a6:58:67:2b:37:8a
>           inet6 addr: fe80::a458:67ff:fe2b:378a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:42 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:552 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:500
>           RX bytes:6733 (6.7 KB)  TX bytes:52189 (52.1 KB)
> Note that there is no IP address assigned to the br1 device which is used
> here. I don't think that should be a problem.
>
> Shi
>
> On Wed, Sep 30, 2009 at 10:09 AM, Shi Jin <jinzishuai at gmail.com> wrote:
>
>> *Here is the vm.log for 76:*
>> Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
>>        CPU   : -1
>>        Memory: 1048576
>>        Net_TX: -1
>>        Net_RX: -1
>> Wed Sep 30 08:57:56 2009 [DiM][I]: New VM state is DONE
>> Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: LOG - 76 Driver command for 76
>> cancelled
>>
>> Wed Sep 30 08:57:56 2009 [VMM][W]: Ignored: CANCEL SUCCESS 76 -
>>
>> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Deleting
>> /srv/cloud/one/var/76/images
>>
>> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: LOG - 76 tm_delete.sh: Executed
>> "rm -rf /srv/cloud/one/var/76/images".
>>
>> Wed Sep 30 08:58:00 2009 [TM][W]: Ignored: TRANSFER SUCCESS 76 -
>>
>> *And vm.log for 77:*
>> Wed Sep 30 08:57:50 2009 [VMM][D]: Monitor Information:
>>        CPU   : -1
>>        Memory: 1048576
>>        Net_TX: -1
>>        Net_RX: -1
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Command execution fail: virsh dominfo
>> one-77
>> Wed Sep 30 08:58:50 2009 [VMM][I]: STDERR follows.
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Connecting to uri: qemu:///system
>> Wed Sep 30 08:58:50 2009 [VMM][I]: error: failed to get domain 'one-77'
>> Wed Sep 30 08:58:50 2009 [VMM][I]: error: Domain not found
>> Wed Sep 30 08:58:50 2009 [VMM][I]: ExitCode: 1
>> Wed Sep 30 08:58:50 2009 [VMM][I]: VM running but it was not found.
>> Restart and delete actions available or try to recover it manually
>> Wed Sep 30 08:58:50 2009 [LCM][I]: New VM state is UNKNOWN
>> Wed Sep 30 08:59:50 2009 [VMM][I]: Command execution fail: virsh dominfo
>> one-77
>> Wed Sep 30 08:59:50 2009 [VMM][I]: STDERR follows.
>> Wed Sep 30 08:59:50 2009 [VMM][I]: Connecting to uri: qemu:///system
>> Wed Sep 30 08:59:50 2009 [VMM][I]: error: failed to get domain 'one-77'
>> Wed Sep 30 08:59:50 2009 [VMM][I]: error: Domain not found
>> Wed Sep 30 08:59:50 2009 [VMM][I]: ExitCode: 1
>> Wed Sep 30 08:59:50 2009 [VMM][I]: VM running but it was not found.
>> Restart and delete actions available or try to recover it manually
>>
>> *And oned.log:*
>> Wed Sep 30 08:57:55 2009 [ReM][D]: VirtualMachineAction invoked
>> Wed Sep 30 08:57:55 2009 [DiM][D]: Finalizing VM 76
>> Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: LOG - 76 Driver
>> command for 76 cancelled
>>
>> Wed Sep 30 08:57:56 2009 [VMM][D]: Message received: CANCEL SUCCESS 76 -
>>
>> Wed Sep 30 08:57:58 2009 [ReM][D]: VirtualMachinePoolInfo method invoked
>> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh:
>> Deleting /srv/cloud/one/var/76/images
>>
>> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: LOG - 76 tm_delete.sh:
>> Executed "rm -rf /srv/cloud/one/var/76/images".
>>
>> Wed Sep 30 08:58:00 2009 [TM][D]: Message received: TRANSFER SUCCESS 76 -
>>
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 72.
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 73.
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 77.
>> Wed Sep 30 08:58:50 2009 [VMM][I]: Monitoring VM 78.
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 72
>>  STATE=a USEDMEMORY=1048576
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Command
>> execution fail: virsh dominfo one-77
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 STDERR
>> follows.
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 Connecting
>> to uri: qemu:///system
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error:
>> failed to get domain 'one-77'
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 error:
>> Domain not found
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: LOG - 77 ExitCode: 1
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 77
>> STATE=d
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 73
>>  STATE=a USEDMEMORY=1048576
>>
>> Wed Sep 30 08:58:50 2009 [VMM][D]: Message received: POLL SUCCESS 78
>>  STATE=a USEDMEMORY=1048576
>>
>> Wed Sep 30 08:59:02 2009 [InM][I]: Monitoring host node1 (4)
>> Wed Sep 30 08:59:06 2009 [InM][D]: Host 4 successfully monitored.
>>
>> *In my opinion the above log files don't have much more information than
>> VM 77 is no longer available.*
>> *
>> *
>> *Shi*
>>
>> On Wed, Sep 30, 2009 at 10:02 AM, Ruben S. Montero <rubensm at dacya.ucm.es>
>> wrote:
>> > Hi,
>> >
>> > Does the 77 vm.log say something relevant? Can you send us also the
>> > vm.log file for 76? OpenNebula should never try to delete VM 77...
>> >
>> > Thanks
>> >
>> > Ruben
>> >
>> > On Wed, Sep 30, 2009 at 5:28 PM, Shi Jin <jinzishuai at gmail.com> wrote:
>> >> Hi there,
>> >>
>> >> I am running one-1.4beta with KVM hypervisor.
>> >> Recently I found a very bizarre behavior of "onevm delete".
>> >> For example, initially there are 5 VMs running:72,73,76,77,78.
>> >> Then I did "onevm delete 76" and found only 3 VMs are left: the VM 77
>> >> is also deleted.
>> >> "virsh list" on the host node shows VM 77 is gone and "onevm list"
>> >> shows its status to be unknown.
>> >> I think this is a serious bug and requires a quick fix.
>> >> I tried to dig into the OpenNebula log files but found nothing other
>> >> than " Command execution fail: virsh dominfo one-77" after the "onevm
>> >> delete 76" command.
>> >>
>> >> On the host node, the syslog shows something interesting:
>> >> Sep 30 08:58:59 node1 kernel: [238101.304619] br1: port 5(vnet4)
>> >> entering disabled state
>> >> Sep 30 08:58:59 node1 kernel: [238101.342851] device vnet4 left
>> promiscuous mode
>> >> Sep 30 08:58:59 node1 kernel: [238101.342854] br1: port 5(vnet4)
>> >> entering disabled state
>> >> Sep 30 08:58:59 node1 kernel: [238101.424851] br1: port 4(vnet3)
>> >> entering disabled state
>> >> Sep 30 08:59:00 node1 kernel: [238101.463031] device vnet3 left
>> promiscuous mode
>> >> Sep 30 08:59:00 node1 kernel: [238101.463035] br1: port 4(vnet3)
>> >> entering disabled state
>> >> Sep 30 08:59:53 node1 libvirtd: 08:59:53.480: error : Domain not found
>> >>
>> >> Please note that the VM 76,77,78 all use the same vnet (bridged on
>> >> br1) and their corresponding ports are vnet3,vnet4 and vnet5.  I am
>> >> not exactly sure what  the above message implies but I have a feeling
>> >> that something went wrong about the network. In this sense, this
>> >> problem may not be a OpenNebula bug, possibly a KVM bug too.
>> >>
>> >> Actually this kind of problem has happened many times for us. This is
>> >> my first time to carefully record what has happened.
>> >> I am wondering if anyone has encountered this kind of problem before
>> >> and is there a fix?
>> >>
>> >> Thanks a lot.
>> >>
>> >> Shi
>> >> --
>> >> Shi Jin, Ph.D.
>> >> _______________________________________________
>> >> Users mailing list
>> >> Users at lists.opennebula.org
>> >> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>> >>
>> >
>> >
>> >
>> > --
>> > +---------------------------------------------------------------+
>> >  Dr. Ruben Santiago Montero
>> >  Associate Professor
>> >  Distributed System Architecture Group (http://dsa-research.org)
>> >
>> >  URL:    http://dsa-research.org/doku.php?id=people:ruben
>> >  Weblog: http://blog.dsa-research.org/?author=7
>> >
>> >  GridWay, http://www.gridway.org
>> >  OpenNebula, http://www.opennebula.org
>> > +---------------------------------------------------------------+
>> >
>>
>>
>>
>> --
>> Shi Jin, Ph.D.
>>
>>
>>
>
>
> --
> Shi Jin, Ph.D.
>
>


-- 
Shi Jin, Ph.D.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.opennebula.org/pipermail/users-opennebula.org/attachments/20090930/69b3e53b/attachment-0001.htm>


More information about the Users mailing list