[one-users] Running VMs listed as zombies on KVM host?

Carlos Martín Sánchez cmartin at opennebula.org
Fri Nov 21 06:58:58 PST 2014


Since we couldn't find any explanation in the code, I'll close the bug
ticket. Let's chalk it up to the filesystem corruption you described.
Thanks for the feedback.

--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmartin at opennebula.org | @OpenNebula

On Mon, Nov 17, 2014 at 4:48 PM, Dmitri Chebotarov <dchebota at gmu.edu> wrote:

> Sorry for the late response.
>
> I ended up rebuilding the troubled nodes as part of the RHEL7 upgrade.
>
> I believe the problem was related to the absence of proper fencing for hosts.
> VMs using persistent storage ended up with corrupted filesystems;
> I fixed some disks with fsck and restored others from backup.
>
> I'm looking into how to implement fencing (via power reset) through HOST_HOOK.
> All hosts can be controlled using xCAT, and I have xCAT's REST API
> configured; I still need to figure out how to execute multiple HOST_HOOK...
>
>
> --
> Thank you,
>
> Dmitri Chebotarov
> VCL Sys Eng, Engineering & Architectural Support, TSD - Ent Servers &
> Messaging
> 223 Aquia Building, Ffx, MSN: 1B5
> Phone: (703) 993-6175 | Fax: (703) 993-3404
>
>
> > On Oct 20, 2014, at 11:30, Carlos Martín Sánchez <cmartin at opennebula.org> wrote:
> >
> > Hi Dmitri,
> >
> > Do you still experience this issue? If so, we could use more information
> > to further debug this.
> >
> > Please send the output of onehost show -x.
> > And from the host, the output of this command:
> > cd /var/tmp/one/im/kvm-probes.d/; ./poll.sh
> >
> > Thank you.
> >
> > --
> > Carlos Martín, MSc
> > Project Engineer
> > OpenNebula - Flexible Enterprise Cloud Made Simple
> > www.OpenNebula.org | cmartin at opennebula.org | @OpenNebula
> >
> > On Wed, Oct 1, 2014 at 6:30 PM, Carlos Martín Sánchez <cmartin at opennebula.org> wrote:
> > Hi,
> >
> > That is strange, so we opened a bug to look into it for the next release:
> > http://dev.opennebula.org/issues/3217
> >
> > Cheers,
> > Carlos
> >
> > --
> > Carlos Martín, MSc
> > Project Engineer
> > OpenNebula - Flexible Enterprise Cloud Made Simple
> > www.OpenNebula.org | cmartin at opennebula.org | @OpenNebula
> >
> > On Tue, Sep 30, 2014 at 5:46 PM, Dmitri Chebotarov <dchebota at gmu.edu> wrote:
> > Hi,
> >
> > Something strange is happening with my VMs...
> > Below is the 'onehost show 161' output; it shows that ONE marks running
> > VMs as zombies on the host. It happens to multiple VMs/hosts, but not all.
> > I tried to delete the ZOMBIES attribute, but a few minutes later it is back with
> > the list of running VMs. What would cause this behavior?
> >
> > (ONE 4.8)
> >
> > $ onehost show 161
> > HOST 161 INFORMATION
> > ID                    : 161
> > NAME                  : BC3-6
> > CLUSTER               : RHEL
> > STATE                 : MONITORED
> > IM_MAD                : kvm
> > VM_MAD                : kvm
> > VN_MAD                : ovswitch
> > LAST MONITORING TIME  : 09/30 11:35:22
> >
> > HOST SHARES
> > TOTAL MEM             : 47.1G
> > USED MEM (REAL)       : 16.9G
> > USED MEM (ALLOCATED)  : 14G
> > TOTAL CPU             : 2400
> > USED CPU (REAL)       : 50
> > USED CPU (ALLOCATED)  : 700
> > RUNNING VMS           : 4
> >
> > MONITORING INFORMATION
> > ARCH="x86_64"
> > ARCH="x86_64"
> > CPUSPEED="1596"
> > CPUSPEED="1596"
> > HOSTNAME="BC3-6"
> > HOSTNAME="BC3-6"
> > HYPERVISOR="kvm"
> > HYPERVISOR="kvm"
> > MODELNAME="Intel(R) Xeon(R) CPU           X5660  @ 2.80GHz"
> > MODELNAME="Intel(R) Xeon(R) CPU           X5660  @ 2.80GHz"
> > NETRX="45361301966553"
> > NETRX="0"
> > NETTX="2465939705538"
> > NETTX="0"
> > RESERVED_CPU=""
> > RESERVED_MEM=""
> > TOTAL_ZOMBIES="4"
> > VERSION="4.8.0"
> > ZOMBIES="one-51701, one-51774, one-51747, one-51763"
> >
> > VIRTUAL MACHINES
> >
> >     ID USER     GROUP    NAME            STAT UCPU    UMEM HOST       TIME
> >  51701 vcl-gmu- vcl      vmguest-vcl323  runn   14      4G BC3-6  0d 12h40
> >  51747 vcl-gmu- vcl      vmguest-vcl65 ( runn    9      2G BC3-6  0d 10h25
> >  51763 vcl-gmu- vcl      vmguest-vcl312  runn   12      4G BC3-6  0d 09h10
> >  51774 vcl-gmu- vcl      vmguest-vcl17 ( runn   14      4G BC3-6  0d 06h10
> >
> > --
> > Thank you,
> >
> > Dmitri Chebotarov
> > VCL Sys Eng, Engineering & Architectural Support, TSD - Ent Servers & Messaging
> > 223 Aquia Building, Ffx, MSN: 1B5
> > Phone: (703) 993-6175 | Fax: (703) 993-3404
> > _______________________________________________
> > Users mailing list
> > Users at lists.opennebula.org
> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
> >
> >
>
>
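
For anyone who hits the same symptom, here is a minimal sketch of how the check described in the original report could be automated: it pulls the VM IDs out of the ZOMBIES attribute and intersects them with the IDs that the VIRTUAL MACHINES table lists as running on the same host. The helper name `parse_zombies` is hypothetical and not part of OpenNebula; the sample values are copied from the `onehost show 161` output quoted above, and the sketch assumes the monitoring block has already been captured as plain text.

```python
import re


def parse_zombies(monitoring_text):
    """Return the set of numeric VM IDs named in the ZOMBIES attribute.

    Hypothetical helper: it only understands the 'one-<id>' naming seen
    in the quoted output and ignores any other entries.
    """
    # Anchor at line start so TOTAL_ZOMBIES="4" is not matched by mistake.
    match = re.search(r'^ZOMBIES="([^"]*)"', monitoring_text, re.MULTILINE)
    if not match:
        return set()
    ids = set()
    for name in match.group(1).split(","):
        name = name.strip()
        suffix = name[len("one-"):]
        if name.startswith("one-") and suffix.isdigit():
            ids.add(int(suffix))
    return ids


# Sample values copied from the quoted 'onehost show 161' output.
monitoring = '''TOTAL_ZOMBIES="4"
VERSION="4.8.0"
ZOMBIES="one-51701, one-51774, one-51747, one-51763"'''

# IDs shown with STAT 'runn' in the VIRTUAL MACHINES table for this host.
known_running = {51701, 51747, 51763, 51774}

# A VM that is both a "zombie" and a known running VM is misreported.
misreported = sorted(parse_zombies(monitoring) & known_running)
print(misreported)  # [51701, 51747, 51763, 51774]
```

With the values from this thread, every reported zombie is also a VM that ONE itself lists as running on the host, which is the inconsistency the report describes.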