[one-users] OpenNebula plus RHEL/CentOS/Sci. Linux 6.3 or 6.4

Steven C Timm timm at fnal.gov
Thu Apr 24 19:46:15 PDT 2014


I am wondering if there are any other big OpenNebula clouds out there using RHEL 6.3 or 6.4,
CentOS 6.3 or 6.4, or Scientific Linux 6.3 or 6.4?

We are seeing a fairly nasty performance problem, but only on Intel "Sandy Bridge" or "Ivy Bridge"
based hardware.  The trigger appears to be this: N KVM-based virtual machines are running (N>=4 as far as I can tell),
there is a lot of disk and I/O activity on the hypervisor (for example, migrating several more virtual
machines to or from the bare metal), and at least one of the running virtual machines is doing some I/O too.
In that failure mode you start seeing sshd processes (from oneadmin monitoring or otherwise) hanging and
taking 100% of CPU.  Ping times to the virtual machines vary widely.  In extreme cases a VM can even go
off the network entirely, in such a fashion that ifdown/ifup doesn't bring it back, and sometimes you can't even kill
it with virsh destroy.  A couple of times we have even managed to crash the hypervisor so badly that it had to be power cycled.
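
For anyone who wants to check their own hypervisors for the hung-sshd symptom described above, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names and the 90% threshold are hypothetical, it assumes Python 3.7+ and a procps-style ps on the hypervisor, and it is not part of OpenNebula or of our monitoring.

```python
import subprocess


def busy_sshd(ps_output, threshold=90.0):
    """Return (pid, cpu) pairs for sshd processes at or above threshold %CPU.

    ps_output is text with one "PID %CPU COMMAND" entry per line.
    The 90% default is an arbitrary assumption, not a measured cutoff.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, cpu, comm = fields
        try:
            pid_val = int(pid)
            cpu_val = float(cpu)
        except ValueError:
            continue  # skip lines whose numeric columns do not parse
        if comm.strip() == "sshd" and cpu_val >= threshold:
            hits.append((pid_val, cpu_val))
    return hits


def sample_ps():
    """Run ps once; the '=' after each column name suppresses the header."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,pcpu=,comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Note: ps reports CPU time averaged over the life of the process,
    # so a freshly hung sshd may take a little while to cross the threshold.
    for pid, cpu in busy_sshd(sample_ps()):
        print("sshd pid %d at %.1f%% CPU" % (pid, cpu))
```

Running this from cron on each hypervisor and comparing the output with migration times would be one way to see whether a host shows the same pattern.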

If all the surviving virtual machines are shut down, the system then returns to normal and all the hung processes exit.

Has anyone else seen problems like this?  If so, please let me know.  There seems to be little if anything written about this bug, which is strange since it has been around for a while.

Steve Timm

