[one-users] Very high unavailable service
Ruben S. Montero
rsmontero at opennebula.org
Sun Aug 26 12:34:06 PDT 2012
Hi
If you want to try the cpu pinning suggested by Steven, simply add a
RAW attribute in your VM template. Something similar to:
RAW=[
TYPE="kvm",
DATA="<cputune><vcpupin vcpu=\"0\" cpuset=\"1\"/></cputune>" ]
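For a VM with more than one vCPU you can add one vcpupin element per vCPU
inside the same cputune block. A sketch (the vcpu/cpuset numbers here are
just an illustration; adjust them to your host's CPU topology):

RAW=[
  TYPE="kvm",
  DATA="<cputune><vcpupin vcpu=\"0\" cpuset=\"1\"/><vcpupin vcpu=\"1\" cpuset=\"2\"/></cputune>" ]

Once the VM is running you can check the resulting pinning on the host
with: virsh vcpuinfo <domain>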
Cheers
Ruben
On Sun, Aug 26, 2012 at 3:27 AM, Steven C Timm <timm at fnal.gov> wrote:
> I run high-availability squid servers on virtual machines although not yet
> in OpenNebula.
>
> It can be done with very high availability.
>
> I am not familiar with Ubuntu Server 12.04, but if it has libvirt 0.9.7 or
> better and you are using the KVM hypervisor, you should be able to use the
> cpu-pinning and NUMA-aware features of libvirt to pin each virtual machine
> to a given physical cpu. That should address the migration issue you are
> seeing now.
>
> With Xen hypervisor you can (and should) also pin.
>
> I think if you solve the cpu and memory pinning problem you will be OK.
>
>
>
> However, you did not say what network topology you are using for your
> virtual machines, or what kind of virtual network drivers. That is
> important too. Also: is your squid cache mostly disk-resident or mostly
> RAM-resident? If the former, then the virtual disk drivers matter too, a
> lot.
>
> Steve Timm
>
> From: users-bounces at lists.opennebula.org
> [mailto:users-bounces at lists.opennebula.org] On Behalf Of Erico Augusto
> Cavalcanti Guedes
> Sent: Saturday, August 25, 2012 6:33 PM
> To: users at lists.opennebula.org
> Subject: [one-users] Very high unavailable service
>
> Dears,
>
> I'm running the Squid Web Cache Proxy server on Ubuntu Server 12.04 VMs
> (kernel 3.2.0-23-generic-pae), with OpenNebula 3.4.
> My private cloud is composed of one frontend and three nodes. VMs are
> running on those 3 nodes, initially one per node.
> Outside the cloud there are 2 hosts, one acting as the web clients and
> another as the web server, using the Web Polygraph benchmarking tool.
>
> The goal of the tests is to stress the Squid cache running on the VMs.
> When the same test is executed outside the cloud, using the three nodes as
> physical machines, cache service availability is 100%.
> Nevertheless, when the cache service is provided by the VMs, availability
> never exceeds 45%: web clients receive no response from squid 55% of the
> time.
>
> I have monitored the load average of the VMs and of the PMs where the VMs
> are being executed.
> The first load average field reaches 15 after some hours of testing on the
> VMs, versus 3 on the physical machines.
> Furthermore, there is a set of kernel processes, called migration/X, that
> are the champions in CPU TIME while the VMs are running. A sample:
>
> top - 20:01:38 up 1 day, 3:36, 1 user, load average: 5.50, 5.47, 4.20
>
>   PID USER PR NI VIRT RES SHR S %CPU %MEM     TIME+   TIME COMMAND
>    13 root RT  0    0   0   0 S    0  0.0 408:27.25 408:27 migration/2
>     8 root RT  0    0   0   0 S    0  0.0 404:13.63 404:13 migration/1
>     6 root RT  0    0   0   0 S    0  0.0 401:36.78 401:36 migration/0
>    17 root RT  0    0   0   0 S    0  0.0 400:59.10 400:59 migration/3
>
>
> It isn't possible to offer a web cache service via VMs with the service
> behaving this way, with such low availability.
>
> So, my questions:
>
> 1. Has anybody experienced a similar problem of an unresponsive service?
> (Whatever the service.)
> 2. How can I identify the bottleneck that is overloading the system, so
> that it can be minimized?
>
> Thanks a lot,
>
> Erico.
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
--
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula