[one-users] integrating cgroup into OpenNebula-KVM?

Csom Gyula csom at interface.hu
Tue Jul 13 11:04:06 PDT 2010


Hi Shin!

As soon as our solution passes the smoke tests I'll post it here :)

For the current release we just use the cpu subsystem (and maybe cpuacct as well, in order to
catch runtime stats). In a later release we may also support net_cls in order to ensure
bandwidth QoS (BTW: the basic idea comes from a Red Hat article [1], which is also a nice
overview of virtualization).
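The cpuacct part above can be sketched as follows (a sketch in Python; the mount point /cgroup/cpuacct and the per-VM group name "one-&lt;vmid&gt;" are conventions assumed here, not anything OpenNebula sets up by itself):

```python
# Sketch: read CPU accounting for a VM from the cpuacct subsystem.
# Assumes the cpuacct hierarchy is mounted at /cgroup/cpuacct and each VM
# lives in a group named "one-<vmid>" -- both are assumptions of this sketch.
import os

USER_HZ = 100  # kernel tick rate exposed to userspace; usually 100

def parse_cpuacct_stat(text):
    """Parse the contents of cpuacct.stat ("user <ticks>\nsystem <ticks>")
    into seconds spent in user and system mode."""
    stats = {}
    for line in text.splitlines():
        key, value = line.split()
        stats[key] = int(value) / USER_HZ
    return stats

def vm_cpu_seconds(vmid, root="/cgroup/cpuacct"):
    path = os.path.join(root, "one-%d" % vmid, "cpuacct.stat")
    with open(path) as f:
        return parse_cpuacct_stat(f.read())
```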

To my understanding, overcommitting CPU resources is known to be problematic only for SMP
guests. At least the spin lock problem doesn't seem to affect single-CPU machines [2]. Some
background... Earlier we adopted the convention below just for simplicity. The reason behind
it: our original plan was to implement a custom scheduler, and we didn't want to deal with
complex situations. Since then we have switched back to the built-in scheduler, so we should
revisit our CPU/VCPU policy for single-CPU machines. Thanks for your feedback :)

Cheers,
Gyula

---

[1] http://www.redhat.com/f/pdf/rhev/DOC-KVM.pdf

[2] To my understanding the spin lock [3] problem is the following:
* KVM implements VCPUs as Linux threads. When CPU utilization is near 100%, KVM may end up
  scheduling the threads of the same SMP guest onto the same physical CPU. Then the spin lock
  problem means the following:
* One of the VCPU threads holds a lock (at the guest OS level).
* The other VCPU thread tries to acquire it by spinning, that is, it checks the lock in a
  tight loop without going to sleep.
* Since both VCPU threads live on the same physical CPU, the spinner burns its whole timeslice
  while the lock holder cannot run, so the lock is not released until the scheduler preempts
  the spinner. The guest makes almost no progress -- in effect a livelock.
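The cost of spinning can be illustrated with a toy simulation (Python; this is not KVM, just two "vCPU threads" strictly alternating on one physical CPU):

```python
# Toy illustration (not KVM): two vCPU "threads" time-sliced on one physical
# CPU by strict round-robin. The holder needs `work` slices to finish its
# critical section; the waiter either spins (burning its slices polling the
# lock) or yields (giving its slice back to the holder immediately).
def slices_until_release(work, waiter_spins):
    elapsed = 0
    remaining = work
    turn = 0  # 0 = lock holder, 1 = waiter
    while remaining > 0:
        if turn == 0:
            remaining -= 1    # holder makes progress on its critical section
            elapsed += 1
        elif waiter_spins:
            elapsed += 1      # slice burned checking the lock
        # a yielding waiter costs (approximately) nothing here
        turn ^= 1
    return elapsed

# With spinning, roughly half of all CPU time is wasted polling the lock;
# with yielding, the holder gets the CPU back right away.
```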

[3] http://en.wikipedia.org/wiki/Spinlock

________________________________
From: Shi Jin [jinzishuai at gmail.com]
Sent: 13 July 2010 19:28
To: Csom Gyula
Cc: opennebula user list
Subject: Re: [one-users] integrating cgroup into OpenNebula-KVM?

Thank you very much Gyula.
I am very interested in learning your solutions. So please post it.

Curious to know, which cgroups subsystems are you using? I am only considering cpu. Are you using anything else, like cpuset or memory?

A note on CPU overcommitting: do you see a problem in overcommitting single-CPU VMs, i.e. multiple small VMs (VCPU=1) sharing a physical core? Your note seems to suggest the problem only affects SMP guests.
I think this is a very important feature. Without running multiple VMs on a single core, I feel there is not much need for cgroups, really. The current ONE seems good enough if we always set CPU=VCPU in the ONE template. What I want is VCPU=1 with CPU=0.5 or even 0.25.
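A fractional CPU value maps naturally onto cpu.shares, e.g. (a sketch: 1024 is the kernel's default cpu.shares weight for a group; the group naming and mount point are assumptions of this sketch):

```python
# Sketch: map a fractional ONE CPU value onto a cgroup cpu.shares weight.
# 1024 is the kernel's default cpu.shares, so CPU=1 keeps the default
# weight, CPU=0.5 gets half of it, and so on. The group name "one-<vmid>"
# and the mount point are assumptions of this sketch.
def cpu_to_shares(cpu, base=1024):
    if cpu <= 0:
        raise ValueError("CPU must be positive")
    return int(round(cpu * base))

def apply_shares(vmid, cpu, root="/cgroup/cpu"):
    path = "%s/one-%d/cpu.shares" % (root, vmid)
    with open(path, "w") as f:
        f.write(str(cpu_to_shares(cpu)))
```

Note that cpu.shares is a relative weight, not a hard cap: a CPU=0.5 VM only gets half the cycles of a CPU=1 VM when the core is contended; otherwise it may use the whole core.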

Thanks.
Shi

2010/7/13 Csom Gyula <csom at interface.hu>
Hi,
regarding ONE plans I have no clue :) Otherwise, in our system (currently under development)
we are also using cgroups (especially in order to guarantee CPU performance, which is required
for VMs like web application servers and the like). We are using cgroups in the
following way:

1. The VM's CPU count is technically bound to VCPU (both in OpenNebula and libvirt).
2. We use cpu.shares in order to give each VM its proper share.
3. We use the ONE hook system [1] in order to trigger the cgroups script.
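Step 3 might look like this in oned.conf (a sketch: the hook name, script path and arguments are hypothetical, only the VM_HOOK syntax follows the hook system docs [1]):

```
VM_HOOK = [
    name      = "cgroup_cpu",
    on        = "RUNNING",
    command   = "/srv/one/hooks/cgroup_cpu.sh",
    arguments = "$VMID" ]
```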

We use the following conventions:
4. System share: 90% of the CPU goes to the VMs and 10% goes to the system itself.
5. We do not overcommit CPU resources, since KVM has problems in such environments
   [2]:
   * in the ONE template the CPU value must be equal to the VCPU number
   * the total number of VCPUs on a given host cannot exceed the number of physical CPUs
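Convention 4 can be made concrete with a weight calculation like this (a sketch; the absolute numbers are arbitrary, only the 9:1 ratio between the groups matters to the scheduler, and it assumes the VMs are nested under a common parent group):

```python
# Sketch of convention 4: cpu.shares are relative weights, so any numbers
# with a 9:1 ratio between the VM parent group and the system group give
# VMs 90% of the CPU under contention. Per-VM weights then split the VM
# group's allocation in proportion to their VCPU counts.
def group_weights(vm_vcpus, vm_group=9216, system_group=1024):
    """vm_vcpus: {vmid: vcpu_count}. Returns per-VM shares plus the
    system group's shares (vm_group:system_group = 9:1 by default)."""
    total_vcpus = sum(vm_vcpus.values())
    shares = {vmid: vm_group * n // total_vcpus
              for vmid, n in vm_vcpus.items()}
    return shares, system_group
```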

BTW: Our solution will reach the alpha state this month; if there is interest I can post it here.

Cheers,
Gyula

---

[1] http://www.opennebula.org/documentation:rel1.4:oned_conf#hook_system
[2] SMP overcommitting problem: http://www.mail-archive.com/kvm@vger.kernel.org/msg32079.html
http://www.mail-archive.com/kvm@vger.kernel.org/msg33739.html. The root cause seems to be
spin locks: they can stall SMP guests badly when host resources are overcommitted.
The problem is also known as "lock holder preemption"; you can find related articles on the web,
for instance: http://www.amd64.org/fileadmin/user_upload/pub/2008-Friebel-LHP-GI_OS.pdf.
________________________________
From: users-bounces at lists.opennebula.org on behalf of Shi Jin [jinzishuai at gmail.com]
Sent: 13 July 2010 1:28
To: opennebula user list
Subject: [one-users] integrating cgroup into OpenNebula-KVM?

Hi there,

Red Hat is going to include cgroups in the new RHEL 6, which is a great way to do quality of service (QoS) control on resources such as VM CPU, memory, network, etc.
Especially for CPU power: I remember the OpenNebula template has a CPU variable, but under KVM it is not really enforced; it is only a scheduling criterion.
With cgroups, the CPU value could be given a real meaning, used to give each VM its proper share of the system's computing power.
I wonder if there are any plans to integrate this into OpenNebula.

Thank you very much.

--
Shi Jin, Ph.D.




--
Shi Jin, Ph.D.



