[one-users] Making scheduler allocation aware

Shashank Rachamalla shashank.rachamalla at hexagrid.com
Thu Nov 11 23:32:54 PST 2010


The lines

    if(scpu=="")
    {
        scpu="0";
    }

were accidentally introduced by me in the code. Please ignore them.

On 12 November 2010 12:56, Shashank Rachamalla <
shashank.rachamalla at hexagrid.com> wrote:

> Hi
>
> As Ruben pointed out, every time a VM is dispatched, the host share counter
> ( mem_usage ) gets incremented, so OpenNebula should not allow
> overcommitting. However, the problem lies in the following piece of code
> from VirtualMachine.cc, where cpu, memory and disk are set to 0 when a VM
> template does not contain a "CPU" attribute ( which is true in our case ).
> Because of this, the mem_usage value in the host share does not get
> incremented. I guess specifying "CPU=0" in the template should solve the
> problem. I will reconfirm after testing.
>
> void VirtualMachine::get_requirements (int& cpu, int& memory, int& disk)
> {
>     string          scpu;
>     istringstream   iss;
>     float           fcpu;
>
>     get_template_attribute("MEMORY",memory);
>     get_template_attribute("CPU",scpu);
>
>     if ((memory == 0) || (scpu==""))
>     {
>         cpu    = 0;
>         memory = 0;
>         disk   = 0;
>
>         return;
>     }
>
>     if(scpu=="")
>     {
>         scpu="0";
>     }
>
>     iss.str(scpu);
>     iss >> fcpu;
>
>     cpu    = (int) (fcpu * 100);//now in 100%
>     memory = memory * 1024;     //now in bytes
>     disk   = 0;
>
>     return;
>
> }
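>
> For reference, a minimal template sketch of the proposed workaround ( the
> name and values are just placeholders ). With CPU present, scpu is no
> longer empty, the early return above is skipped, and mem_usage on the host
> share gets incremented:
>
>     NAME   = test-vm
>     MEMORY = 2048   # multiplied by 1024 in get_requirements
>     CPU    = 0      # fcpu * 100 = 0, but memory is now accounted for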
>
> On 11 November 2010 19:36, Javier Fontan <jfontan at gmail.com> wrote:
>
>> Hello,
>>
>> Are you sure that those are the exact values for the host? OpenNebula
>> counts "real" (from probes) and allocated (from database) memory, so
>> that should not happen.
>>
>> Snippet from a onehost show:
>>
>> --8<------
>> USED MEM (REAL)       : 0
>> USED MEM (ALLOCATED)  : 65536
>> ------>8--
>>
>> I am now working on the kvm monitoring and I have noticed another
>> mismatch even with your probe changes. The values stored in the
>> database for total memory should be changed and that's what I am
>> working on.
>>
>> I am connected to irc.freenode.org in channel #opennebula if you want
>> to discuss this further.
>>
>> Bye
>>
>> On Thu, Nov 11, 2010 at 5:20 AM, Shashank Rachamalla
>> <shashank.rachamalla at hexagrid.com> wrote:
>> > Hi Javier
>> >
>> > Thanks for the inputs but I came across another problem while testing:
>> >
>> > If OpenNebula receives multiple VM requests in a short span of time, the
>> > scheduler might make decisions for all of these VMs based on the host
>> > monitoring information available from the last monitoring cycle. Ideally,
>> > fresh host monitoring information should be taken into account before
>> > processing each pending request, as the previous requests might have
>> > already changed the host's state. This can result in overcommitting when
>> > a host is being used close to its full capacity.
>> >
>> > Is there any workaround which helps the scheduler overcome this problem?
>> >
>> > Steps to reproduce the problem scenario:
>> >
>> > Host 1 : Total memory = 3GB
>> > Host 2 : Total memory = 2GB
>> > Assume Host1 and Host2 have the same number of CPU cores. ( Host1 will
>> > have a higher RANK value )
>> >
>> > VM1: memory = 2GB
>> > VM2: memory = 2GB
>> >
>> > Start VM1 and VM2 immediately one after the other. Both VM1 and VM2 will
>> > come up on Host1. ( Thus overcommitting. )
>> >
>> > Start VM1 and VM2 with an intermediate delay of 60sec. VM1 will come up
>> > on Host1 and VM2 will come up on Host2. This works because OpenNebula
>> > will have fetched a fresh set of host monitoring information in that time.
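>> >
>> > One workaround we are considering ( a sketch only, not how the shipped
>> > scheduler works ): keep an in-memory copy of each host's free memory and
>> > charge every dispatched VM against it within the same scheduling pass, so
>> > later decisions in the pass already see the updated value:
>> >
>> >     # Hypothetical bookkeeping sketch; host_free maps host id => free KB
>> >     # as reported by the last monitoring cycle.
>> >     host_free = { 1 => 3 * 1024 * 1024, 2 => 2 * 1024 * 1024 }
>> >     pending   = [ { :vm => 1, :mem => 2 * 1024 * 1024 },
>> >                   { :vm => 2, :mem => 2 * 1024 * 1024 } ]
>> >
>> >     pending.each do |req|
>> >         # pick the host with the most free memory that still fits the VM
>> >         host, = host_free.select { |h, f| f >= req[:mem] }.max_by { |h, f| f }
>> >         if host
>> >             puts "VM#{req[:vm]} -> host #{host}"
>> >             host_free[host] -= req[:mem]   # charge the allocation now
>> >         else
>> >             puts "VM#{req[:vm]} stays pending"
>> >         end
>> >     end
>> >
>> > With this, VM1 goes to Host1 and VM2 to Host2 even when both requests
>> > arrive within the same monitoring cycle.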
>> >
>> >
>> > On 4 November 2010 02:04, Javier Fontan <jfontan at gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> It looks fine to me. I think that taking out the memory the hypervisor
>> >> may be consuming is key to making it work.
>> >>
>> >> Bye
>> >>
>> >> On Wed, Nov 3, 2010 at 8:32 PM, Rangababu Chakravarthula
>> >> <rbabu at hexagrid.com> wrote:
>> >> > Javier
>> >> >
>> >> > Yes we are using KVM and OpenNebula 1.4.
>> >> >
>> >> > We have been having this problem for a long time and we were doing
>> >> > all kinds of validations ourselves before submitting the request to
>> >> > OpenNebula ( there should be enough memory in the cloud to match the
>> >> > requested memory, and there should be at least one host that has
>> >> > memory > requested memory ). We had to do those because OpenNebula
>> >> > would schedule to an arbitrary host based on the existing logic it
>> >> > had. So at last we thought that we needed to make OpenNebula aware of
>> >> > the memory allocated to running VMs on each host, and started this
>> >> > discussion.
>> >> >
>> >> > Thanks for taking up this issue as a priority. Appreciate it.
>> >> >
>> >> > Shashank came up with this patch to kvm.rb. Please take a look and
>> >> > let us know if that will work until we get a permanent solution.
>> >> >
>> >> >
>> >> >
>> >> > ===================================================================
>> >> >
>> >> > $mem_allocated_for_running_vms = 0
>> >> >
>> >> > # Sum the maximum memory (KB) defined for every running domain
>> >> > `virsh list | grep running | tr -s ' ' ' ' | cut -f2 -d' '`.split("\n").each do |domain|
>> >> >     dominfo = `virsh dominfo #{domain}`
>> >> >     dominfo.split(/\n/).each do |line|
>> >> >         if line.match('^Max memory')
>> >> >             $mem_allocated_for_running_vms += line.split(" ")[2].strip.to_i
>> >> >         end
>> >> >     end
>> >> > end
>> >> >
>> >> > # KB to set aside for the hypervisor itself
>> >> > $mem_used_by_base_hypervisor = [some xyz kb that we want to set aside
>> >> > for the hypervisor]
>> >> >
>> >> > $free_memory = $total_memory.to_i -
>> >> >     ( $mem_allocated_for_running_vms.to_i + $mem_used_by_base_hypervisor.to_i )
>> >> >
>> >> > ===================================================================
>> >> >
>> >> > Ranga
>> >> >
>> >> > On Wed, Nov 3, 2010 at 2:16 PM, Javier Fontan <jfontan at gmail.com> wrote:
>> >> >>
>> >> >> Hello,
>> >> >>
>> >> >> Sorry for the delay in the response.
>> >> >>
>> >> >> It looks like the problem is how OpenNebula calculates available
>> >> >> memory. For xen >= 3.2 there is a reliable way to get available
>> >> >> memory, which is calling "xm info" and reading the "max_free_memory"
>> >> >> attribute. Unfortunately for kvm or xen < 3.2 there is no such
>> >> >> attribute. I suppose you are using kvm as you mention the "free"
>> >> >> command.
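>> >> >>
>> >> >> For example ( just a sketch, assuming the usual "attribute : value"
>> >> >> layout of xm info output ), a probe could read it like this:
>> >> >>
>> >> >>     # sketch: extract the max_free_memory attribute from xm info
>> >> >>     line = `xm info`.split(/\n/).grep(/^max_free_memory/).first
>> >> >>     max_free = line.split(':')[1].to_i if line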
>> >> >>
>> >> >> I began analyzing the kvm IM probe that gets memory information and
>> >> >> there is a problem in the way it gets total memory. Here is how it
>> >> >> currently gets memory information:
>> >> >>
>> >> >> TOTALMEMORY: runs virsh info, which reports the real physical memory
>> >> >> installed in the machine
>> >> >> FREEMEMORY: runs the free command and takes the free column, without
>> >> >> buffers and cache
>> >> >> USEDMEMORY: runs the top command and takes used memory from it (this
>> >> >> counts buffers and cache)
>> >> >>
>> >> >> This is a big problem as those values do not match one another (I
>> >> >> don't really know how I failed to see this before). Here is the
>> >> >> monitoring data from a host without VMs:
>> >> >>
>> >> >> --8<------
>> >> >> TOTALMEMORY=8193988
>> >> >> USEDMEMORY=7819952
>> >> >> FREEMEMORY=7911924
>> >> >> ------>8--
>> >> >>
>> >> >> As you can see, it makes no sense at all. Even the TOTALMEMORY that
>> >> >> comes from virsh info is very misleading for oned, as the host linux
>> >> >> instance does not have access to all that memory (some is consumed by
>> >> >> the hypervisor itself), as seen by calling the free command:
>> >> >>
>> >> >> --8<------
>> >> >>              total       used       free     shared    buffers     cached
>> >> >> Mem:       8193988    7819192     374796          0      64176    7473992
>> >> >> ------>8--
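>> >> >>
>> >> >> A more consistent alternative ( a sketch only, not the shipped probe )
>> >> >> would be to derive all three values from a single free run so that
>> >> >> they at least agree with one another:
>> >> >>
>> >> >>     # sketch: take the Mem: row of `free -k` and compute matching values
>> >> >>     values = `free -k`.split(/\n/)[1].split
>> >> >>     total, used, free_mem, shared, buffers, cached =
>> >> >>         values[1..6].map { |v| v.to_i }
>> >> >>
>> >> >>     free_real = free_mem + buffers + cached   # free without buffers/cache
>> >> >>     puts "TOTALMEMORY=#{total}"
>> >> >>     puts "USEDMEMORY=#{total - free_real}"
>> >> >>     puts "FREEMEMORY=#{free_real}"
>> >> >>
>> >> >> This still does not account for the hypervisor overhead, but it avoids
>> >> >> USEDMEMORY and FREEMEMORY summing to more than TOTALMEMORY.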
>> >> >>
>> >> >> I am also copying this text as an issue to track this problem:
>> >> >> http://dev.opennebula.org/issues/388. It is marked to be solved for
>> >> >> 2.0.1 but the change will be compatible with 1.4, as it seems the
>> >> >> only change needed is in the IM probe.
>> >> >>
>> >> >> I can not offer you an immediate solution but we'll try to come up
>> >> >> with one as soon as possible.
>> >> >>
>> >> >> Bye
>> >> >>
>> >> >> On Wed, Nov 3, 2010 at 7:08 PM, Rangababu Chakravarthula
>> >> >> <rbabu at hexagrid.com> wrote:
>> >> >> > Hello Javier
>> >> >> > Please let us know if you want us to provide more detailed
>> >> >> > information with examples.
>> >> >> >
>> >> >> > Ranga
>> >> >> >
>> >> >> > On Fri, Oct 29, 2010 at 9:46 AM, Rangababu Chakravarthula
>> >> >> > <rbabu at hexagrid.com> wrote:
>> >> >> >>
>> >> >> >> Javier
>> >> >> >>
>> >> >> >> We saw that VMs were being deployed to a host where the allocated
>> >> >> >> memory of all the VMs was higher than the available memory on the
>> >> >> >> host.
>> >> >> >>
>> >> >> >> We think OpenNebula is executing the free command on the host to
>> >> >> >> determine if there is any room, and since free would always return
>> >> >> >> the actual memory being consumed, not the allocated memory,
>> >> >> >> OpenNebula would push the new jobs to the host.
>> >> >> >>
>> >> >> >> That's the reason we want OpenNebula to be aware of the memory
>> >> >> >> allocated to the VMs on each host.
>> >> >> >>
>> >> >> >> Ranga
>> >> >> >>
>> >> >> >> On Thu, Oct 28, 2010 at 2:02 PM, Javier Fontan <jfontan at gmail.com> wrote:
>> >> >> >>>
>> >> >> >>> Hello,
>> >> >> >>>
>> >> >> >>> Could you describe the problem you had? By default the scheduler
>> >> >> >>> will not overcommit cpu or memory.
>> >> >> >>>
>> >> >> >>> Bye
>> >> >> >>>
>> >> >> >>> On Thu, Oct 28, 2010 at 4:50 AM, Shashank Rachamalla
>> >> >> >>> <shashank.rachamalla at hexagrid.com> wrote:
>> >> >> >>> > Hi
>> >> >> >>> >
>> >> >> >>> > We have a requirement wherein the scheduler should not allow
>> >> >> >>> > memory overcommitting while choosing a host for a new VM. In
>> >> >> >>> > order to achieve this, we have changed the way in which
>> >> >> >>> > FREEMEMORY is calculated for each host:
>> >> >> >>> >
>> >> >> >>> > FREE MEMORY = TOTAL MEMORY - [ Sum of memory values allocated
>> >> >> >>> > to VMs which are currently running on the host ]
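>> >> >> >>> >
>> >> >> >>> > For example ( a worked illustration, not output from our setup ):
>> >> >> >>> > on a host with TOTAL MEMORY = 4GB running two VMs allocated 1GB
>> >> >> >>> > each, FREE MEMORY = 4GB - ( 1GB + 1GB ) = 2GB, even if the guests
>> >> >> >>> > currently touch only a fraction of their allocation.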
>> >> >> >>> >
>> >> >> >>> > Please let us know if the above approach is fine, or if there
>> >> >> >>> > is a better way to accomplish the task. We are using OpenNebula
>> >> >> >>> > 1.4.
>> >> >> >>> >
>> >> >> >>> > --
>> >> >> >>> > Regards,
>> >> >> >>> > Shashank Rachamalla
>> >> >> >>> >
>> >> >> >>>
>> >> >> >>> --
>> >> >> >>> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
>> >> >> >>> DSA Research Group: http://dsa-research.org
>> >> >> >>> Globus GridWay Metascheduler: http://www.GridWay.org
>> >> >> >>> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
>> >> >> DSA Research Group: http://dsa-research.org
>> >> >> Globus GridWay Metascheduler: http://www.GridWay.org
>> >> >> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
>> >> DSA Research Group: http://dsa-research.org
>> >> Globus GridWay Metascheduler: http://www.GridWay.org
>> >> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>> >
>> >
>> >
>> > --
>> > Regards,
>> > Shashank Rachamalla
>> >
>>
>>
>>
>> --
>> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
>> DSA Research Group: http://dsa-research.org
>> Globus GridWay Metascheduler: http://www.GridWay.org
>> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>>
>
>
>
> --
> Regards,
> Shashank Rachamalla
>



-- 
Regards,
Shashank Rachamalla