[one-users] onevm migrate/suspend and checkpoint files

Steven Timm timm at fnal.gov
Fri Aug 8 07:03:55 PDT 2014


We basically decided to go back to the stock RHEL/CentOS/Scientific
Linux kernels.  Performance with the 3.10 kernel was better, but we
never could get migration to work reliably.  Performance of the old
kernel is still very bad, but at least we can migrate cleanly, and
in the past few errata updates Red Hat has fixed it so that at least
the kernel doesn't crash the whole machine under those conditions.

Steve Timm
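
For anyone following up on the "save" operation mentioned in the
quoted reply below: a minimal sketch of what a save-to-file and
restore cycle looks like when driven by hand through virsh.  This is
only an illustration, not necessarily what the OpenNebula driver
does.  The domain name "one-42" and the checkpoint path in the usage
note are hypothetical, and the helper names are made up for this
example.

```python
import subprocess


def save_cmd(domain, state_file):
    # "virsh save" stops the domain and writes its memory state to a file.
    return ["virsh", "save", domain, state_file]


def restore_cmd(state_file):
    # "virsh restore" starts a domain again from a previously saved state file.
    return ["virsh", "restore", state_file]


def checkpoint_and_resume(domain, state_file, run=subprocess.check_call):
    # 'run' is injectable so the sequence can be inspected without a hypervisor.
    run(save_cmd(domain, state_file))
    run(restore_cmd(state_file))
```

Calling checkpoint_and_resume("one-42", "/tmp/checkpoint") on a test
host (assumed names) would run the two commands in order, which is a
way to try reproducing the clock problem outside of OpenNebula.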

On Fri, 8 Aug 2014, Jaime Melis wrote:

> Hi Steven,
> unfortunately I'm not able to help with most of the email, however I can
> tell you that the underlying operation for checkpointing is the "save"
> operation.
> 
> http://wiki.libvirt.org/page/VM_lifecycle
> 
> regards,
> Jaime
> 
> 
> 
> On Thu, Jul 24, 2014 at 4:00 PM, Steven Timm <timm at fnal.gov> wrote:
>
>       When OpenNebula creates a checkpoint file as part of either
>       a onevm migrate or a onevm suspend, what libvirt function
>       is it calling to do the checkpoint?
>
>       We are seeing some issues on our new Ivy Bridge hardware:
>       sometimes, in the process of a (non-live) migration,
>       the clock gets confused in such a way that when the
>       virtual machine starts from the checkpoint file
>       it hangs and the kvm process uses 100% of cpu for
>       a day or more, after which it usually resolves itself.  In some
>       cases we see the clock jump very far into the future (year 2598),
>       which in itself can confuse a linux vm enough to hang it.
>
>       Any clues on what OpenNebula /libvirt are doing under the
>       covers?
>       Is there any reason to suspect that on Ivy Bridge hardware,
>       in which there are some 60 different cpu frequencies available
>       for cpu scaling, the rapidly fluctuating clock speeds might
>       get us into trouble--i.e. suspending the machine on one clock
>       frequency and bringing it back on a different clock frequency?
>
>       Does anyone have experience in migrating between hardware
>       generations... Ivy Bridge -> Westmere and vice versa?
>
>       Finally, has anyone run a successful combination of kernel 3.10
>       or greater and RHEL6/Centos 6/Sci. Linux 6?
>       (In particular do the stock versions of libvirt and qemu-kvm
>       play nice with the 3.10 kernel)?
>       The 2.6.32 kernel that comes with RHEL6/Centos6/Sci Linux 6 is
>       just not up to dealing with virtualization on Ivy Bridge
>       machines, and it has some trouble on Sandy Bridge too.
>
>       Thanks
>
>       Steve Timm
> 
> 
>
>       ------------------------------------------------------------------
>       Steven C. Timm, Ph.D  (630) 840-8525
>       timm at fnal.gov  http://home.fnal.gov/~timm/
>       Fermilab Scientific Computing Division, Scientific Computing
>       Services Quad.
>       Grid and Cloud Services Dept., Associate Dept. Head for Cloud
>       Computing
>       _______________________________________________
>       Users mailing list
>       Users at lists.opennebula.org
>       http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
> 
> 
> 
> 
> --
> Jaime Melis
> Project Engineer
> OpenNebula - Flexible Enterprise Cloud Made Simple
> www.OpenNebula.org | jmelis at opennebula.org
> 
>

------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm at fnal.gov  http://home.fnal.gov/~timm/
Fermilab Scientific Computing Division, Scientific Computing Services Quad.
Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
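
Regarding the question in the quoted message above about the number
of cpu frequencies available for scaling: a minimal sketch for
listing them on a host, in case someone wants to compare two
machines before migrating between them.  It assumes the cpufreq
driver in use exposes scaling_available_frequencies under sysfs
(acpi-cpufreq does; other drivers may not), and the function names
are made up for this example.

```python
from pathlib import Path

# Assumed location; present only when the cpufreq driver exports this file.
FREQ_FILE = Path(
    "/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies"
)


def parse_frequencies(text):
    # The file holds whitespace-separated frequencies in kHz.
    return sorted({int(token) for token in text.split()})


def available_frequencies(path=FREQ_FILE):
    # Return an empty list when the file is missing or unreadable.
    try:
        return parse_frequencies(path.read_text())
    except OSError:
        return []
```

len(available_frequencies()) then gives the count of distinct
scaling steps for cpu0 on that host.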

