[one-users] Stop-Resume failing with shared storage

Tino Vazquez tinova at fdi.ucm.es
Thu Feb 25 06:24:47 PST 2010


Hi Ranga,

If you are using a shared repository (I'll assume you use NFS or a
similar distributed FS), then the "<vmid>/images/" directory is shared
between all the remote hosts, so there is no need to move the checkpoint
files: they should already be available on all the nodes.
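
As a quick sanity check of that assumption, here is a minimal sketch that
tests whether a checkpoint file is visible under a VM's images directory.
The helper names (checkpoint_path, has_checkpoint) are hypothetical, and I
am assuming the checkpoint file is simply called "checkpoint"; adjust the
name if your driver writes something else.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OpenNebula.
# Assumption: the layout is <vm_root>/<vmid>/images/ and the
# checkpoint file inside it is named "checkpoint".

# Print the path where the checkpoint file is expected to live.
checkpoint_path() {
    vm_root=$1
    vmid=$2
    printf '%s/%s/images/checkpoint\n' "$vm_root" "$vmid"
}

# Succeed (exit status 0) only if the checkpoint is a regular file.
has_checkpoint() {
    [ -f "$(checkpoint_path "$1" "$2")" ]
}
```

Running "has_checkpoint /var/lib/one <vmid>" on each host after a stop or
suspend should succeed everywhere if the directory is really shared; if it
only succeeds on the host that saved the VM, the mount is the first thing
to look at.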

Please send us the log of the VM that is failing so we can try and
reproduce the problem.

Regards,

-Tino

--
Constantino Vázquez, Grid & Virtualization Technology
Engineer/Researcher: http://www.dsa-research.org/tinova
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org



On Thu, Feb 18, 2010 at 2:44 AM, Rangababu Chakravarthula
<rbabu at hexagrid.com> wrote:
> We are using shared storage as defined here
>
> http://www.opennebula.org/doku.php?id=documentation:rel1.2:sm#samplea_shared_image_repository
>
> When we run onevm stop or onevm suspend it tries to do SAVE_STOP and
> SAVE_SUSPEND and creates a checkpoint file on the host
> /var/lib/one/<vmid>/images/
>
> and in the logs we see
> tm_mv.sh: Will not move, is not saving image
>
> I think it is trying to move the checkpoint file back to the management node
> and based on logic in tm_mv.sh it is not moving.
>
> Later, when we try to do onevm resume, ONE picks a different host and tries
> to move the checkpoint file from the management node to the new host and
> again says "Will not move, is not saving image", and on the host it fails to
> bring up the VM since there is no checkpoint file on the new host.
>
> How can we ask ONE not to resume from the checkpoint file but instead load
> from the disk file that is in the template?
>
> Ranga


