[one-users] Opennebula 4.0, Ceph cluster migrate vm problem

Campbell, Bill bcampbell at axcess-financial.com
Wed Apr 17 09:35:59 PDT 2013


In our experience, that's because the target hypervisor cannot see the checkpoint file.

For migration, the Ceph driver uses the transfer manager of the 'system' datastore, so we have had to do one of two things:

1.  Share the /var/lib/one/datastores/0 (system datastore) directory between the OpenNebula front-end and the hypervisor nodes.
2.  Change the system datastore to use the ssh transfer manager, then modify the ssh transfer manager's premigrate and postmigrate scripts to copy the deployment/checkpoint files over from the node where the VM was deployed.

We went with option 2 to avoid relying on a shared storage volume (NFS, iSCSI, etc.) for the deployment and configuration files.
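
To make option 2 a bit more concrete, here is a rough sketch of the kind of copy step such a script could perform. It is not our actual script: the function name copy_vm_dir_cmd, the source host name "beta", and the use of tar over ssh are all assumptions for illustration, and the function only prints the command (a dry run) rather than executing it. It also assumes the paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): print the command that would
# copy a VM's system-datastore directory from the source hypervisor to the
# target hypervisor. It prints the command instead of running it, so it can
# be inspected before use.
#   $1 = source host, $2 = target host, $3 = VM directory
copy_vm_dir_cmd() {
    src_host="$1"
    dst_host="$2"
    vm_dir="$3"
    # Stream the directory with tar over ssh; create the target directory first.
    printf 'ssh %s "tar -C %s -cf - ." | ssh %s "mkdir -p %s && tar -C %s -xf -"\n' \
        "$src_host" "$vm_dir" "$dst_host" "$vm_dir" "$vm_dir"
}
```

For example, `copy_vm_dir_cmd beta gamma /var/lib/one/datastores/0/4` prints the pipeline that would stream the VM directory (including the checkpoint file) from beta to gamma. Check how your version's premigrate script receives its arguments before wiring anything like this in.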

----- Original Message -----
From: "Ruben S. Montero" <rsmontero at opennebula.org>
To: users at lists.opennebula.org
Sent: Wednesday, April 17, 2013 11:48:54 AM
Subject: Re: [one-users] Opennebula 4.0, Ceph cluster migrate vm problem

Hi

Can you check that the checkpoint file is created and that oneadmin has
access permissions for it? Could it be an issue with the KVM
configuration (dynamic_ownership or user/group)? From the logs it
seems that the checkpoint file is not created or cannot be read. You
can also try onevm stop/resume to test the checkpoint functionality.
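
As a minimal sketch of that check, the function below reports whether a given checkpoint path exists and is readable by the user running it. The name check_checkpoint is hypothetical, and the function is only meaningful when run on the target hypervisor as oneadmin.

```shell
#!/bin/sh
# Hypothetical helper: report whether a checkpoint file exists and is
# readable by the current user. Returns 0 if readable, 1 if the file is
# missing, 2 if it exists but cannot be read.
check_checkpoint() {
    path="$1"
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    if [ ! -r "$path" ]; then
        echo "unreadable: $path"
        return 2
    fi
    echo "ok: $path"
}
```

For example, `check_checkpoint /var/lib/one/datastores/0/4/checkpoint` on the target host. A "missing" result points at the file never reaching that host; "unreadable" points at ownership or permissions.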

Cheers

Ruben

On Tue, Apr 16, 2013 at 4:50 PM, Stefan Ivanov <s.ivanov at maxtelecom.bg> wrote:
> Hello all.
>
> I have a problem with migration of virtual machines. When I try to migrate
> from host to host I get this error:
>     VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore
> /var/lib/one//datastores/0/4/checkpoint gamma 4 gamma
> Tue Apr 16 17:46:30 2013 [VMM][E]: restore: Command "virsh --connect
> qemu:///system restore /var/lib/one//datastores/0/4/checkpoint" failed:
> error: Failed to restore domain from /var/lib/one//datastores/0/4/checkpoint
>
> Here is my datastore info:
>     DATASTORE 107 INFORMATION
> ID             : 107
> NAME           : ceph_data
> USER           : oneadmin
> GROUP          : oneadmin
> CLUSTER        : -
> TYPE           : IMAGE
> DS_MAD         : ceph
> TM_MAD         : ceph
> BASE PATH      : /var/lib/one/datastores/107
> DISK_TYPE      : RBD
>
> PERMISSIONS
> OWNER          : um-
> GROUP          : u--
> OTHER          : ---
>
> DATASTORE TEMPLATE
> DISK_TYPE="RBD"
> DS_MAD="ceph"
> HOST="alpha"
> POOL_NAME="data"
> TM_MAD="ceph"
> TYPE="IMAGE_DS"
>
>
> Here is my error log:
>     Tue Apr 16 17:46:25 2013 [LCM][I]: New VM state is SAVE_MIGRATE
> Tue Apr 16 17:46:27 2013 [VMM][I]: ExitCode: 0
> Tue Apr 16 17:46:27 2013 [VMM][I]: Successfully execute virtualization
> driver operation: save.
> Tue Apr 16 17:46:27 2013 [VMM][I]: ExitCode: 0
> Tue Apr 16 17:46:27 2013 [VMM][I]: Successfully execute network driver
> operation: clean.
> Tue Apr 16 17:46:29 2013 [LCM][I]: New VM state is PROLOG_MIGRATE
> Tue Apr 16 17:46:29 2013 [TM][I]: ExitCode: 0
> Tue Apr 16 17:46:29 2013 [TM][I]: ExitCode: 0
> Tue Apr 16 17:46:30 2013 [LCM][I]: New VM state is BOOT
> Tue Apr 16 17:46:30 2013 [VMM][I]: ExitCode: 0
> Tue Apr 16 17:46:30 2013 [VMM][I]: Successfully execute network driver
> operation: pre.
> Tue Apr 16 17:46:30 2013 [VMM][I]: Command execution fail:
> /var/tmp/one/vmm/kvm/restore /var/lib/one//datastores/0/4/checkpoint gamma 4
> gamma
> Tue Apr 16 17:46:30 2013 [VMM][E]: restore: Command "virsh --connect
> qemu:///system restore /var/lib/one//datastores/0/4/checkpoint" failed:
> error: Failed to restore domain from /var/lib/one//datastores/0/4/checkpoint
> Tue Apr 16 17:46:30 2013 [VMM][I]: error: Failed to create file
> '/var/lib/one//datastores/0/4/checkpoint': No such file or directory
> Tue Apr 16 17:46:30 2013 [VMM][E]: Could not restore from
> /var/lib/one//datastores/0/4/checkpoint
> Tue Apr 16 17:46:30 2013 [VMM][I]: ExitCode: 1
> Tue Apr 16 17:46:30 2013 [VMM][I]: Failed to execute virtualization driver
> operation: restore.
> Tue Apr 16 17:46:30 2013 [VMM][E]: Error restoring VM: Could not restore
> from /var/lib/one//datastores/0/4/checkpoint
> Tue Apr 16 17:46:32 2013 [DiM][I]: New VM state is FAILED
>
> Any ideas what the problem is?
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org



-- 
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula


