[one-users] Opennebula 4.0, Ceph cluster migrate vm problem

Ruben S. Montero rsmontero at opennebula.org
Wed Apr 17 09:58:14 PDT 2013


This should go straightaway to the ceph datastore documentation. Thanks Bill!

On Wed, Apr 17, 2013 at 6:35 PM, Campbell, Bill
<bcampbell at axcess-financial.com> wrote:
> In our experience, that's because the target hypervisor cannot see the checkpoint file.
>
> For migration, the Ceph driver uses the transfer manager of the 'system' datastore, so we have to do one of two things:
>
> 1.  Share the /var/lib/one/datastores/0 (system datastore) directory between OpenNebula and the hypervisor nodes
> 2.  Change the system datastore to use the ssh transfer manager, then modify the premigrate and postmigrate scripts for the ssh transfer manager to copy over the deployment/checkpoint files from the deployed node
>
> We went with option 2 to avoid relying on a shared storage volume (NFS, iSCSI, etc.) for holding the deployment and configuration files.
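
For option 2 above, the copy step could be sketched roughly as follows. This is only an illustration, not the actual change made to the ssh transfer manager scripts: the function name checkpoint_copy_commands and the host name "src-node" are hypothetical, and it assumes passwordless ssh/scp between the nodes for the user running it. It only builds the commands and does not execute anything.

```python
import posixpath
import shlex


def checkpoint_copy_commands(src_host, dst_host, vm_dir):
    """Return argv lists that would copy a VM directory between hosts.

    Hypothetical helper: nothing is executed here, the caller decides
    how (and whether) to run the commands.
    """
    vm_dir = vm_dir.rstrip("/")
    parent = posixpath.dirname(vm_dir)
    # Make sure the system datastore directory exists on the target host.
    mkdir_cmd = ["ssh", dst_host, "mkdir -p " + shlex.quote(parent)]
    # Copy the whole VM directory (deployment and checkpoint files).
    copy_cmd = ["scp", "-r", f"{src_host}:{vm_dir}", f"{dst_host}:{parent}"]
    return [mkdir_cmd, copy_cmd]


if __name__ == "__main__":
    # "gamma" and the path come from the log in this thread;
    # "src-node" is a made-up name for the source hypervisor.
    for cmd in checkpoint_copy_commands(
        "src-node", "gamma", "/var/lib/one/datastores/0/4"
    ):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to try them by hand as oneadmin before wiring anything into the premigrate script.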
>
> ----- Original Message -----
> From: "Ruben S. Montero" <rsmontero at opennebula.org>
> To: users at lists.opennebula.org
> Sent: Wednesday, April 17, 2013 11:48:54 AM
> Subject: Re: [one-users] Opennebula 4.0, Ceph cluster migrate vm problem
>
> Hi
>
> Can you check that the checkpoint file is created and that oneadmin has
> access permissions for it? Could it be an issue with the KVM
> configuration (dynamic_ownership or user/group)? From the logs it
> seems that the checkpoint file is not created or cannot be read. You
> can also try onevm stop/resume to test the checkpoint functionality.
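
A quick way to run the check suggested above is a small script like this one, started as oneadmin on the target hypervisor. It is a minimal sketch: describe_checkpoint is a hypothetical helper, and it only reports whether the file exists, who owns it, and whether the current user can read it. It does not inspect the libvirt/KVM configuration itself.

```python
import os
import pwd


def describe_checkpoint(path):
    """Report existence, owner and readability of a checkpoint file."""
    if not os.path.exists(path):
        return {"exists": False, "owner": None, "readable": False}
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry on this host; fall back to the number.
        owner = str(st.st_uid)
    # os.access checks the permissions of the user running the script.
    return {"exists": True, "owner": owner, "readable": os.access(path, os.R_OK)}


if __name__ == "__main__":
    # Path taken from the error log in this thread.
    print(describe_checkpoint("/var/lib/one/datastores/0/4/checkpoint"))
```

If the file exists but is owned by another user or is not readable, the dynamic_ownership and user/group settings are the next thing to look at.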
>
> Cheers
>
> Ruben
>
> On Tue, Apr 16, 2013 at 4:50 PM, Stefan Ivanov <s.ivanov at maxtelecom.bg> wrote:
>> Hello all.
>>
>> I have a problem with migration of virtual machines. When I try to migrate
>> from host to host I get this error:
>>     [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore
>> /var/lib/one//datastores/0/4/checkpoint gamma 4 gamma
>> Tue Apr 16 17:46:30 2013 [VMM][E]: restore: Command "virsh --connect
>> qemu:///system restore /var/lib/one//datastores/0/4/checkpoint" failed:
>> error: Failed to restore domain from /var/lib/one//datastores/0/4/checkpoint
>>
>> Here is my datastore info:
>>     DATASTORE 107 INFORMATION
>> ID             : 107
>> NAME           : ceph_data
>> USER           : oneadmin
>> GROUP          : oneadmin
>> CLUSTER        : -
>> TYPE           : IMAGE
>> DS_MAD         : ceph
>> TM_MAD         : ceph
>> BASE PATH      : /var/lib/one/datastores/107
>> DISK_TYPE      : RBD
>>
>> PERMISSIONS
>> OWNER          : um-
>> GROUP          : u--
>> OTHER          : ---
>>
>> DATASTORE TEMPLATE
>> DISK_TYPE="RBD"
>> DS_MAD="ceph"
>> HOST="alpha"
>> POOL_NAME="data"
>> TM_MAD="ceph"
>> TYPE="IMAGE_DS"
>>
>>
>> Here is my error log:
>>     Tue Apr 16 17:46:25 2013 [LCM][I]: New VM state is SAVE_MIGRATE
>> Tue Apr 16 17:46:27 2013 [VMM][I]: ExitCode: 0
>> Tue Apr 16 17:46:27 2013 [VMM][I]: Successfully execute virtualization
>> driver operation: save.
>> Tue Apr 16 17:46:27 2013 [VMM][I]: ExitCode: 0
>> Tue Apr 16 17:46:27 2013 [VMM][I]: Successfully execute network driver
>> operation: clean.
>> Tue Apr 16 17:46:29 2013 [LCM][I]: New VM state is PROLOG_MIGRATE
>> Tue Apr 16 17:46:29 2013 [TM][I]: ExitCode: 0
>> Tue Apr 16 17:46:29 2013 [TM][I]: ExitCode: 0
>> Tue Apr 16 17:46:30 2013 [LCM][I]: New VM state is BOOT
>> Tue Apr 16 17:46:30 2013 [VMM][I]: ExitCode: 0
>> Tue Apr 16 17:46:30 2013 [VMM][I]: Successfully execute network driver
>> operation: pre.
>> Tue Apr 16 17:46:30 2013 [VMM][I]: Command execution fail:
>> /var/tmp/one/vmm/kvm/restore /var/lib/one//datastores/0/4/checkpoint gamma 4
>> gamma
>> Tue Apr 16 17:46:30 2013 [VMM][E]: restore: Command "virsh --connect
>> qemu:///system restore /var/lib/one//datastores/0/4/checkpoint" failed:
>> error: Failed to restore domain from /var/lib/one//datastores/0/4/checkpoint
>> Tue Apr 16 17:46:30 2013 [VMM][I]: error: Failed to create file
>> '/var/lib/one//datastores/0/4/checkpoint': No such file or directory
>> Tue Apr 16 17:46:30 2013 [VMM][E]: Could not restore from
>> /var/lib/one//datastores/0/4/checkpoint
>> Tue Apr 16 17:46:30 2013 [VMM][I]: ExitCode: 1
>> Tue Apr 16 17:46:30 2013 [VMM][I]: Failed to execute virtualization driver
>> operation: restore.
>> Tue Apr 16 17:46:30 2013 [VMM][E]: Error restoring VM: Could not restore
>> from /var/lib/one//datastores/0/4/checkpoint
>> Tue Apr 16 17:46:32 2013 [DiM][I]: New VM state is FAILED
>>
>> Any ideas what the problem is?
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
>
>
> --
> Ruben S. Montero, PhD
> Project co-Lead and Chief Architect
> OpenNebula - The Open Source Solution for Data Center Virtualization
> www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula





