[one-users] Migration issues with OpenNebula 3.8.1
Martin Herfurt
martin.herfurt at toothr.com
Mon Mar 4 03:15:05 PST 2013
Hi Javier,
thank you for your response!
Here is the VM log of the failing migration:
Mon Mar 4 12:05:08 2013 [VMM][I]: ExitCode: 0
Mon Mar 4 12:05:08 2013 [VMM][D]: Monitor Information:
CPU : 9
Memory: 2097152
Net_TX: 1550
Net_RX: 6594
Mon Mar 4 12:05:14 2013 [LCM][I]: New VM state is SAVE_MIGRATE
Mon Mar 4 12:05:45 2013 [VMM][I]: ExitCode: 0
Mon Mar 4 12:05:45 2013 [VMM][I]: Successfully execute virtualization
driver operation: save.
Mon Mar 4 12:05:45 2013 [VMM][I]: ExitCode: 0
Mon Mar 4 12:05:45 2013 [VMM][I]: Successfully execute network driver
operation: clean.
Mon Mar 4 12:05:45 2013 [LCM][I]: New VM state is PROLOG_MIGRATE
Mon Mar 4 12:05:45 2013 [TM][I]: ExitCode: 0
Mon Mar 4 12:05:51 2013 [TM][I]: mv: Moving
server1:/var/lib/one/datastores/0/37 to server2:/var/lib/one/datastores/0/37
Mon Mar 4 12:05:51 2013 [TM][I]: ExitCode: 0
Mon Mar 4 12:05:51 2013 [LCM][I]: New VM state is BOOT
Mon Mar 4 12:05:51 2013 [VMM][I]: ExitCode: 0
Mon Mar 4 12:05:51 2013 [VMM][I]: Successfully execute network driver
operation: pre.
Mon Mar 4 12:05:51 2013 [VMM][I]: Command execution fail:
/var/lib/one/vmm/kvm/restore /var/lib/one//datastores/0/37/checkpoint
server2 37 server2
Mon Mar 4 12:05:51 2013 [VMM][E]: restore: Command "virsh --connect
qemu:///system restore /var/lib/one//datastores/0/37/checkpoint" failed:
error: Failed to restore domain from
/var/lib/one//datastores/0/37/checkpoint
Mon Mar 4 12:05:51 2013 [VMM][I]: error: Unable to allow access for
disk path /var/lib/one//datastores/0/37/disk.0: No such file or directory
Mon Mar 4 12:05:51 2013 [VMM][E]: Could not restore from
/var/lib/one//datastores/0/37/checkpoint
Mon Mar 4 12:05:51 2013 [VMM][I]: ExitCode: 1
Mon Mar 4 12:05:51 2013 [VMM][I]: Failed to execute virtualization
driver operation: restore.
Mon Mar 4 12:05:51 2013 [VMM][E]: Error restoring VM: Could not restore
from /var/lib/one//datastores/0/37/checkpoint
Mon Mar 4 12:05:52 2013 [DiM][I]: New VM state is FAILED
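The TM mv step reports ExitCode 0, so to convince myself what that step is
supposed to produce I put together a tiny local simulation (temporary
directories stand in for the two system datastores; the file names are just
placeholders mirroring my VM 37):

```shell
# Local sketch of what the ssh TM "mv" step should achieve: the whole VM
# directory, disk image included, ends up at the destination.
SRC=$(mktemp -d)   # stands in for server1:/var/lib/one/datastores/0
DST=$(mktemp -d)   # stands in for server2:/var/lib/one/datastores/0
mkdir -p "$SRC/37"
touch "$SRC/37/disk.0" "$SRC/37/checkpoint" "$SRC/37/deployment.0"

# Move the whole VM directory, as the driver is expected to do.
mv "$SRC/37" "$DST/37"

ls "$DST/37"       # expect checkpoint, deployment.0 AND disk.0
```

So if the move had really carried the whole directory over, disk.0 should be
sitting next to the checkpoint on server2.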
Here are the configurations of the datastores on both systems:
server1:
[oneadmin at server1 ~]$ onedatastore show 0
DATASTORE 0 INFORMATION
ID : 0
NAME : system
USER : oneadmin
GROUP : oneadmin
CLUSTER : myCluster
DS_MAD : -
TM_MAD : ssh
BASE PATH : /var/lib/one/datastores/0
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
DS_MAD="-"
SYSTEM="YES"
TM_MAD="ssh"
IMAGES
[oneadmin at server1 ~]$ onedatastore show 1
DATASTORE 1 INFORMATION
ID : 1
NAME : default
USER : oneadmin
GROUP : oneadmin
CLUSTER : myCluster
DS_MAD : fs
TM_MAD : ssh
BASE PATH : /var/lib/one/datastores/1
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : u--
DATASTORE TEMPLATE
DS_MAD="fs"
TM_MAD="ssh"
server2:
[oneadmin at server2 ~]$ onedatastore show 0
DATASTORE 0 INFORMATION
ID : 0
NAME : system
USER : oneadmin
GROUP : oneadmin
CLUSTER : myCluster
DS_MAD : -
TM_MAD : ssh
BASE PATH : /var/lib/one/datastores/0
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
DS_MAD="-"
SYSTEM="YES"
TM_MAD="ssh"
IMAGES
[oneadmin at server2 ~]$ onedatastore show 1
DATASTORE 1 INFORMATION
ID : 1
NAME : default
USER : oneadmin
GROUP : oneadmin
CLUSTER : myCluster
DS_MAD : fs
TM_MAD : ssh
BASE PATH : /var/lib/one/datastores/1
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : u--
DATASTORE TEMPLATE
DS_MAD="fs"
TM_MAD="ssh"
IMAGES
0
When investigating the issue, I found no disk.0 file in the destination
datastore. IMHO, that is the cause of the restore failure, since the
deployment.0 file references it.
[oneadmin at server2 37]$ ls -lh /var/lib/one/datastores/0/37/
total 119M
-rw-rw-r-- 1 root root 119M Mar 4 2013 checkpoint
-rw-rw-r-- 1 oneadmin oneadmin 675 Mar 4 2013 deployment.0
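For reference, deployment.0 is the libvirt domain XML, and the disk <source>
path inside it is what virsh tries to open on restore. A quick way to list
those paths (the sample file below is a made-up excerpt modeled on a typical
deployment.0, not a copy of my actual one):

```shell
# Write a sample deployment.0 excerpt (illustrative, not my real file).
cat > /tmp/deployment.0.sample <<'EOF'
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/one//datastores/0/37/disk.0'/>
      <target dev='hda'/>
    </disk>
  </devices>
</domain>
EOF

# List the file paths the restore will need to find on the destination host.
grep -o "source file='[^']*'" /tmp/deployment.0.sample
```

Running the same grep against the real deployment.0 on server2 confirms it
points at the missing disk.0.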
What confuses me is that the checkpoint file is owned by root. Could
this be part of the issue?
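For what it's worth, root ownership like this is often libvirt's doing: with
dynamic_ownership enabled in /etc/libvirt/qemu.conf, libvirt chowns the files
it touches to the configured qemu user. A sketch of the relevant settings
(the values shown are illustrative, not read from my hosts):

```
# /etc/libvirt/qemu.conf (relevant settings; values here are illustrative)
# The user/group qemu processes run as; OpenNebula KVM setups commonly
# use oneadmin.
user = "oneadmin"
group = "oneadmin"
# When set to 1, libvirt changes ownership of image/checkpoint files itself,
# which can leave files root-owned after a save.
dynamic_ownership = 0
```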
Thanks for your help!
Martin
On 3/4/2013 11:27, Javier Fontan wrote:
> Can you send us the log file of the VM that had this problem? Also
> tell us the configuration you have for the storage, that is, the
> drivers you have configured for both the system datastore (0) and the
> datastore where you have the images.
>
> On Thu, Feb 28, 2013 at 3:34 PM, Martin Herfurt
> <martin.herfurt at toothr.com> wrote:
>> Hello,
>> as I am currently starting to get acquainted with OpenNebula, I need some
>> help with save migration.
>>
>> I have set up a cluster with two identical servers. I installed CentOS 6.3
>> with the OpenNebula packages from the repository (version 3.8.1-2.6) on each
>> of them. Using the Sunstone frontend on server1, I was able to add the two
>> hosts (server1 and server2) to the cluster. Testing with the
>> minimal KVM image, I managed to create a template and could deploy VMs on
>> both hosts (the datastore uses the ssh transfer manager).
>>
>> When trying to migrate a VM from server1 to server2, the VM status changes
>> to SAVE_MIGRATE and after a few seconds to FAILED.
>> After doing some research, I found out that during the migration process
>> the disk image is not transferred to the target host's system datastore. All
>> that is there is a file called checkpoint and a file called deployment.0
>> - so the restore operation fails because the disk.0 file is missing.
>>
>> Has anyone experienced these issues as well, or does anyone know what causes them?
>>
>> Thanks for your help,
>> Martin
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
>