[one-users] Nebula 4.0.1 Xen 4.1.4 Debian Wheezy - MIGRATION problem..

Javier Fontan jfontan at opennebula.org
Mon Jun 24 03:52:31 PDT 2013


I cannot find a reason why cold migration is not working.
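The one thing that stands out is the different CPU models (a Xeon E5430 on one
host and a Core i5 760 on the other): a checkpoint saved on a CPU with features
the destination lacks can make xc_restore fail. As a rough check (just a
sketch; the host names nebula0/nebula1 come from your logs), you could compare
the CPU flags on both hosts and retry the restore by hand on the destination to
see the full xend error:

  ssh nebula0 'grep -m1 flags /proc/cpuinfo' > /tmp/flags.nebula0
  ssh nebula1 'grep -m1 flags /proc/cpuinfo' > /tmp/flags.nebula1
  diff /tmp/flags.nebula0 /tmp/flags.nebula1

  # on the destination host, while the checkpoint file is still there
  sudo /usr/sbin/xm restore /var/lib/one//datastores/0/29/checkpoint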

Yes, live migration only works with shared storage.
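Both hosts have to see the same system datastore path. A minimal sketch of one
way to set it up (assuming an NFS export from the frontend; the host name
"frontend" and the export options are only examples, adjust them to your
setup):

  # on the frontend, in /etc/exports:
  #   /var/lib/one/datastores  nebula0(rw,sync,no_subtree_check) nebula1(rw,sync,no_subtree_check)

  # on both hosts, mount it at the same path:
  mount -t nfs frontend:/var/lib/one/datastores /var/lib/one/datastores

With that in place the system datastore should use the shared transfer driver
(TM_MAD="shared") so OpenNebula does not try to copy or move the images between
hosts.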

On Thu, Jun 13, 2013 at 11:03 AM, Jacek Jarosiewicz
<nebula at supermedia.pl> wrote:
> both hosts are exactly the same software-wise (same versions of OS, same
> distributions, same versions of opennebula, same versions of xen).
>
> processors are different though, one host has Intel Xeon E5430, and the
> other has Intel Core i5 760.
>
> so live migration can be done only with shared storage?
>
> J
>
>
> On 06/13/2013 10:14 AM, Javier Fontan wrote:
>>
>> In live migration nobody copies the image; it needs to reside on a
>> shared filesystem mounted on both hosts.
>>
>> The cold migration problem is a bit trickier, as it suspends the VM,
>> copies everything and starts it again on the new host. Can you check
>> that both hosts have the exact same version of xen? Check also that
>> the processors are the same.
>>
>>
>> On Thu, Jun 13, 2013 at 9:17 AM, Jacek Jarosiewicz <nebula at supermedia.pl>
>> wrote:
>>>
>>> Hi,
>>>
>>> No, it's not a persistent disk. It's just a regular OS image.
>>> Yes - it doesn't get copied to the other nebula host. But it seems like
>>> it doesn't even try to copy the image. The live migration error appears
>>> almost immediately, and the VM keeps running on the original host.
>>>
>>> I'm not entirely sure whether it's nebula's job to copy the image, or
>>> Xen's..?
>>>
>>> And the other - cold migration - doesn't work either.. :(
>>> It copies the image and the checkpoint file to the other host, but
>>> when it tries to boot the VM I get the error below..
>>>
>>> Cheers,
>>> J
>>>
>>>
>>> On 12.06.2013 18:29, Javier Fontan wrote:
>>>>
>>>>
>>>> It looks like it cannot find an image file:
>>>>
>>>> VmError: Device 51712 (vbd) could not be connected.
>>>> /var/lib/one//datastores/0/28/disk.0 does not exist.
>>>>
>>>> Is that image a persistent disk? In that case, is it located in a
>>>> shared datastore that is not mounted on that host?
>>>>
>>>> Cheers
>>>>
>>>> On Wed, Jun 12, 2013 at 3:13 PM, Jacek Jarosiewicz
>>>> <nebula at supermedia.pl>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a problem migrating VMs between hosts. Both cold and live
>>>>> migration fail.
>>>>>
>>>>> Cold migration log is:
>>>>> Wed Jun 12 12:32:24 2013 [LCM][I]: New VM state is RUNNING
>>>>> Wed Jun 12 12:32:41 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:39:56 2013 [LCM][I]: New VM state is SAVE_MIGRATE
>>>>> Wed Jun 12 12:40:40 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:40:40 2013 [VMM][I]: Successfully execute virtualization driver operation: save.
>>>>> Wed Jun 12 12:40:40 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:40:40 2013 [VMM][I]: Successfully execute network driver operation: clean.
>>>>> Wed Jun 12 12:40:40 2013 [LCM][I]: New VM state is PROLOG_MIGRATE
>>>>> Wed Jun 12 12:40:40 2013 [TM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:41:18 2013 [LCM][E]: monitor_done_action, VM in a wrong state
>>>>> Wed Jun 12 12:46:29 2013 [LCM][E]: monitor_done_action, VM in a wrong state
>>>>> Wed Jun 12 12:51:40 2013 [LCM][E]: monitor_done_action, VM in a wrong state
>>>>> Wed Jun 12 12:56:09 2013 [TM][I]: mv: Moving nebula1:/var/lib/one/datastores/0/29 to nebula0:/var/lib/one/datastores/0/29
>>>>> Wed Jun 12 12:56:09 2013 [TM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:56:09 2013 [LCM][I]: New VM state is BOOT
>>>>> Wed Jun 12 12:56:09 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 12:56:09 2013 [VMM][I]: Successfully execute network driver operation: pre.
>>>>> Wed Jun 12 12:56:32 2013 [VMM][I]: Command execution fail: /var/tmp/one/vmm/xen4/restore /var/lib/one//datastores/0/29/checkpoint nebula0 29 nebula0
>>>>> Wed Jun 12 12:56:32 2013 [VMM][E]: restore: Command "sudo /usr/sbin/xm restore /var/lib/one//datastores/0/29/checkpoint" failed: Error: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
>>>>> Wed Jun 12 12:56:32 2013 [VMM][E]: Could not restore from /var/lib/one//datastores/0/29/checkpoint
>>>>> Wed Jun 12 12:56:32 2013 [VMM][I]: ExitCode: 1
>>>>> Wed Jun 12 12:56:32 2013 [VMM][I]: Failed to execute virtualization driver operation: restore.
>>>>> Wed Jun 12 12:56:32 2013 [VMM][E]: Error restoring VM: Could not restore from /var/lib/one//datastores/0/29/checkpoint
>>>>> Wed Jun 12 12:56:33 2013 [DiM][I]: New VM state is FAILED
>>>>>
>>>>> and in xend.log I see:
>>>>>
>>>>> [2013-06-12 12:56:32 24698] ERROR (XendCheckpoint:357) /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 309, in restore
>>>>>     forkHelper(cmd, fd, handler.handler, True)
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 411, in forkHelper
>>>>>     raise XendError("%s failed" % string.join(cmd))
>>>>> XendError: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
>>>>> [2013-06-12 12:56:32 24698] ERROR (XendDomain:1194) Restore failed
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomain.py", line 1178, in domain_restore_fd
>>>>>     dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 358, in restore
>>>>>     raise exn
>>>>> XendError: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
>>>>>
>>>>>
>>>>> ..and with live migration I see:
>>>>>
>>>>> Wed Jun 12 12:27:16 2013 [LCM][I]: New VM state is RUNNING
>>>>> Wed Jun 12 12:27:32 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:34:26 2013 [LCM][I]: New VM state is MIGRATE
>>>>> Wed Jun 12 13:34:26 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:34:26 2013 [VMM][I]: Successfully execute transfer manager driver operation: tm_premigrate.
>>>>> Wed Jun 12 13:34:26 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:34:26 2013 [VMM][I]: Successfully execute network driver operation: pre.
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute virtualization driver operation: migrate.
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute network driver operation: clean.
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute network driver operation: post.
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
>>>>> Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute transfer manager driver operation: tm_postmigrate.
>>>>> Wed Jun 12 13:37:35 2013 [LCM][I]: New VM state is RUNNING
>>>>>
>>>>> and in xend.log:
>>>>>
>>>>> [2013-06-12 13:37:39 9651] ERROR (XendCheckpoint:357) Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 346, in restore
>>>>>     dominfo.waitForDevices() # Wait for backends to set up
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomainInfo.py", line 1237, in waitForDevices
>>>>>     self.getDeviceController(devclass).waitForDevices()
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/DevController.py", line 140, in waitForDevices
>>>>>     return map(self.waitForDevice, self.deviceIDs())
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/DevController.py", line 165, in waitForDevice
>>>>>     "%s" % (devid, self.deviceClass, err))
>>>>> VmError: Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.
>>>>> [2013-06-12 13:37:39 9651] ERROR (XendDomain:1194) Restore failed
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomain.py", line 1178, in domain_restore_fd
>>>>>     dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
>>>>>   File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 358, in restore
>>>>>     raise exn
>>>>> VmError: Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.
>>>>>
>>>>> Any help would be appreciated..
>>>>>
>>>>> Cheers,
>>>>> J
>>>>>
>>>>> --
>>>>> Jacek Jarosiewicz
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at lists.opennebula.org
>>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jacek Jarosiewicz
>>
>>
>>
>>
>
>
> --
> Jacek Jarosiewicz



-- 
Join us at OpenNebulaConf2013 in Berlin from the 24th to the 26th of
September 2013!

Javier Fontán Muiños
Project Engineer
OpenNebula - The Open Source Toolkit for Data Center Virtualization
www.OpenNebula.org | jfontan at opennebula.org | @OpenNebula


