[one-users] cannot migrate virtual machines in one 3.4

Jaime Melis jmelis at opennebula.org
Tue Apr 24 09:13:39 PDT 2012


Hi Carlos,

can you send us some extra debugging info?

Create a new VM, exactly as you did in your previous email, and launch it.

Assuming the VM has been deployed on dellblade01, send us the output of
the following commands:

# in dellblade01
$ ls -Rl /srv/cloud/one/var/datastores/0/<VM_ID>

# in the frontend
$ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
    dellblade01:/srv/cloud/one/var/datastores/0/<VM_ID>/disk.0 \
    dellblade03:/srv/cloud/one/var/datastores/0/<VM_ID>/disk.0
$ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
    dellblade01:/srv/cloud/one/var/datastores/0/<VM_ID> \
    dellblade03:/srv/cloud/one/var/datastores/0/<VM_ID>
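
For context, what the ssh mv driver does boils down to copying the path
to the destination host and removing it from the source. The following
is only a rough sketch of that idea, not the actual driver code (the
real script adds error handling and directory setup):

# copy the <VM_ID> directory from the source to the destination host...
$ ssh dellblade01 "tar -C /srv/cloud/one/var/datastores/0 -cf - <VM_ID>" | \
    ssh dellblade03 "tar -C /srv/cloud/one/var/datastores/0 -xf -"
# ...then remove the source copy
$ ssh dellblade01 "rm -rf /srv/cloud/one/var/datastores/0/<VM_ID>"

The -xv traces from the commands above will show us which of these steps
(if any) misbehaves in your setup.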

Thanks!

Jaime

On Tue, Apr 24, 2012 at 5:36 PM, Carlos A. <caralla at upv.es> wrote:

> Hi,
>
> I have also tried this option, but I ran into a problem as well.
>
> If I change the system datastore (0) to set TM_MAD to ssh, then create a
> new VM and try to migrate it, the vm.log fragment is as follows:
>
> ------------------------------------------------
> Tue Apr 24 17:17:07 2012 [LCM][I]: New VM state is SAVE_MIGRATE
> Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
> Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute virtualization driver operation: save.
> Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
> Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute network driver operation: clean.
> Tue Apr 24 17:17:11 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
> Tue Apr 24 17:17:11 2012 [TM][I]: ExitCode: 0
> Tue Apr 24 17:17:15 2012 [TM][I]: mv: Moving dellblade01:/srv/cloud/one/var//datastores/0/2985 to dellblade03:/srv/cloud/one/var//datastores/0/2985
> Tue Apr 24 17:17:15 2012 [TM][I]: ExitCode: 0
> Tue Apr 24 17:17:15 2012 [LCM][I]: New VM state is BOOT
> Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 0
> Tue Apr 24 17:17:16 2012 [VMM][I]: Successfully execute network driver operation: pre.
> Tue Apr 24 17:17:16 2012 [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore /srv/cloud/one/var//datastores/0/2985/checkpoint dellblade03 2985 dellblade03
> Tue Apr 24 17:17:16 2012 [VMM][E]: restore: Command "virsh --connect qemu:///system restore /srv/cloud/one/var//datastores/0/2985/checkpoint" failed: error: Failed to restore domain from /srv/cloud/one/var//datastores/0/2985/checkpoint
> Tue Apr 24 17:17:16 2012 [VMM][I]: error: cannot close file: Bad file descriptor
> Tue Apr 24 17:17:16 2012 [VMM][E]: Could not restore from /srv/cloud/one/var//datastores/0/2985/checkpoint
> Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 1
> Tue Apr 24 17:17:16 2012 [VMM][I]: Failed to execute virtualization driver operation: restore.
> Tue Apr 24 17:17:16 2012 [VMM][E]: Error restoring VM: Could not restore from /srv/cloud/one/var//datastores/0/2985/checkpoint
> Tue Apr 24 17:17:16 2012 [DiM][I]: New VM state is FAILED
> ------------------------------------------------
>
> and the transfer.1.migrate file is
>
> ------------------------------------------------
> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985/disk.0 dellblade03:/srv/cloud/one/var//datastores/0/2985/disk.0
> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985 dellblade03:/srv/cloud/one/var//datastores/0/2985
> ------------------------------------------------
>
> Now I have the "checkpoint" file on dellblade03, but not disk.0. This is
> strange, because the transfer.1.migrate file explicitly moves the disk.0
> file but not the checkpoint file. I guess the problem is the order of the
> transfers in this case: moving the whole folder after the disk has already
> been moved deletes the just-moved disk. If I manually create a disk.0
> file, I am able to restore the VM by hand using virsh commands. Is there
> any way to solve this issue?
>
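> A footnote on the ordering hypothesis: the clobbering I suspect is easy
> to reproduce with plain shell. This is a hypothetical illustration with
> made-up paths, not the actual driver logic:
>
> $ mkdir -p src dst
> $ touch src/checkpoint          # checkpoint still in the source dir
> $ touch dst/disk.0              # disk.0 already moved by the first MV step
> $ rsync -a --delete src/ dst/   # a wholesale directory move/sync...
> $ ls dst                        # ...leaves only checkpoint; disk.0 is gone
> checkpoint
>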
> On the other hand, I cannot see why the system datastore's transfer MAD
> needs to be set to ssh.
>
> Regards,
> Carlos A.
>
> On 24/04/12 17:17, Ruben S. Montero wrote:
>
>> Hi,
>>
>> Yes, this may be the problem. Could you check the output of
>> onedatastore show 0 (and 1)? The TM_MAD associated with the datastore
>> should be ssh. If not, could you try to update it (onedatastore
>> update)? There should not be any "shared" keyword, as you suggest.
>>
>> Note that changes to the datastore (i.e. to the actual TM used) are
>> only reflected in new VMs. VMs created before the change will keep the
>> original TM_MAD values...
>>
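>> For example, something like this (an illustrative session; the exact
>> output layout may differ in your installation):
>>
>> $ onedatastore show 0 | grep TM_MAD
>> TM_MAD="shared"        <-- this should be "ssh"
>> $ onedatastore update 0
>> (an editor opens: set TM_MAD="ssh", then save and exit)
>>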
>> Cheers
>>
>> Ruben
>>
>> On Tue, Apr 24, 2012 at 5:07 PM, Carlos A. <caralla at upv.es> wrote:
>>
>>> Hello,
>>>
>>> I am upgrading my ONE 3.2 deployment to ONE 3.4, but I have a problem
>>> with migrating VMs between nodes (not live migration).
>>>
>>> With ONE 3.2 migration worked fine, but now it fails and I cannot
>>> figure out how to solve the problem.
>>>
>>> I have the default datastore, which is a "filesystem" datastore using
>>> the ssh tm_mad (plus the system datastore "0").
>>>
>>> When I migrate the VM, I see the following vm.log fragment:
>>> ------------------------------------------------
>>> Tue Apr 24 16:53:51 2012 [LCM][I]: New VM state is SAVE_MIGRATE
>>> Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>> Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute virtualization driver operation: save.
>>> Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>> Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute network driver operation: clean.
>>> Tue Apr 24 16:53:54 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
>>> Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>> Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>> Tue Apr 24 16:53:55 2012 [LCM][I]: New VM state is BOOT
>>> Tue Apr 24 16:53:55 2012 [VMM][I]: ExitCode: 0
>>> Tue Apr 24 16:53:55 2012 [VMM][I]: Successfully execute network driver operation: pre.
>>> Tue Apr 24 16:53:55 2012 [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore /srv/cloud/one/var//datastores/0/2984/checkpoint dellblade03 2984 dellblade03
>>> ------------------------------------------------
>>>
>>> And the following transfer.1.migrate file appears:
>>> ------------------------------------------------
>>> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2984/disk.0 dellblade03:/srv/cloud/one/var//datastores/0/2984/disk.0
>>> MV shared dellblade01:/srv/cloud/one/var//datastores/0/2984 dellblade03:/srv/cloud/one/var//datastores/0/2984
>>> ------------------------------------------------
>>>
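>>> As I read it, each line of this file has the form (my interpretation
>>> of the output above, not something taken from the documentation):
>>>
>>> MV <tm_mad> <src_host>:<src_path> <dst_host>:<dst_path>
>>>
>>> i.e. each line is dispatched to <remotes>/tm/<tm_mad>/mv, so the second
>>> line above is handled by the "shared" driver rather than the ssh one.
>>>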
>>> The problem is that the disk.0 file is not transferred to dellblade03.
>>> It seems that the file-transfer phase is skipped entirely.
>>>
>>> Moreover, the "shared" keyword appears even though there is no shared
>>> file system (apart from the system datastore, which should not be
>>> treated as shared when moving from one host to another). The checkpoint
>>> file is not moved either.
>>>
>>> Note: migration from one host to the same host works (as expected), so
>>> virsh is able to restore the state of a saved VM.
>>>
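>>> For reference, the manual check was roughly the two commands below.
>>> The restore command is the same one OpenNebula runs; I am assuming the
>>> domain name follows the usual one-<VM_ID> convention:
>>>
>>> $ virsh --connect qemu:///system save one-2984 /srv/cloud/one/var/datastores/0/2984/checkpoint
>>> $ virsh --connect qemu:///system restore /srv/cloud/one/var/datastores/0/2984/checkpoint
>>>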
>>> Any idea on this?
>>>
>>> Thank you in advance.
>>>
>>
>>



-- 
Jaime Melis
Project Engineer
OpenNebula - The Open Source Toolkit for Cloud Computing
www.OpenNebula.org | jmelis at opennebula.org