[one-users] cannot migrate virtual machines in one 3.4

Ruben S. Montero rsmontero at opennebula.org
Wed Apr 25 13:57:58 PDT 2012


Hi,

Thanks for the update. We have changed the TM drivers so that they no longer
set overly permissive permissions on the disk/checkpoint files. This could be
why migration worked in the previous installation.

Cheers,

Ruben

On Wed, Apr 25, 2012 at 10:17 AM, Carlos A. <caralla at upv.es> wrote:
> Hello,
>
> I have finally managed to solve the problem.
>
> It was a problem of permissions and libvirt. I had to set oneadmin as the
> user that runs kvm, and disable dynamic ownership. With dynamic ownership
> enabled, the ownership of disk.0 was changed to root when a VM was saved,
> and only restored to oneadmin once the VM was restored. While root was the
> owner, oneadmin had no permission to move the file. Deactivating dynamic
> ownership solves this issue, as the owner of the files stays oneadmin (the
> same user that is used to run the VMs).
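
A quick way to confirm the condition described above is to list the files in a
VM directory that are not owned by the user the TM driver runs as. The sketch
below is only an illustration: files_not_owned_by is a hypothetical helper, not
part of OpenNebula, and it assumes GNU stat. The message does not name the
libvirt file that was changed; in a stock libvirt setup the relevant entries
are usually user, group and dynamic_ownership in /etc/libvirt/qemu.conf, so
treat that location as an assumption too.

```shell
# files_not_owned_by DIR USER
# Print every regular file under DIR whose owner name differs from USER.
# Hypothetical helper for illustration; relies on GNU stat (-c %U).
files_not_owned_by() {
    dir="$1"
    user="$2"
    find "$dir" -type f | while IFS= read -r f; do
        if [ "$(stat -c %U "$f")" != "$user" ]; then
            echo "$f"
        fi
    done
}
```

For example, files_not_owned_by /srv/cloud/one/var/datastores/0/2994 oneadmin
run on the source host right after the save step would list disk.0 if its
ownership had been changed.
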
>
> What I still cannot explain is why migration worked properly in the
> previous installation of ONE.
>
> Regards,
> Carlos A.
>
> On 25/04/12 08:38, Carlos A. wrote:
>
> Hi,
>
> $ ls -Rl /srv/cloud/one/var/datastores/0/2994
> /srv/cloud/one/var/datastores/0/2994:
> total 1055880
> -rw-r--r-- 1 oneadmin     oneadmin        653 2012-04-24 18:47 deployment.0
> -rw-r----- 1 libvirt-qemu kvm      1081212928 2012-04-24 18:47 disk.0
>
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0 \
>     dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
>
> #-------------------------------------------------------------------------------
> # Return if moving a disk, we will move them when moving the whole system_ds
> # directory for the VM
> #-------------------------------------------------------------------------------
> SRC_PATH=`arg_path $SRC`
> arg_path $SRC
> ++ arg_path dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/disk.0
> ++ dirname /srv/cloud/one/var//datastores/0/2994/disk.0/file
> ++ sed -r 's/\/+/\//g'
> + SRC_PATH=/srv/cloud/one/var/datastores/0/2994/disk.0
> DST_PATH=`arg_path $DST`
> arg_path $DST
> ++ arg_path dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/disk.0
> ++ dirname /srv/cloud/one/var//datastores/0/2994/disk.0/file
> ++ sed -r 's/\/+/\//g'
> + DST_PATH=/srv/cloud/one/var/datastores/0/2994/disk.0
>
> SRC_HOST=`arg_host $SRC`
> arg_host $SRC
> ++ arg_host dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + SRC_HOST=dellblade01
> DST_HOST=`arg_host $DST`
> arg_host $DST
> ++ arg_host dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + DST_HOST=dellblade03
>
> DST_DIR=`dirname $DST_PATH`
> dirname $DST_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994/disk.0
> + DST_DIR=/srv/cloud/one/var/datastores/0/2994
>
> SRC_DS_DIR=`dirname  $SRC_PATH`
> dirname  $SRC_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994/disk.0
> + SRC_DS_DIR=/srv/cloud/one/var/datastores/0/2994
> SRC_VM_DIR=`basename $SRC_PATH`
> basename $SRC_PATH
> ++ basename /srv/cloud/one/var/datastores/0/2994/disk.0
> + SRC_VM_DIR=disk.0
>
> if [ `is_disk $DST_PATH` -eq 1 ]; then
>     exit 0
> fi
> is_disk $DST_PATH
> ++ is_disk /srv/cloud/one/var/datastores/0/2994/disk.0
> ++ echo /srv/cloud/one/var/datastores/0/2994/disk.0
> ++ grep '/disk\.[0-9]\+'
> ++ '[' 0 -eq 0 ']'
> ++ echo 1
> + '[' 1 -eq 1 ']'
> + exit 0
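
The trace above can be read as three small helpers: one that splits off the
host, one that extracts and normalizes the path, and one that decides whether
the path is a disk (in which case mv exits early). The following is a rough
re-creation based only on the commands visible in the trace, not the code
shipped with OpenNebula; it assumes GNU sed and grep.

```shell
# Re-created from the xtrace output for illustration; the real helpers
# live in the TM driver scripts and may differ.

# host part of HOST:PATH
arg_host() {
    echo "$1" | sed -r 's/^([^:]*):.*$/\1/'
}

# collapse repeated slashes and drop any trailing slash
fix_dir_slashes() {
    dirname "$1/file" | sed -r 's/\/+/\//g'
}

# path part of HOST:PATH, normalized
arg_path() {
    fix_dir_slashes "$(echo "$1" | sed -r 's/^[^:]*:(.*)$/\1/')"
}

# prints 1 if the path contains a /disk.N component, 0 otherwise
is_disk() {
    if echo "$1" | grep -q '/disk\.[0-9]\+'; then
        echo 1
    else
        echo 0
    fi
}
```

With these definitions, is_disk returns 1 for .../2994/disk.0, which matches
the early "exit 0" seen in the trace.
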
>
> ************
>
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/2994/ \
>     dellblade03:/srv/cloud/one/var//datastores/0/2994/
>
> #-------------------------------------------------------------------------------
> # Return if moving a disk, we will move them when moving the whole system_ds
> # directory for the VM
> #-------------------------------------------------------------------------------
> SRC_PATH=`arg_path $SRC`
> arg_path $SRC
> ++ arg_path dellblade01:/srv/cloud/one/var//datastores/0/2994/
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/
> ++ dirname /srv/cloud/one/var//datastores/0/2994//file
> ++ sed -r 's/\/+/\//g'
> + SRC_PATH=/srv/cloud/one/var/datastores/0/2994
> DST_PATH=`arg_path $DST`
> arg_path $DST
> ++ arg_path dellblade03:/srv/cloud/one/var//datastores/0/2994/
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/
> ++ dirname /srv/cloud/one/var//datastores/0/2994//file
> ++ sed -r 's/\/+/\//g'
> + DST_PATH=/srv/cloud/one/var/datastores/0/2994
>
> SRC_HOST=`arg_host $SRC`
> arg_host $SRC
> ++ arg_host dellblade01:/srv/cloud/one/var//datastores/0/2994/
> ++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + SRC_HOST=dellblade01
> DST_HOST=`arg_host $DST`
> arg_host $DST
> ++ arg_host dellblade03:/srv/cloud/one/var//datastores/0/2994/
> ++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + DST_HOST=dellblade03
>
> DST_DIR=`dirname $DST_PATH`
> dirname $DST_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994
> + DST_DIR=/srv/cloud/one/var/datastores/0
>
> SRC_DS_DIR=`dirname  $SRC_PATH`
> dirname  $SRC_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994
> + SRC_DS_DIR=/srv/cloud/one/var/datastores/0
> SRC_VM_DIR=`basename $SRC_PATH`
> basename $SRC_PATH
> ++ basename /srv/cloud/one/var/datastores/0/2994
> + SRC_VM_DIR=2994
>
> if [ `is_disk $DST_PATH` -eq 1 ]; then
>     exit 0
> fi
> is_disk $DST_PATH
> ++ is_disk /srv/cloud/one/var/datastores/0/2994
> ++ echo /srv/cloud/one/var/datastores/0/2994
> ++ grep '/disk\.[0-9]\+'
> ++ '[' 1 -eq 0 ']'
> ++ echo 0
> + '[' 0 -eq 1 ']'
>
> if [ "$SRC" == "$DST" ]; then
>     log "Not moving $SRC to $DST, they are the same path"
>     exit 0
> fi
> + '[' dellblade01:/srv/cloud/one/var//datastores/0/2994/ ==
> dellblade03:/srv/cloud/one/var//datastores/0/2994/ ']'
>
> ssh_make_path "$DST_HOST" "$DST_DIR"
> + ssh_make_path dellblade03 /srv/cloud/one/var/datastores/0
> $SSH $1 sh -s 2>&1 1>/dev/null <<EOF
> if [ ! -d $2 ]; then
>    mkdir -p $2
> fi
> EOF
> ++ ssh dellblade03 sh -s
> + SSH_EXEC_ERR=
> + SSH_EXEC_RC=0
> + '[' 0 -ne 0 ']'
>
> log "Moving $SRC to $DST"
> + log 'Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to
> dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + log_info 'Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to
> dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + log_function INFO 'Moving
> dellblade01:/srv/cloud/one/var//datastores/0/2994/ to
> dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + echo 'INFO: mv: Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/
> to dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> INFO: mv: Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to
> dellblade03:/srv/cloud/one/var//datastores/0/2994/
>
> ssh_exec_and_log "$DST_HOST" "rm -rf '$DST_PATH'" \
>     "Error removing target path to prevent overwrite errors"
> + ssh_exec_and_log dellblade03 'rm -rf
> '\''/srv/cloud/one/var/datastores/0/2994'\''' 'Error removing target path to
> prevent overwrite errors'
> $SSH $1 sh -s 2>&1 1>/dev/null <<EOF
> $2
> EOF
> ++ ssh dellblade03 sh -s
> + SSH_EXEC_ERR=
> + SSH_EXEC_RC=0
> + '[' 0 -ne 0 ']'
>
> TAR_COPY="$SSH $SRC_HOST '$TAR -C $SRC_DS_DIR -cf - $SRC_VM_DIR'"
> + TAR_COPY='ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf -
> 2994'\'''
> TAR_COPY="$TAR_COPY | $SSH $DST_HOST '$TAR -C $DST_DIR -xf -'"
> + TAR_COPY='ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf -
> 2994'\'' | ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0 -xf
> -'\'''
>
> exec_and_log "eval $TAR_COPY" "Error copying disk directory to target host"
> + exec_and_log 'eval ssh dellblade01 '\''tar -C
> /srv/cloud/one/var/datastores/0 -cf - 2994'\'' | ssh dellblade03 '\''tar -C
> /srv/cloud/one/var/datastores/0 -xf -'\''' 'Error copying disk directory to
> target host'
> + message='Error copying disk directory to target host'
> $1 2>&1 1>/dev/null
> ++ eval ssh dellblade01 ''\''tar' -C /srv/cloud/one/var/datastores/0 -cf -
> '2994'\''' '|' ssh dellblade03 ''\''tar' -C /srv/cloud/one/var/datastores/0
> -xf '-'\'''
> + EXEC_LOG_ERR='ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0
> -cf - 2994'\'' | ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0
> -xf -'\''
> +++ ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf -
> 2994'\''
> +++ ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0 -xf -'\'''
> + EXEC_LOG_RC=0
> + '[' 0 -ne 0 ']'
>
> exec_and_log "$SSH $SRC_HOST rm -rf $SRC_PATH"
> + exec_and_log 'ssh dellblade01 rm -rf /srv/cloud/one/var/datastores/0/2994'
> + message=
> $1 2>&1 1>/dev/null
> ++ ssh dellblade01 rm -rf /srv/cloud/one/var/datastores/0/2994
> + EXEC_LOG_ERR=
> + EXEC_LOG_RC=0
> + '[' 0 -ne 0 ']'
>
> exit 0
> + exit 0
>
> The result:
>
> root@dellblade03:~# ls -l /srv/cloud/one/var/datastores/0/2994/
> total 1055880
> -rw-r--r-- 1 oneadmin oneadmin        653 2012-04-24 18:50 deployment.0
> -rw-r----- 1 oneadmin oneadmin 1081212928 2012-04-24 18:53 disk.0
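
The directory move in the trace boils down to four steps: create the parent
directory on the destination, remove any stale target, stream the VM directory
through a tar pipe, and remove the source. The sketch below reproduces that
sequence locally, with the ssh hops left out so that it can be tried on a
single machine. move_vm_dir is a hypothetical name, and it assumes, as in the
trace, that source and destination share the same final directory name.

```shell
# move_vm_dir SRC_PATH DST_PATH
# Local-only illustration of the sequence seen in the trace (no ssh).
# Assumes basename(SRC_PATH) == basename(DST_PATH), as in the trace.
move_vm_dir() {
    src_path="$1"
    dst_path="$2"
    src_ds_dir=$(dirname "$src_path")    # e.g. .../datastores/0
    src_vm_dir=$(basename "$src_path")   # e.g. 2994
    dst_dir=$(dirname "$dst_path")

    mkdir -p "$dst_dir" || return 1      # ssh_make_path step
    rm -rf "$dst_path" || return 1       # avoid overwrite errors
    # pack on the "source" side, unpack on the "destination" side
    tar -C "$src_ds_dir" -cf - "$src_vm_dir" | tar -C "$dst_dir" -xf - || return 1
    rm -rf "$src_path"                   # final cleanup of the source
}
```

When tar extracts as a non-root user, the extracted files belong to that user,
which is consistent with disk.0 showing up as oneadmin:oneadmin in the listing
above.
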
>
> I hope this helps.
>
> Regards,
> Carlos A.
>
>
> On 24/04/12 18:13, Jaime Melis wrote:
>
> Hi Carlos,
>
> Can you send us some extra debugging info?
>
> Create a new VM, exactly as you did in the previous email, and launch it.
>
> Assuming the VM has been deployed on dellblade01, send us the output of the
> following commands:
>
> # in dellblade01
> $ ls -Rl /srv/cloud/one/var//datastores/0/<VM_ID>
>
> # in the frontend
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/<VM_ID>/disk.0 \
>     dellblade03:/srv/cloud/one/var/datastores/0/<VM_ID>/disk.0
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/<VM_ID> \
>     dellblade03:/srv/cloud/one/var//datastores/0/<VM_ID>
>
> Thanks!
>
> Jaime
>
> On Tue, Apr 24, 2012 at 5:36 PM, Carlos A. <caralla at upv.es> wrote:
>>
>> Hi,
>>
>> I have also tried this option, but I ran into another problem.
>>
>> If I change the system datastore (0) to use the ssh TM_MAD, and then
>> create a new VM and try to migrate it, the vm.log fragment is as follows:
>>
>> ------------------------------------------------
>> Tue Apr 24 17:17:07 2012 [LCM][I]: New VM state is SAVE_MIGRATE
>> Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
>> Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute virtualization
>> driver operation: save.
>> Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
>> Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute network driver
>> operation: clean.
>> Tue Apr 24 17:17:11 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
>> Tue Apr 24 17:17:11 2012 [TM][I]: ExitCode: 0
>> Tue Apr 24 17:17:15 2012 [TM][I]: mv: Moving
>> dellblade01:/srv/cloud/one/var//datastores/0/2985 to
>> dellblade03:/srv/cloud/one/var//datastores/0/2985
>> Tue Apr 24 17:17:15 2012 [TM][I]: ExitCode: 0
>> Tue Apr 24 17:17:15 2012 [LCM][I]: New VM state is BOOT
>> Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 0
>> Tue Apr 24 17:17:16 2012 [VMM][I]: Successfully execute network driver
>> operation: pre.
>> Tue Apr 24 17:17:16 2012 [VMM][I]: Command execution fail:
>> /var/tmp/one/vmm/kvm/restore
>> /srv/cloud/one/var//datastores/0/2985/checkpoint dellblade03 2985
>> dellblade03
>> Tue Apr 24 17:17:16 2012 [VMM][E]: restore: Command "virsh --connect
>> qemu:///system restore /srv/cloud/one/var//datastores/0/2985/checkpoint"
>> failed: error: Failed to restore domain from
>> /srv/cloud/one/var//datastores/0/2985/checkpoint
>> Tue Apr 24 17:17:16 2012 [VMM][I]: error: cannot close file: Bad file
>> descriptor
>> Tue Apr 24 17:17:16 2012 [VMM][E]: Could not restore from
>> /srv/cloud/one/var//datastores/0/2985/checkpoint
>> Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 1
>> Tue Apr 24 17:17:16 2012 [VMM][I]: Failed to execute virtualization driver
>> operation: restore.
>> Tue Apr 24 17:17:16 2012 [VMM][E]: Error restoring VM: Could not restore
>> from /srv/cloud/one/var//datastores/0/2985/checkpoint
>> Tue Apr 24 17:17:16 2012 [DiM][I]: New VM state is FAILED
>> ------------------------------------------------
>>
>> and the transfer.1.migrate file is
>>
>> ------------------------------------------------
>> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985/disk.0
>> dellblade03:/srv/cloud/one/var//datastores/0/2985/disk.0
>> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985
>> dellblade03:/srv/cloud/one/var//datastores/0/2985
>> ------------------------------------------------
>>
>> Now I have the "checkpoint" file on dellblade03, but not disk.0. This is
>> strange because the transfer.1.migrate file specifically moves the disk.0
>> file but not the checkpoint file. I guess that the problem is the order of
>> the transfers in this case: I think that moving the folder after the disk
>> has been moved deletes the disk that was just moved. If I manually create
>> a disk.0 file, I am able to restore the VM manually using virsh commands.
>> Is there any way to solve this issue?
>>
>> On the other hand, I cannot see why the transfer MAD of the system
>> datastore needs to be set to ssh.
>>
>> Regards,
>> Carlos A.
>>
>> On 24/04/12 17:17, Ruben S. Montero wrote:
>>
>>> Hi,
>>>
>>> Yes, this may be the problem. Could you check the output of
>>> onedatastore show 0 (and 1)? The TM_MAD associated with the datastore
>>> should be ssh. If not, could you try to update it (onedatastore
>>> update)? There should not be any "shared" keyword, as you suggest.
>>>
>>> Note that the changes on the datastore (i.e. the actual TM used) are
>>> only reflected on new VMs. VMs created before the changes will use the
>>> original TM_MAD values...
>>>
>>> Cheers
>>>
>>> Ruben
>>>
>>> On Tue, Apr 24, 2012 at 5:07 PM, Carlos A.<caralla at upv.es>  wrote:
>>>>
>>>> Hello,
>>>>
>>>> I am upgrading my ONE 3.2 deployment to ONE 3.4, but I have a problem with
>>>> the migration of VMs between nodes (not live migration).
>>>>
>>>> With ONE 3.2 migration worked fine, but now it fails and I cannot find
>>>> how to solve the problem.
>>>>
>>>> I have the default datastore, which is a "filesystem" datastore using the
>>>> ssh tm_mad (and the system datastore "0").
>>>>
>>>> When I migrate the VM, I get the following vm.log fragment:
>>>> ------------------------------------------------
>>>> Tue Apr 24 16:53:51 2012 [LCM][I]: New VM state is SAVE_MIGRATE
>>>> Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>>> Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute virtualization
>>>> driver operation: save.
>>>> Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>>> Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute network driver
>>>> operation: clean.
>>>> Tue Apr 24 16:53:54 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
>>>> Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>>> Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>>> Tue Apr 24 16:53:55 2012 [LCM][I]: New VM state is BOOT
>>>> Tue Apr 24 16:53:55 2012 [VMM][I]: ExitCode: 0
>>>> Tue Apr 24 16:53:55 2012 [VMM][I]: Successfully execute network driver
>>>> operation: pre.
>>>> Tue Apr 24 16:53:55 2012 [VMM][I]: Command execution fail:
>>>> /var/tmp/one/vmm/kvm/restore
>>>> /srv/cloud/one/var//datastores/0/2984/checkpoint dellblade03 2984
>>>> dellblade03
>>>> ------------------------------------------------
>>>>
>>>> And the following transfer.1.migrate file appears:
>>>> ------------------------------------------------
>>>> MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2984/disk.0
>>>> dellblade03:/srv/cloud/one/var//datastores/0/2984/disk.0
>>>> MV shared dellblade01:/srv/cloud/one/var//datastores/0/2984
>>>> dellblade03:/srv/cloud/one/var//datastores/0/2984
>>>> ------------------------------------------------
>>>>
>>>> The problem is that the disk.0 file is not transferred to dellblade03. It
>>>> seems that the file transfer phase is skipped.
>>>>
>>>> Moreover, the "shared" keyword appears although there is no shared file
>>>> system (apart from the system one, which should not be considered when
>>>> moving from one host to another). The checkpoint file is not moved
>>>> either.
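
To spot a mismatched driver such as the "shared" entry above without reading
the whole file, one can print the driver named on each MV line. This is just a
small sketch that relies on the layout shown in the transfer.1.migrate excerpt
(MV, then the driver, then source and destination); tm_drivers is a
hypothetical helper name.

```shell
# tm_drivers FILE
# Print the TM driver (second field) of every MV line in a transfer file.
# Hypothetical helper; assumes the "MV <driver> <src> <dst>" layout
# seen in the excerpt.
tm_drivers() {
    awk '$1 == "MV" { print $2 }' "$1"
}
```

Run against the file quoted above, this prints ssh for the disk.0 line and
shared for the directory line.
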
>>>>
>>>> Note: migration from one host to the same host works (as expected), so
>>>> virsh is able to restore the state of a saved VM.
>>>>
>>>> Any idea on this?
>>>>
>>>> Thank you in advance.
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at lists.opennebula.org
>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>>
>>
>
>
>
>
> --
> Jaime Melis
> Project Engineer
> OpenNebula - The Open Source Toolkit for Cloud Computing
> www.OpenNebula.org | jmelis at opennebula.org
>
>
>
>
>
>
>



-- 
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula


