[one-users] cannot migrate virtual machines in one 3.4

Carlos A. caralla at upv.es
Wed Apr 25 01:17:43 PDT 2012


Hello,

I have finally managed to solve the problem.

It was a permissions problem involving libvirt. I had to set oneadmin
as the user that runs kvm, and to disable dynamic ownership. With
dynamic ownership enabled, the owner of disk.0 was changed to root
whenever a VM was saved, and it was only set back to oneadmin once the
VM was restored. While root owned the file, oneadmin had no permission
to move it. Disabling dynamic ownership solves this issue, because the
files stay owned by oneadmin (the same user that is used to run the VMs).
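
A minimal sketch for checking this setup follows. It assumes the
settings live in libvirt's qemu.conf (commonly /etc/libvirt/qemu.conf)
under the keys user, group and dynamic_ownership; the file and keys are
not named in this message, and the helper names conf_value and
not_owned_by are hypothetical, not part of OpenNebula or libvirt.

```shell
#!/bin/sh
# Sketch only: the helper names and the qemu.conf location are
# assumptions, not something taken from the OpenNebula drivers.

# Print the last uncommented value assigned to KEY in a
# qemu.conf-style file (surrounding double quotes are stripped).
# Usage: conf_value KEY FILE
conf_value() {
    grep -E "^[[:space:]]*$1[[:space:]]*=" "$2" | tail -n 1 |
        sed -E 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//; s/^"//; s/"$//'
}

# List every file under DIR that is NOT owned by USER.
# No output means that everything is owned by USER.
# Usage: not_owned_by USER DIR
not_owned_by() {
    find "$2" ! -user "$1"
}
```

With the change described above, "conf_value user /etc/libvirt/qemu.conf"
would be expected to print oneadmin and "conf_value dynamic_ownership
/etc/libvirt/qemu.conf" to print 0, while "not_owned_by oneadmin
/srv/cloud/one/var/datastores/0/2994" should print nothing, even after
a VM has been saved.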

What I still cannot explain is why migration worked properly in the
previous installation of ONE.

Regards,
Carlos A.

On 25/04/12 08:38, Carlos A. wrote:
> Hi,
>
> $ ls -Rl /srv/cloud/one/var/datastores/0/2994
> /srv/cloud/one/var/datastores/0/2994:
> total 1055880
> -rw-r--r-- 1 oneadmin     oneadmin        653 2012-04-24 18:47 deployment.0
> -rw-r----- 1 libvirt-qemu kvm      1081212928 2012-04-24 18:47 disk.0
>
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0 \
>     dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
>
> #-------------------------------------------------------------------------------
> # Return if moving a disk, we will move them when moving the whole system_ds
> # directory for the VM
> #-------------------------------------------------------------------------------
> SRC_PATH=`arg_path $SRC`
> arg_path $SRC
> ++ arg_path dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/disk.0
> ++ dirname /srv/cloud/one/var//datastores/0/2994/disk.0/file
> ++ sed -r 's/\/+/\//g'
> + SRC_PATH=/srv/cloud/one/var/datastores/0/2994/disk.0
> DST_PATH=`arg_path $DST`
> arg_path $DST
> ++ arg_path dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/disk.0
> ++ dirname /srv/cloud/one/var//datastores/0/2994/disk.0/file
> ++ sed -r 's/\/+/\//g'
> + DST_PATH=/srv/cloud/one/var/datastores/0/2994/disk.0
>
> SRC_HOST=`arg_host $SRC`
> arg_host $SRC
> ++ arg_host dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + SRC_HOST=dellblade01
> DST_HOST=`arg_host $DST`
> arg_host $DST
> ++ arg_host dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/disk.0
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + DST_HOST=dellblade03
>
> DST_DIR=`dirname $DST_PATH`
> dirname $DST_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994/disk.0
> + DST_DIR=/srv/cloud/one/var/datastores/0/2994
>
> SRC_DS_DIR=`dirname  $SRC_PATH`
> dirname  $SRC_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994/disk.0
> + SRC_DS_DIR=/srv/cloud/one/var/datastores/0/2994
> SRC_VM_DIR=`basename $SRC_PATH`
> basename $SRC_PATH
> ++ basename /srv/cloud/one/var/datastores/0/2994/disk.0
> + SRC_VM_DIR=disk.0
>
> if [ `is_disk $DST_PATH` -eq 1 ]; then
>     exit 0
> fi
> is_disk $DST_PATH
> ++ is_disk /srv/cloud/one/var/datastores/0/2994/disk.0
> ++ echo /srv/cloud/one/var/datastores/0/2994/disk.0
> ++ grep '/disk\.[0-9]\+'
> ++ '[' 0 -eq 0 ']'
> ++ echo 1
> + '[' 1 -eq 1 ']'
> + exit 0
>
> ************
>
> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>     dellblade01:/srv/cloud/one/var//datastores/0/2994/ \
>     dellblade03:/srv/cloud/one/var//datastores/0/2994/
>
> #-------------------------------------------------------------------------------
> # Return if moving a disk, we will move them when moving the whole system_ds
> # directory for the VM
> #-------------------------------------------------------------------------------
> SRC_PATH=`arg_path $SRC`
> arg_path $SRC
> ++ arg_path dellblade01:/srv/cloud/one/var//datastores/0/2994/
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/
> ++ dirname /srv/cloud/one/var//datastores/0/2994//file
> ++ sed -r 's/\/+/\//g'
> + SRC_PATH=/srv/cloud/one/var/datastores/0/2994
> DST_PATH=`arg_path $DST`
> arg_path $DST
> ++ arg_path dellblade03:/srv/cloud/one/var//datastores/0/2994/
> echo $1 | $SED 's/^[^:]*:(.*)$/\1/'
> +++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/
> +++ sed -r 's/^[^:]*:(.*)$/\1/'
> ++ ARG_PATH=/srv/cloud/one/var//datastores/0/2994/
> ++ fix_dir_slashes /srv/cloud/one/var//datastores/0/2994/
> ++ dirname /srv/cloud/one/var//datastores/0/2994//file
> ++ sed -r 's/\/+/\//g'
> + DST_PATH=/srv/cloud/one/var/datastores/0/2994
>
> SRC_HOST=`arg_host $SRC`
> arg_host $SRC
> ++ arg_host dellblade01:/srv/cloud/one/var//datastores/0/2994/
> ++ echo dellblade01:/srv/cloud/one/var//datastores/0/2994/
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + SRC_HOST=dellblade01
> DST_HOST=`arg_host $DST`
> arg_host $DST
> ++ arg_host dellblade03:/srv/cloud/one/var//datastores/0/2994/
> ++ echo dellblade03:/srv/cloud/one/var//datastores/0/2994/
> ++ sed -r 's/^([^:]*):.*$/\1/'
> + DST_HOST=dellblade03
>
> DST_DIR=`dirname $DST_PATH`
> dirname $DST_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994
> + DST_DIR=/srv/cloud/one/var/datastores/0
>
> SRC_DS_DIR=`dirname  $SRC_PATH`
> dirname  $SRC_PATH
> ++ dirname /srv/cloud/one/var/datastores/0/2994
> + SRC_DS_DIR=/srv/cloud/one/var/datastores/0
> SRC_VM_DIR=`basename $SRC_PATH`
> basename $SRC_PATH
> ++ basename /srv/cloud/one/var/datastores/0/2994
> + SRC_VM_DIR=2994
>
> if [ `is_disk $DST_PATH` -eq 1 ]; then
>     exit 0
> fi
> is_disk $DST_PATH
> ++ is_disk /srv/cloud/one/var/datastores/0/2994
> ++ echo /srv/cloud/one/var/datastores/0/2994
> ++ grep '/disk\.[0-9]\+'
> ++ '[' 1 -eq 0 ']'
> ++ echo 0
> + '[' 0 -eq 1 ']'
>
> if [ "$SRC" == "$DST" ]; then
>     log "Not moving $SRC to $DST, they are the same path"
>     exit 0
> fi
> + '[' dellblade01:/srv/cloud/one/var//datastores/0/2994/ == dellblade03:/srv/cloud/one/var//datastores/0/2994/ ']'
>
> ssh_make_path "$DST_HOST" "$DST_DIR"
> + ssh_make_path dellblade03 /srv/cloud/one/var/datastores/0
> $SSH $1 sh -s 2>&1 1>/dev/null <<EOF
> if [ ! -d $2 ]; then
>    mkdir -p $2
> fi
> EOF
> ++ ssh dellblade03 sh -s
> + SSH_EXEC_ERR=
> + SSH_EXEC_RC=0
> + '[' 0 -ne 0 ']'
>
> log "Moving $SRC to $DST"
> + log 'Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + log_info 'Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + log_function INFO 'Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> + echo 'INFO: mv: Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to dellblade03:/srv/cloud/one/var//datastores/0/2994/'
> INFO: mv: Moving dellblade01:/srv/cloud/one/var//datastores/0/2994/ to dellblade03:/srv/cloud/one/var//datastores/0/2994/
>
> ssh_exec_and_log "$DST_HOST" "rm -rf '$DST_PATH'" \
>     "Error removing target path to prevent overwrite errors"
> + ssh_exec_and_log dellblade03 'rm -rf '\''/srv/cloud/one/var/datastores/0/2994'\''' 'Error removing target path to prevent overwrite errors'
> $SSH $1 sh -s 2>&1 1>/dev/null <<EOF
> $2
> EOF
> ++ ssh dellblade03 sh -s
> + SSH_EXEC_ERR=
> + SSH_EXEC_RC=0
> + '[' 0 -ne 0 ']'
>
> TAR_COPY="$SSH $SRC_HOST '$TAR -C $SRC_DS_DIR -cf - $SRC_VM_DIR'"
> + TAR_COPY='ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf - 2994'\'''
> TAR_COPY="$TAR_COPY | $SSH $DST_HOST '$TAR -C $DST_DIR -xf -'"
> + TAR_COPY='ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf - 2994'\'' | ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0 -xf -'\'''
>
> exec_and_log "eval $TAR_COPY" "Error copying disk directory to target host"
> + exec_and_log 'eval ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf - 2994'\'' | ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0 -xf -'\''' 'Error copying disk directory to target host'
> + message='Error copying disk directory to target host'
> $1 2>&1 1>/dev/null
> ++ eval ssh dellblade01 ''\''tar' -C /srv/cloud/one/var/datastores/0 
> -cf - '2994'\''' '|' ssh dellblade03 ''\''tar' -C 
> /srv/cloud/one/var/datastores/0 -xf '-'\'''
> + EXEC_LOG_ERR='ssh dellblade01 '\''tar -C 
> /srv/cloud/one/var/datastores/0 -cf - 2994'\'' | ssh dellblade03 
> '\''tar -C /srv/cloud/one/var/datastores/0 -xf -'\''
> +++ ssh dellblade01 '\''tar -C /srv/cloud/one/var/datastores/0 -cf - 
> 2994'\''
> +++ ssh dellblade03 '\''tar -C /srv/cloud/one/var/datastores/0 -xf -'\'''
> + EXEC_LOG_RC=0
> + '[' 0 -ne 0 ']'
>
> exec_and_log "$SSH $SRC_HOST rm -rf $SRC_PATH"
> + exec_and_log 'ssh dellblade01 rm -rf /srv/cloud/one/var/datastores/0/2994'
> + message=
> $1 2>&1 1>/dev/null
> ++ ssh dellblade01 rm -rf /srv/cloud/one/var/datastores/0/2994
> + EXEC_LOG_ERR=
> + EXEC_LOG_RC=0
> + '[' 0 -ne 0 ']'
>
> exit 0
> + exit 0
>
> The result:
>
> root@dellblade03:~# ls -l /srv/cloud/one/var/datastores/0/2994/
> total 1055880
> -rw-r--r-- 1 oneadmin oneadmin        653 2012-04-24 18:50 deployment.0
> -rw-r----- 1 oneadmin oneadmin 1081212928 2012-04-24 18:53 disk.0
>
> I hope this helps.
>
> Regards,
> Carlos A.
>
>
> On 24/04/12 18:13, Jaime Melis wrote:
>> Hi Carlos,
>>
>> can you send us some extra debugging info?
>>
>> Create a new VM, exactly as you did in the previous email, and
>> launch it.
>>
>> Assuming the VM has been deployed on dellblade01, send us the output
>> of the following commands:
>>
>> # in dellblade01
>> $ ls -Rl /srv/cloud/one/var//datastores/0/<VM_ID>
>>
>> # in the frontend
>> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>>     dellblade01:/srv/cloud/one/var//datastores/0/<VM_ID>/disk.0 \
>>     dellblade03:/srv/cloud/one/var/datastores/0/<VM_ID>/disk.0
>> $ bash -xv /srv/cloud/one/var/remotes/tm/ssh/mv \
>>     dellblade01:/srv/cloud/one/var//datastores/0/<VM_ID> \
>>     dellblade03:/srv/cloud/one/var//datastores/0/<VM_ID>
>>
>> Thanks!
>>
>> Jaime
>>
>> On Tue, Apr 24, 2012 at 5:36 PM, Carlos A. <caralla at upv.es> wrote:
>>
>>     Hi,
>>
>>     I have also checked this option, but I also found a problem.
>>
>>     If I change the system datastore (0) to set the TM_MAD to ssh, and
>>     then create a new VM and try to migrate it, the vm.log
>>     fragment is as follows:
>>
>>     ------------------------------------------------
>>     Tue Apr 24 17:17:07 2012 [LCM][I]: New VM state is SAVE_MIGRATE
>>     Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
>>     Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute
>>     virtualization driver operation: save.
>>     Tue Apr 24 17:17:10 2012 [VMM][I]: ExitCode: 0
>>     Tue Apr 24 17:17:10 2012 [VMM][I]: Successfully execute network
>>     driver operation: clean.
>>     Tue Apr 24 17:17:11 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
>>     Tue Apr 24 17:17:11 2012 [TM][I]: ExitCode: 0
>>     Tue Apr 24 17:17:15 2012 [TM][I]: mv: Moving
>>     dellblade01:/srv/cloud/one/var//datastores/0/2985 to
>>     dellblade03:/srv/cloud/one/var//datastores/0/2985
>>     Tue Apr 24 17:17:15 2012 [TM][I]: ExitCode: 0
>>     Tue Apr 24 17:17:15 2012 [LCM][I]: New VM state is BOOT
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 0
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: Successfully execute network
>>     driver operation: pre.
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: Command execution fail:
>>     /var/tmp/one/vmm/kvm/restore
>>     /srv/cloud/one/var//datastores/0/2985/checkpoint dellblade03 2985
>>     dellblade03
>>     Tue Apr 24 17:17:16 2012 [VMM][E]: restore: Command "virsh
>>     --connect qemu:///system restore
>>     /srv/cloud/one/var//datastores/0/2985/checkpoint" failed: error:
>>     Failed to restore domain from
>>     /srv/cloud/one/var//datastores/0/2985/checkpoint
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: error: cannot close file: Bad
>>     file descriptor
>>     Tue Apr 24 17:17:16 2012 [VMM][E]: Could not restore from
>>     /srv/cloud/one/var//datastores/0/2985/checkpoint
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: ExitCode: 1
>>     Tue Apr 24 17:17:16 2012 [VMM][I]: Failed to execute
>>     virtualization driver operation: restore.
>>     Tue Apr 24 17:17:16 2012 [VMM][E]: Error restoring VM: Could not
>>     restore from /srv/cloud/one/var//datastores/0/2985/checkpoint
>>     Tue Apr 24 17:17:16 2012 [DiM][I]: New VM state is FAILED
>>     ------------------------------------------------
>>
>>     and the transfer.1.migrate file is
>>
>>     ------------------------------------------------
>>     MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985/disk.0
>>     dellblade03:/srv/cloud/one/var//datastores/0/2985/disk.0
>>     MV ssh dellblade01:/srv/cloud/one/var//datastores/0/2985
>>     dellblade03:/srv/cloud/one/var//datastores/0/2985
>>     ------------------------------------------------
>>
>>     Now I have the "checkpoint" file in dellblade03, but not
>>     disk.0. This is strange, because the transfer.1.migrate file
>>     explicitly tries to move the disk.0 file but not the checkpoint
>>     file. I guess that the problem is the order of the transfers in
>>     this case: I think that moving the folder after the disk has been
>>     moved deletes the disk that was just moved. If I manually create a
>>     disk.0 file, I am able to restore the VM manually using virsh
>>     commands. Is there any way to solve this issue?
>>
>>     On the other hand, I cannot see why the system datastore transfer
>>     mad needs to be set to ssh.
>>
>>     Regards,
>>     Carlos A.
>>
>>     On 24/04/12 17:17, Ruben S. Montero wrote:
>>
>>         Hi,
>>
>>         Yes, this may be the problem. Could you check the output of
>>         onedatastore show 0 (and 1)? The TM_MAD associated with the
>>         datastore should be ssh. If not, could you try to update it
>>         (onedatastore update)? The "shared" keyword you point out
>>         should not appear.
>>
>>         Note that changes to the datastore (i.e. the actual TM used)
>>         are only reflected in new VMs. VMs created before the changes
>>         will use the original TM_MAD values...
>>
>>         Cheers
>>
>>         Ruben
>>
>>         On Tue, Apr 24, 2012 at 5:07 PM, Carlos A. <caralla at upv.es> wrote:
>>
>>             Hello,
>>
>>             I am upgrading my ONE 3.2 deployment to ONE 3.4, but I
>>             have a problem with the migration of VMs between nodes
>>             (not live migration).
>>
>>             With ONE 3.2 migration worked fine, but now migration
>>             fails and I cannot find how to solve the problem.
>>
>>             I have the default datastore, which is a "filesystem"
>>             datastore using the ssh tm_mad (and the system
>>             datastore "0").
>>
>>             When I migrate the VM, I find the following vm.log fragment:
>>             ------------------------------------------------
>>             Tue Apr 24 16:53:51 2012 [LCM][I]: New VM state is
>>             SAVE_MIGRATE
>>             Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>             Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute
>>             virtualization
>>             driver operation: save.
>>             Tue Apr 24 16:53:54 2012 [VMM][I]: ExitCode: 0
>>             Tue Apr 24 16:53:54 2012 [VMM][I]: Successfully execute
>>             network driver
>>             operation: clean.
>>             Tue Apr 24 16:53:54 2012 [LCM][I]: New VM state is
>>             PROLOG_MIGRATE
>>             Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>             Tue Apr 24 16:53:55 2012 [TM][I]: ExitCode: 0
>>             Tue Apr 24 16:53:55 2012 [LCM][I]: New VM state is BOOT
>>             Tue Apr 24 16:53:55 2012 [VMM][I]: ExitCode: 0
>>             Tue Apr 24 16:53:55 2012 [VMM][I]: Successfully execute
>>             network driver
>>             operation: pre.
>>             Tue Apr 24 16:53:55 2012 [VMM][I]: Command execution fail:
>>             /var/tmp/one/vmm/kvm/restore
>>             /srv/cloud/one/var//datastores/0/2984/checkpoint
>>             dellblade03 2984
>>             dellblade03
>>             ------------------------------------------------
>>
>>             And the following transfer.1.migrate file appears:
>>             ------------------------------------------------
>>             MV ssh
>>             dellblade01:/srv/cloud/one/var//datastores/0/2984/disk.0
>>             dellblade03:/srv/cloud/one/var//datastores/0/2984/disk.0
>>             MV shared dellblade01:/srv/cloud/one/var//datastores/0/2984
>>             dellblade03:/srv/cloud/one/var//datastores/0/2984
>>             ------------------------------------------------
>>
>>             The problem is that the disk.0 file is not transferred to
>>             dellblade03. It seems that the file transfer phase is
>>             skipped.
>>
>>             Moreover, the "shared" keyword appears although there is
>>             no shared file system (except the system one, which should
>>             not be considered when moving from one host to another).
>>             Also, the checkpoint file is not moved.
>>
>>             Note: migration from one host to the same host works (as
>>             expected). So virsh
>>             is able to restore the state of a saved VM.
>>
>>             Any idea on this?
>>
>>             Thank you in advance.
>>
>>             _______________________________________________
>>             Users mailing list
>>             Users at lists.opennebula.org
>>             <mailto:Users at lists.opennebula.org>
>>             http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>>
>>
>> -- 
>> Jaime Melis
>> Project Engineer
>> OpenNebula - The Open Source Toolkit for Cloud Computing
>> www.OpenNebula.org | jmelis at opennebula.org
