[one-users] migration not working completely

Jaime Melis j.melis at fdi.ucm.es
Mon Jul 26 08:45:15 PDT 2010


Hi Ross,

Actually, in my experience disabling AppArmor won't work either. You will
have to modify one of its configuration files to make it work.

Add this line to the end of /etc/apparmor.d/abstractions/libvirt-qemu:
-------8<--------
  /srv/cloud/one/var/** rw,
------->8--------
(If you have a different VM_DIR, change the path in the line above accordingly.)
Then restart the apparmor service.
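
As a minimal sketch, the edit above could be scripted with a small shell
helper. The function name add_apparmor_rule is made up for illustration, and
the path in the usage note assumes the default VM_DIR discussed in this thread.
It appends the rule only if that exact line is not already in the file, so
running it twice does not duplicate the rule.

```shell
#!/bin/sh
# Hypothetical helper: append RULE to FILE unless that exact line already exists.
# Usage: add_apparmor_rule FILE RULE
add_apparmor_rule() {
    profile=$1
    rule=$2

    # -x matches whole lines only, -F treats the rule as a fixed string
    # (so the ** in the path is not interpreted as a pattern).
    if grep -qxF -- "$rule" "$profile"; then
        return 0
    fi

    # If the file does not end with a newline, add one first so the
    # new rule starts on its own line.
    if [ -n "$(tail -c 1 "$profile")" ]; then
        printf '\n' >> "$profile"
    fi

    printf '%s\n' "$rule" >> "$profile"
}
```

Run as root, a call would look like
add_apparmor_rule /etc/apparmor.d/abstractions/libvirt-qemu '  /srv/cloud/one/var/** rw,'
followed by a restart of the apparmor service (on Ubuntu this is typically
/etc/init.d/apparmor restart, but check your own system).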

Regards,
Jaime




On Mon, Jul 26, 2010 at 5:30 PM, Tino Vazquez <tinova at fdi.ucm.es> wrote:

> Hi Ross,
>
> Are you using Ubuntu, by any chance? It may be an issue with the apparmor
> service; try disabling it to see if that is the one to blame. If
> it is, we can provide rules to disable this apparmor behavior.
>
> Regards,
>
> -Tino
>
> --
> Constantino Vázquez Blanco | dsa-research.org/tinova
> Virtualization Technology Engineer / Researcher
> OpenNebula Toolkit | opennebula.org
>
>
>
> On Mon, Jul 26, 2010 at 5:13 PM, Ross Nordeen <rjnordee at mtu.edu> wrote:
> > Tino,
> >
> > I figured out my live migrate problem, which turned out to be a bad
> > default gw.  As far as the migration and checkpointing go, though, I have
> > the /srv/cloud/one directory shared out to all nodes via NFS, with full
> > permissions for oneadmin... I think it is /srv/cloud/one/var/18.  I will
> > check the VM_DIR variable in the oned.conf file, though, and see if it is
> > right.  Still, if everything else is working, it seems like the VM_DIR is
> > exported correctly and functioning for the running VMs.
> >
> > -Ross
> >
> > ----- Original Message -----
> > From: "Tino Vazquez" <tinova at fdi.ucm.es>
> > To: "Ross Nordeen" <rjnordee at mtu.edu>
> > Cc: users at lists.opennebula.org
> > Sent: Monday, July 26, 2010 8:41:37 AM GMT -07:00 US/Canada Mountain
> > Subject: Re: [one-users] migration not working completely
> >
> > Hi Ross,
> >
> > There seem to be two issues here:
> >
> > 1) No live or cold migration from cn2 to cn1 --> could it be that the
> > oneadmin user cannot ssh without a password from cn2 to cn1, but can
> > from cn1 to cn2?
> >
> > 2) The save problem seems to come from a failure to write the
> > checkpoint file. This may be because the /srv/cloud/one
> > directory doesn't exist on the remote nodes, in which case you will
> > need to set the VM_DIR variable in the oned.conf file.
> >
> > Hope it helps,
> >
> > -Tino
> >
> > --
> > Constantino Vázquez Blanco | dsa-research.org/tinova
> > Virtualization Technology Engineer / Researcher
> > OpenNebula Toolkit | opennebula.org
> >
> >
> >
> > On Thu, Jul 22, 2010 at 11:39 PM, Ross Nordeen <rjnordee at mtu.edu> wrote:
> >> I have OpenNebula deployed with one head node and two compute nodes.  I
> >> have no problems live migrating from cn1 to cn2, but I get failures
> >> live/cold migrating from cn2 to cn1.  Is there any reason why a) I would
> >> not be able to save the state of any of my machines and b) live migration
> >> works one way but not the other?  Thanks
> >>
> >> -Ross
> >>
> >>
> >> Here is my vm.log file after a live migration, a migration, and then a suspend:
> >>
> >>
> >> Thu Jul 22 11:40:22 2010 [LCM][I]: New VM state is MIGRATE
> >> Thu Jul 22 11:40:22 2010 [VMM][I]: Command execution fail: virsh --connect qemu:///system migrate --live one-18 qemu+ssh://cn1/session
> >> Thu Jul 22 11:40:22 2010 [VMM][I]: STDERR follows.
> >> Thu Jul 22 11:40:22 2010 [VMM][I]: Warning: Permanently added 'cn2,192.168.1.105' (RSA) to the list of known hosts.
> >> Thu Jul 22 11:40:22 2010 [VMM][I]: error: cannot recv data: Connection reset by peer
> >> Thu Jul 22 11:40:22 2010 [VMM][I]: ExitCode: 1
> >> Thu Jul 22 11:40:22 2010 [VMM][E]: Error live-migrating VM, -
> >> Thu Jul 22 11:40:23 2010 [LCM][I]: Fail to life migrate VM. Assuming that the VM is still RUNNING (will poll VM).
> >> Thu Jul 22 11:40:23 2010 [VMM][D]: Monitor Information:
> >> .
> >> .
> >> .
> >> .
> >> .
> >> Thu Jul 22 15:09:04 2010 [LCM][I]: New VM state is MIGRATE
> >> Thu Jul 22 15:09:04 2010 [VMM][I]: Command execution fail: virsh --connect qemu:///system migrate --live one-18 qemu+ssh://cn1/session
> >> Thu Jul 22 15:09:04 2010 [VMM][I]: STDERR follows.
> >> Thu Jul 22 15:09:04 2010 [VMM][I]: Warning: Permanently added 'cn2,192.168.1.105' (RSA) to the list of known hosts.
> >> Thu Jul 22 15:09:04 2010 [VMM][I]: error: cannot recv data: Connection reset by peer
> >> Thu Jul 22 15:09:04 2010 [VMM][I]: ExitCode: 1
> >> Thu Jul 22 15:09:04 2010 [VMM][E]: Error live-migrating VM, -
> >> Thu Jul 22 15:09:05 2010 [LCM][I]: Fail to life migrate VM. Assuming that the VM is still RUNNING (will poll VM).
> >> Thu Jul 22 15:09:05 2010 [VMM][D]: Monitor Information:
> >> .
> >> .
> >> .
> >> .
> >> .
> >> Thu Jul 22 15:11:25 2010 [LCM][I]: New VM state is SAVE_MIGRATE
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: Command execution fail: 'touch /srv/cloud/one/var//18/images/checkpoint;virsh --connect qemu:///system save one-18 /srv/cloud/one/var//18/images/checkpoint'
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: STDERR follows.
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: Warning: Permanently added 'cn2,192.168.1.105' (RSA) to the list of known hosts.
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: error: Failed to save domain one-18 to /srv/cloud/one/var//18/images/checkpoint
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: error: operation failed: failed to create '/srv/cloud/one/var//18/images/checkpoint'
> >> Thu Jul 22 15:11:25 2010 [VMM][I]: ExitCode: 1
> >> Thu Jul 22 15:11:25 2010 [VMM][E]: Error saving VM state, -
> >> Thu Jul 22 15:11:25 2010 [LCM][I]: Fail to save VM state while migrating. Assuming that the VM is still RUNNING (will poll VM).
> >> Thu Jul 22 15:11:26 2010 [VMM][I]: VM running but new state from monitor is PAUSED.
> >> Thu Jul 22 15:11:26 2010 [LCM][I]: VM is suspended.
> >> Thu Jul 22 15:11:26 2010 [DiM][I]: New VM state is SUSPENDED
> >> Thu Jul 22 15:13:20 2010 [DiM][I]: New VM state is ACTIVE.
> >> Thu Jul 22 15:13:20 2010 [LCM][I]: Restoring VM
> >> Thu Jul 22 15:13:20 2010 [LCM][I]: New state is BOOT
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: Command execution fail: virsh --connect qemu:///system restore /srv/cloud/one/var//18/images/checkpoint
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: STDERR follows.
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: Warning: Permanently added 'cn2,192.168.1.105' (RSA) to the list of known hosts.
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: error: Failed to restore domain from /srv/cloud/one/var//18/images/checkpoint
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: error: operation failed: cannot read domain image
> >> Thu Jul 22 15:13:21 2010 [VMM][I]: ExitCode: 1
> >> Thu Jul 22 15:13:21 2010 [VMM][E]: Error restoring VM, -
> >> Thu Jul 22 15:13:21 2010 [DiM][I]: New VM state is FAILED
> >> Thu Jul 22 15:13:21 2010 [TM][W]: Ignored: LOG - 18 tm_delete.sh: Deleting /srv/cloud/one/var//18/images
> >>
> >> Thu Jul 22 15:13:21 2010 [TM][W]: Ignored: LOG - 18 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var//18/images".
> >>
> >> Thu Jul 22 15:13:21 2010 [TM][W]: Ignored: TRANSFER SUCCESS 18 -
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at lists.opennebula.org
> >> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
> >>
> >