[one-users] Fwd: how running vms moved(not recreate) on another host on host error

Dmitri Chebotarov dchebota at gmu.edu
Thu Dec 5 06:45:15 PST 2013


Hi,

Did you ever figure out how to “move” a VM when the VM host goes down?
I ran into the same issue last night.
RHEL6 cluster, same OS/KVM version. One of the VM hosts went down (error) and the VMs running on that host were recreated on the available hosts.
The recreated VMs lost their work in progress.

The RHEL6 cluster uses shared NFS storage for the system and “data” datastores.
Once the host died, ONE attempted to connect to the dead host to access the system datastore, which is already mounted on the ONE controller under the same path (log below).
This is how the system datastore is configured:

TYPE: SYSTEM_DS
DISK_TYPE: file
TM_MAD: shared.

It’s mounted on all cluster nodes and on the ONED controller under the same path (/var/lib/one).
I’m probably missing something in the system datastore configuration that would tell ONED to access it locally, not via the dead VM host…
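
As a sanity check on the "same path on every node" setup described above, here is a minimal sketch that reports which mount (and filesystem type) backs a given path. It assumes Linux-style /proc/mounts text; the function name and the sample mount entries (including the NFS server name) are made up for illustration and are not taken from this cluster.

```python
import os


def fstype_for(path, mounts_text):
    """Return (mount_point, fstype) for the longest mount point containing path.

    mounts_text is expected to look like /proc/mounts:
    "<device> <mount_point> <fstype> <options> <dump> <pass>" per line.
    Returns None if no mount point matches.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        # "/" becomes prefix "/", "/var/lib/one" becomes "/var/lib/one/"
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype)
    return best


# Hypothetical /proc/mounts content, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
nfs-server:/export/one /var/lib/one nfs rw,hard 0 0
"""

if __name__ == "__main__":
    print(fstype_for("/var/lib/one/datastores/111", SAMPLE_MOUNTS))
```

On a real node the text could be read from /proc/mounts and the result compared across the controller and each VM host. This only confirms the mount layout; it does not change how ONED decides where to run the delete.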

Shouldn’t ONED start the VMs on an available host using the existing config/disk files in the system datastore,
rather than deleting and recreating them?

Thank you.

Thu Dec 5 04:49:29 2013 [VMM][I]: Command execution fail: /var/tmp/one/vnm/ovswitch/clean
Thu Dec 5 04:49:29 2013 [VMM][I]: ssh: connect to host BC4-10 port 22: No route to host
Thu Dec 5 04:49:29 2013 [VMM][I]: ExitSSHCode: 255
Thu Dec 5 04:49:29 2013 [VMM][E]: Error connecting to BC4-10
Thu Dec 5 04:49:29 2013 [VMM][I]: Failed to execute network driver operation: clean.
Thu Dec 5 04:49:32 2013 [VMM][I]: Command execution fail: /var/lib/one/remotes/tm/qcow2/delete BC4-10:/var/lib/one//datastores/111/11251/disk.0 11251 107
Thu Dec 5 04:49:32 2013 [VMM][I]: delete: Deleting /var/lib/one/datastores/111/11251/disk.0
Thu Dec 5 04:49:32 2013 [VMM][E]: delete: Command "rm -rf /var/lib/one/datastores/111/11251/disk.0" failed: ssh: connect to host BC4-10 port 22: No route to host
Thu Dec 5 04:49:32 2013 [VMM][E]: Error deleting /var/lib/one/datastores/111/11251/disk.0
Thu Dec 5 04:49:32 2013 [VMM][I]: ExitCode: 255
Thu Dec 5 04:49:32 2013 [VMM][I]: Failed to execute transfer manager driver operation: tm_delete.
Thu Dec 5 04:49:35 2013 [VMM][I]: Command execution fail: /var/lib/one/remotes/tm/shared/delete BC4-10:/var/lib/one//datastores/111/11251 11251 111
Thu Dec 5 04:49:35 2013 [VMM][I]: delete: Deleting /var/lib/one/datastores/111/11251
Thu Dec 5 04:49:35 2013 [VMM][E]: delete: Command "rm -rf /var/lib/one/datastores/111/11251" failed: ssh: connect to host BC4-10 port 22: No route to host
Thu Dec 5 04:49:35 2013 [VMM][E]: Error deleting /var/lib/one/datastores/111/11251
Thu Dec 5 04:49:35 2013 [VMM][I]: ExitCode: 255
Thu Dec 5 04:49:35 2013 [VMM][I]: Failed to execute transfer manager driver operation: tm_delete.
Thu Dec 5 04:49:35 2013 [VMM][I]: Host successfully cleaned.
Thu Dec 5 04:49:35 2013 [DiM][I]: New VM state is PENDING
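
To condense a log like the one above to the two facts that matter here (which host was unreachable and which driver operations failed), a small parser can help. This is only a sketch: the regular expressions are inferred from the lines in this excerpt, not from any documented oned log format, and the function name is hypothetical.

```python
import re

# Line layout inferred from the excerpt: "<timestamp> [MODULE][LEVEL]: message"
LINE_RE = re.compile(
    r"^\w{3} \w{3} +\d+ [\d:]+ \d{4} \[(?P<mod>\w+)\]\[(?P<lvl>\w)\]: (?P<msg>.*)$"
)
HOST_RE = re.compile(r"connect to host (\S+) port")
OP_RE = re.compile(r"Failed to execute (.+?) driver operation: (\w+)\.")


def summarize(log_text):
    """Return (unreachable_hosts, failed_ops) found in log_text.

    unreachable_hosts is a set of host names from ssh connection errors;
    failed_ops is a list of (driver, operation) tuples in log order.
    """
    hosts, failed = set(), []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a log line
        msg = m.group("msg")
        host = HOST_RE.search(msg)
        if host:
            hosts.add(host.group(1))
        op = OP_RE.search(msg)
        if op:
            failed.append((op.group(1), op.group(2)))
    return hosts, failed


# A few lines copied from the log above.
SAMPLE_LOG = """\
Thu Dec 5 04:49:29 2013 [VMM][I]: ssh: connect to host BC4-10 port 22: No route to host
Thu Dec 5 04:49:29 2013 [VMM][I]: Failed to execute network driver operation: clean.
Thu Dec 5 04:49:32 2013 [VMM][E]: delete: Command "rm -rf /var/lib/one/datastores/111/11251/disk.0" failed: ssh: connect to host BC4-10 port 22: No route to host
Thu Dec 5 04:49:32 2013 [VMM][I]: Failed to execute transfer manager driver operation: tm_delete.
"""

if __name__ == "__main__":
    hosts, failed = summarize(SAMPLE_LOG)
    print(sorted(hosts))
    print(failed)
```

For the excerpt, every failed operation targets the same dead host, which is consistent with the delete scripts being run over ssh against the host named in the `HOST:PATH` argument rather than locally on the controller.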
--
Thank you,

Dmitri Chebotarov
VCL Sys Eng, Engineering & Architectural Support, TSD - Ent Servers & Messaging
223 Aquia Building, Ffx, MSN: 1B5
Phone: (703) 993-6175 | Fax: (703) 993-3404


From: Carlos Martín Sánchez <cmartin at opennebula.org>
Date: Wednesday, September 11, 2013 at 5:44
To: Romany Nageh <engromanynageh at gmail.com>
Cc: users at lists.opennebula.org
Subject: Re: [one-users] Fwd: how running vms moved(not recreate) on another host on host error

Hi,

What exactly do you mean by "move"? If you are referring to migration, that's not possible: once a host goes down, the VM state is lost.

Regards

--
Join us at OpenNebulaConf2013<http://opennebulaconf.com> in Berlin, 24-26 September, 2013
--
Carlos Martín, MSc
Project Engineer
OpenNebula - The Open-source Solution for Data Center Virtualization
www.OpenNebula.org <http://www.OpenNebula.org> | cmartin at opennebula.org | @OpenNebula <http://twitter.com/opennebula>


On Tue, Sep 10, 2013 at 11:47 PM, Romany Nageh <engromanynageh at gmail.com> wrote:

Hi,
I am using OpenNebula 4.2. How can VMs running on a specific host be moved (not recreated) to another host when that host goes into an error (down) state?

Could anyone please help me?
Thanks

---------- Forwarded message ----------
From: "Romany Nageh" <engromanynageh at gmail.com>
Date: Sep 9, 2013 9:46 PM
Subject: how running vms moved(not recreate) on another host on host error
To: <users at lists.opennebula.org>, "Carlos Martín Sánchez" <cmartin at opennebula.org>

Hi,
I am using OpenNebula 4.2.
How can VMs running on a specific host be moved (not recreated) to another host when that host goes into an error (down) state?

Could anyone please help me?



