[one-dev] Live migration / recovery - suggestions

Gareth Bult gareth at linux.co.uk
Mon Jan 20 03:34:47 PST 2014


Hi Jaime,

> In this case it means that we are not very comfortable with 
> point 3: adding addon-specific code to the main repository. 

Sure, I agree totally, in which case there really needs to be some sort of hook.
IMHO, making people edit ON source code to install an add-on isn't really ideal (??)

Probably "the" solution would be to pass the driver name ("vdc" in this case) through to the
script, and "function CleanUp" would then read something like:

function CleanUp
{
    # ${driver} is the name passed through from the caller ("vdc" here)
    HOOK="$(dirname "$0")/../../../remotes/hooks/${driver}"
    if [ -d "${HOOK}" ]; then
        "${HOOK}/postmigrate_fail" "${deploy_id}" "${dest_host}"
    fi
}

That would then be non-specific, but the call-chain does need to include a reference
to the driver, to avoid making lots of nasty lookups ... (??)

[as per the last line of my previous email ??] 
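
A rough sketch of how such a generic hook could be wired up, purely as an
illustration. The helper name run_driver_hook, the HOOKS_ROOT variable and the
idea that $driver is already set by the caller are all assumptions; none of
them exist in the stock scripts today.

```shell
#!/bin/bash
# Hypothetical helper: run an optional per-driver hook if the add-on ships one.
# The directory layout (<hooks_root>/<driver>/<hook_name>) is an assumption.
run_driver_hook() {
    local hooks_root=$1 driver=$2 hook_name=$3
    shift 3
    local hook="${hooks_root}/${driver}/${hook_name}"

    # No driver name passed, or no hook installed: nothing to do, not an error.
    [ -n "$driver" ] || return 0
    [ -x "$hook" ] || return 0

    # Remaining arguments are handed to the hook unchanged.
    "$hook" "$@"
}

# CleanUp stays free of add-on specific paths; it only forwards the driver name.
function CleanUp
{
    run_driver_hook "${HOOKS_ROOT:-$(dirname "$0")/../../../remotes/hooks}" \
        "$driver" postmigrate_fail "$deploy_id" "$dest_host"
}
```

With this shape an add-on only has to drop an executable postmigrate_fail into
its own hooks directory; if the file is absent, CleanUp is a no-op.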

-- 
	
Gareth Bult 
“The odds of hitting your target go up dramatically when you aim at it.” 
See the status of my current project at http://vdc-store.com 


----- Original Message -----

From: "Jaime Melis" <jmelis at opennebula.org> 
To: "Gareth Bult" <gareth at linux.co.uk> 
Cc: dev at lists.opennebula.org 
Sent: Monday, 20 January, 2014 10:43:56 AM 
Subject: Re: [one-dev] Live migration / recovery - suggestions 

Hi Gareth, 

we've been studying your proposal, and even though we agree with what you say, we
aren't 100% convinced by this solution. The issues with the proposal are the
following:

- As far as possible, we'd like to keep the main OpenNebula code and the addons
separate. In this case it means that we are not very comfortable with
point 3: adding addon-specific code to the main repository.

- The proposed solution only solves the "migrate" issue, but other addons will
potentially have issues with other scripts, and not necessarily with the
"CleanUp" part of "ssh_exec_and_log". We would like to find a more general
solution.

We are still thinking about this and definitely want to solve this issue, so if
you (or anyone else) have any ideas, please let us know.

cheers, 
Jaime 


On Tue, Jan 14, 2014 at 2:23 PM, Gareth Bult < gareth at linux.co.uk > wrote: 



Hey Guys, I've done a little work on the migration script; this is what I've changed here.
It would be nice if something similar could be implemented at source ..?

1. ssh_exec_and_log (generic change - this could be useful elsewhere..), modify as follows:


function ssh_exec_and_log
{
    message=$2
    cleanup=$3 # ++ optional function to call on failure

    EXEC_LOG_ERR=`$1 2>&1 1>/dev/null`
    EXEC_LOG_RC=$?

    if [ $EXEC_LOG_RC -ne 0 ]; then
        log_error "Command \"$1\" failed: $EXEC_LOG_ERR"
        if [ -n "$cleanup" ]; then # ++
            $cleanup # ++
        fi # ++

        if [ -n "$2" ]; then
            error_message "$2"
        else
            error_message "Error executing $1: $EXEC_LOG_ERR"
        fi
        return $EXEC_LOG_RC
    fi
}

i.e. allow a third parameter, which is a function to call if the exec fails.

2. migrate (for my vdc code): add "CleanUp" as the last parameter to the ssh_exec_and_log call on the last line


ssh_exec_and_log "virsh --connect $LIBVIRT_URI migrate --live $deploy_id $QEMU_PROTOCOL://$dest_host/system" \
    "Could not migrate $deploy_id to $dest_host" CleanUp

3. Then add the following function to migrate:


function CleanUp
{
    # Only act when the vdc-nebula add-on is installed alongside ON
    VDC="$(dirname "$0")/../../../vdc-nebula"
    if [ -d "${VDC}" ]; then
        "${VDC}/remotes/tm/vdc/postmigrate_fail" "${deploy_id}" "${dest_host}"
    fi
}

CleanUp could be extended for other storage options ... ??
I guess ideally you would pass the driver through, so that CleanUp becomes completely generic and
postmigrate_fail becomes just another standard routine??
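
The three steps above boil down to a "run, and call back on failure" pattern.
A self-contained sketch of that pattern follows. It is not the stock
ssh_exec_and_log: the function name exec_with_cleanup is made up, and
log_error / error_message are reduced to stand-ins that write to stderr so the
snippet runs on its own.

```shell
#!/bin/bash
# Stand-ins for the logging helpers the real scripts source; assumptions only.
log_error()     { echo "ERROR: $1" >&2; }
error_message() { echo "ERROR MESSAGE: $1" >&2; }

# Run the command in $1. On failure: log it, call the optional cleanup
# function named in $3, report $2 (or a default message), and return the
# command's exit status.
exec_with_cleanup() {
    local cmd=$1 message=$2 cleanup=$3
    local err rc

    # Capture stderr only, discarding stdout, as the original function does.
    err=$($cmd 2>&1 1>/dev/null)
    rc=$?

    if [ "$rc" -ne 0 ]; then
        log_error "Command \"$cmd\" failed: $err"
        if [ -n "$cleanup" ]; then
            "$cleanup"
        fi
        error_message "${message:-Error executing $cmd: $err}"
    fi
    return "$rc"
}

# Example cleanup; a real one would revert the add-on's state on the target.
CLEANED_UP=no
CleanUp() { CLEANED_UP=yes; }
```

Calling exec_with_cleanup "false" "Could not migrate" CleanUp runs CleanUp and
returns 1; with a command that succeeds, CleanUp is never called.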


hth 
Gareth. 




From: "Ruben S. Montero" < rsmontero at opennebula.org > 
To: "Gareth Bult" < gareth at linux.co.uk > 
Cc: dev at lists.opennebula.org 
Sent: Monday, 13 January, 2014 3:41:50 PM 
Subject: Re: [one-dev] Live migration / recovery 

Totally agree, in fact we've seen this in the past. 

This is fairly easy to add: one_vmm_exec.rb includes a pseudo-DSL to specify the actions, and each action includes a fail action. Looking at the code, we also have to roll back the networking configuration on the target host.

Added a new issue for this. In the meantime we can just use the previous workaround. 

http://dev.opennebula.org/issues/2633 



On Mon, Jan 13, 2014 at 2:59 PM, Gareth Bult < gareth at linux.co.uk > wrote: 

<blockquote>

Ok, this almost seems too easy ... :) 

What I was trying to avoid was more tweaking of ON files when installing VDC. 

The issue I see is that each time someone upgrades ON, they will potentially have a chain of
small patches to apply ... if I supply a few small patches based on "if <vdc installed> - do something extra",
would you include these in the stock scripts??

Gareth. 




From: "Ruben S. Montero" < rsmontero at opennebula.org > 
To: "Gareth Bult" < gareth at linux.co.uk > 
Cc: dev at lists.opennebula.org 
Sent: Sunday, 12 January, 2014 10:36:30 PM 
Subject: Re: [one-dev] Live migration / recovery 

Hi Gareth 

As the migrate script can be easily updated, we do not provide any hook for that. I'd go to kvm/migrate and add a simple if [ $? ... check after the virsh command to kill the cache on the target host.
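
A minimal sketch of that kind of check, with both commands passed in as
placeholders. The function name migrate_or_revert is invented for
illustration; in kvm/migrate the first command would be the virsh call and the
second whatever one-line command reverts the cache.

```shell
#!/bin/bash
# Run the migrate command; if it fails, run the revert command and
# pass the original failure status back to the caller.
migrate_or_revert() {
    local migrate_cmd=$1 revert_cmd=$2
    local rc

    $migrate_cmd
    rc=$?

    if [ "$rc" -ne 0 ]; then
        # Migration failed: undo whatever premigrate set up.
        $revert_cmd
    fi
    return "$rc"
}
```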

Cheers 

Ruben 


On Thu, Dec 19, 2013 at 12:54 PM, Gareth Bult < gareth at linux.co.uk > wrote: 

<blockquote>

Hi, 

I implemented live migration for the VDC driver a few weeks back and on the whole it seems to work 
quite well. The "premigrate" script creates a cache instance on the target server and puts the source 
cache into proxy mode, then the "postmigrate" script kills the original cache instance. 

Problem :: if the migration fails, I'm left with a running cache on both the source and target servers, with 
the source cache in proxy mode. I have a 1-line CLI command to revert the issue, but I need to hook into 
the system in order to call it. 

How do I do this? I guess effectively I need something like "postmigrate_fail" .. ???

tia 
Gareth. 





_______________________________________________ 
Dev mailing list 
Dev at lists.opennebula.org 
http://lists.opennebula.org/listinfo.cgi/dev-opennebula.org 

-- 
-- 
Ruben S. Montero, PhD 
Project co-Lead and Chief Architect 
OpenNebula - Flexible Enterprise Cloud Made Simple 
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula 





</blockquote>








</blockquote>




-- 
Jaime Melis 
Project Engineer 
OpenNebula - Flexible Enterprise Cloud Made Simple 
www.OpenNebula.org | jmelis at opennebula.org 
