[one-dev] Live migration / recovery
Ruben S. Montero
rsmontero at opennebula.org
Sun Jan 19 13:56:03 PST 2014
Thanks for the heads-up. I've added an issue for 4.6 to look at this:
http://dev.opennebula.org/issues/2658
Cheers
Ruben
On Mon, Jan 13, 2014 at 5:28 PM, Gareth Bult <gareth at linux.co.uk> wrote:
> Ok, sounds good .. could I also draw your attention to the "deploy"
> script, which is the other
> script I need to modify .. my version looks like this;
>
> --
>
> mkdir -p `dirname $domain`
> cat > $domain
>
> logger "DEPLOY :: /var/lib/one/remotes/hooks/vdc/deploy.py $domain"
> /var/lib/one/remotes/hooks/vdc/deploy.py $domain
>
> data=`virsh --connect $LIBVIRT_URI create $domain`
>
> --
>
> Deploy does three relatively critical things you may wish to consider;
>
> a. adds the "shareable" flag to the libvirt config - without this libvirt
> won't live migrate (on the latest versions of libvirt)
> b. adds "error_policy=stop" to ensure the VM pauses in the case of an IO
> error, which lets you fix the IO issue, then unpause it
> c. adds "discard=unmap", which you need in order for KVM to support TRIM
> requests properly
> (we use TRIM to allow a VM to pass references to unused space to the
> back-end storage so it can be released back to the OS)
>
> If it's of any use, you're more than welcome to use my deploy.py script ..
> for what it does it seems fairly concise .. :)
>
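> #!/usr/bin/python
>
> import xml.etree.ElementTree as ET
> from sys import argv
>
> # Parse the libvirt domain XML passed as the first argument.
> original = ET.parse(argv[1])
>
> # Mark every <disk device='disk'> as shareable.
> params = original.findall(".//disk")
> for p in params:
>     if p.get('device') == 'disk':
>         p.insert(0, ET.Element("shareable", {}))
>
> # Set the error policy and TRIM support on raw-format drivers.
> params = original.findall(".//driver")
> for p in params:
>     if p.get('type') == 'raw':
>         p.set("error_policy", "stop")
>         p.set("discard", "unmap")
>
> # Write the modified XML back in place.
> original.write(argv[1])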
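
To see what the script above changes, the same three edits can be applied to an in-memory XML string. This is only a minimal sketch: the sample domain XML, the image path and the function name patch_domain are illustrative assumptions, not taken from a real deployment.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal libvirt domain XML used only for illustration.
SAMPLE = """<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/tmp/example.img'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
    </disk>
  </devices>
</domain>"""


def patch_domain(xml_text):
    """Apply the same three edits as deploy.py to an XML string."""
    root = ET.fromstring(xml_text)
    # Insert <shareable/> as the first child of each real disk.
    for disk in root.findall(".//disk"):
        if disk.get("device") == "disk":
            disk.insert(0, ET.Element("shareable"))
    # Add the error policy and discard mode to every raw-format driver.
    for driver in root.findall(".//driver"):
        if driver.get("type") == "raw":
            driver.set("error_policy", "stop")
            driver.set("discard", "unmap")
    return ET.tostring(root, encoding="unicode")


patched = patch_domain(SAMPLE)
print(patched)
```

Note that, like the original script, this sets the driver attributes on every raw driver element, including the one under the cdrom device; only the shareable element is limited to device='disk'.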
>
> Regards,
> Gareth.
>
>
> --
> *Gareth Bult*
> “The odds of hitting your target go up dramatically when you aim at it.”
> See the status of my current project at http://vdc-store.com
>
>
> ------------------------------
> *From: *"Ruben S. Montero" <rsmontero at opennebula.org>
> *To: *"Gareth Bult" <gareth at linux.co.uk>
> *Cc: *dev at lists.opennebula.org
> *Sent: *Monday, 13 January, 2014 3:41:50 PM
> *Subject: *Re: [one-dev] Live migration / recovery
>
> Totally agree, in fact we've seen this in the past.
>
> This is fairly easy to add: one_vmm_exec.rb includes a pseudo-DSL to
> specify the actions, and each action includes a fail action. Looking at the
> code, we also have to roll back the networking configuration on the target
> host.
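
The idea of actions that each carry a fail action can be illustrated roughly as follows. This is a Python sketch of the general pattern only, not the Ruby DSL in one_vmm_exec.rb; the function run_steps and the dictionary keys "name", "action" and "fail_action" are hypothetical names chosen for the example.

```python
def run_steps(steps):
    """Run steps in order; on the first failure run that step's fail action.

    Each step is a dict with a "name", an "action" callable that returns
    True on success, and an optional "fail_action" callable. Returns the
    name of the step that failed, or None if every step succeeded.
    """
    for step in steps:
        if not step["action"]():
            fail_action = step.get("fail_action")
            if fail_action is not None:
                fail_action()
            return step["name"]
    return None


# Illustrative run: the migrate step fails, so its fail action is called
# and the postmigrate step is never reached.
log = []
steps = [
    {"name": "premigrate", "action": lambda: log.append("pre") or True},
    {"name": "migrate", "action": lambda: False,
     "fail_action": lambda: log.append("rollback")},
    {"name": "postmigrate", "action": lambda: log.append("post") or True},
]
failed = run_steps(steps)
print(failed, log)
```

In this model, a "postmigrate_fail"-style cleanup would simply be the fail action attached to the migrate step.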
>
> Added a new issue for this. In the meantime we can just use the previous
> workaround.
>
> http://dev.opennebula.org/issues/2633
>
>
>
> On Mon, Jan 13, 2014 at 2:59 PM, Gareth Bult <gareth at linux.co.uk> wrote:
>
>> Ok, this almost seems too easy ... :)
>>
>> What I was trying to avoid was more tweaking of ON files when installing
>> VDC.
>>
>> The issue I see is that each time someone upgrades ON, they will
>> potentially have a chain of small patches to apply. If I supply a few
>> small patches based on "if <vdc installed> - do something extra",
>> would you include these in the stock scripts?
>>
>> Gareth.
>>
>> --
>> *Gareth Bult*
>> “The odds of hitting your target go up dramatically when you aim at it.”
>> See the status of my current project at http://vdc-store.com
>>
>>
>> ------------------------------
>> *From: *"Ruben S. Montero" <rsmontero at opennebula.org>
>> *To: *"Gareth Bult" <gareth at linux.co.uk>
>> *Cc: *dev at lists.opennebula.org
>> *Sent: *Sunday, 12 January, 2014 10:36:30 PM
>> *Subject: *Re: [one-dev] Live migration / recovery
>>
>> Hi Gareth
>>
>> As the migrate script can be easily updated, we do not provide any hook
>> for that. I'd go to kvm/migrate and add a simple if [ $? ... check after
>> the virsh command to kill the cache on the target host.
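
The exit-status check suggested above amounts to "run the migration command and, if it returns non-zero, run the cleanup". A minimal sketch of that logic, written in Python rather than in the shell of kvm/migrate, might look like this. The function name migrate_with_cleanup is hypothetical, and the two command lists are placeholders: the thread gives neither the exact virsh invocation nor the one-line cache revert command.

```python
import subprocess


def migrate_with_cleanup(migrate_cmd, cleanup_cmd):
    """Run migrate_cmd; if it exits non-zero, run cleanup_cmd.

    Both arguments are argument lists as accepted by subprocess.run.
    Returns the exit code of the migration command, so the caller still
    sees the failure after the cleanup has run.
    """
    result = subprocess.run(migrate_cmd)
    if result.returncode != 0:
        # Migration failed: undo what was set up on the target host.
        subprocess.run(cleanup_cmd)
    return result.returncode
```

Returning the original exit code rather than the cleanup's keeps the failure visible to whatever invoked the migration.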
>>
>> Cheers
>>
>> Ruben
>>
>>
>> On Thu, Dec 19, 2013 at 12:54 PM, Gareth Bult <gareth at linux.co.uk> wrote:
>>
>>> Hi,
>>>
>>> I implemented live migration for the VDC driver a few weeks back and on
>>> the whole it seems to work
>>> quite well. The "premigrate" script creates a cache instance on the
>>> target server and puts the source
>>> cache into proxy mode, then the "postmigrate" script kills the original
>>> cache instance.
>>>
>>> Problem :: if the migration fails, I'm left with a running cache on both
>>> the source and target servers, with
>>> the source cache in proxy mode. I have a 1-line CLI command to revert
>>> the issue, but I need to hook into
>>> the system in order to call it.
>>>
>>> How do I do this? I guess effectively I need something like
>>> "postmigrate_fail" ...?
>>>
>>> tia
>>> Gareth.
>>>
>>> --
>>> *Gareth Bult*
>>> “The odds of hitting your target go up dramatically when you aim at it.”
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at lists.opennebula.org
>>> http://lists.opennebula.org/listinfo.cgi/dev-opennebula.org
>>>
>>> --
>>> Ruben S. Montero, PhD
>>> Project co-Lead and Chief Architect
>>> OpenNebula - Flexible Enterprise Cloud Made Simple
>>> www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula
>>>
>>
>>
>
>
> --
> --
> Ruben S. Montero, PhD
> Project co-Lead and Chief Architect
> OpenNebula - Flexible Enterprise Cloud Made Simple
> www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula
>
>
--
--
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula