[one-users] How to do HA?

Javier Fontan jfontan at fdi.ucm.es
Tue Dec 9 09:29:18 PST 2008


Hello,

On Dec 5, 2008, at 1:30 PM, Himanshu Khona wrote:

> Hi -
> It still fails between intel and different chipset but that is  
> probably a xen issue.
> I have few questions -
> a) Is checkpoint command part of open nebula 1.0? I did not see it  
> in the documentation.

Sorry, that was my fault; I should have checked the code and
documentation before telling you about it. Development versions before
the 1.0 release had that functionality, but we took it out because it
is not useful without image snapshotting (I'll comment on that later).
The functionality is still in the driver but cannot be triggered in
current versions of ONE. Sorry again for pointing you to incorrect
information.

> b) There is one problem with this procedure and its the state of the  
> VM image. To have a good state save snapshots of the images should  
> be made at the same time the save is performed. - I did not  
> understand what you mean? Can you please elaborate?

The Xen hypervisor comes with a save function that writes the state of
the virtual machine (memory, CPU state, virtualized devices, etc.) to
disk. When you do "xm save" on a VM, it is stopped and its state is
written to a file. You can later start the VM again from this state
file and it will recover all of that state, so it will effectively be
the same machine as when it was saved (running the same processes and
so on). Checkpointing is basically the same process but without
stopping the machine ("xm save -c"). It is like taking a photograph of
the machine at some point in time. This is useful if you checkpoint
the machine from time to time: if the machine fails, you have the
state from some time ago that you can use to bring the machine up
again. It can be seen as a backup: if you make a backup of a
filesystem and 5 minutes later the filesystem crashes, you have the
state of that FS from 5 minutes ago, so you only lose those 5 minutes
of work.
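
As a rough illustration of the "checkpoint from time to time" idea, a
small wrapper script could call "xm save -c" at a fixed interval. This
is only a sketch: the helper names, the file naming scheme and the
5-minute default are my assumptions, not something ONE provides.

```python
import subprocess
import time


def checkpoint_command(domain, state_file):
    """Build the argv for a Xen checkpoint (save state, VM keeps running)."""
    return ["xm", "save", "-c", domain, state_file]


def checkpoint_periodically(domain, state_dir, rounds, interval_s=300,
                            run=subprocess.check_call, sleep=time.sleep):
    """Take `rounds` checkpoints of `domain`, one every `interval_s` seconds.

    Hypothetical helper: `run` and `sleep` are injectable so the loop can
    be exercised without a Xen host. Returns the state files written.
    """
    written = []
    for i in range(rounds):
        # Numbered file names so older checkpoints are not overwritten.
        state_file = "%s/%s-%03d.chk" % (state_dir, domain, i)
        run(checkpoint_command(domain, state_file))
        written.append(state_file)
        if i < rounds - 1:
            sleep(interval_s)
    return written
```

Note that this only captures the VM state, not the disk image.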

The problem with checkpointing only the VM state but not the VM image
is that if you restart the machine from the checkpoint, the filesystem
data will be inconsistent. The VM will think that the FS is in one
state, but on disk it will be in a different one. Linux (and other
OSes) caches data and also keeps FS metadata in memory, so after
resuming from the checkpoint it will not reread what has been written
to the disk in the meantime. That is why checkpointing without image
snapshotting is not useful.
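
A toy model may make the inconsistency easier to see. Everything here
is made up for illustration (the Guest class, the dict standing in for
the disk image); it has nothing to do with how Xen or Linux are
actually implemented. It only shows memory and disk drifting apart
when just the memory is checkpointed.

```python
import copy


class Guest:
    """Toy guest OS: keeps an in-memory cache of what it wrote to disk."""

    def __init__(self, disk):
        self.disk = disk    # shared dict, stands in for the VM image
        self.cache = {}     # stands in for cached FS data and metadata

    def write(self, path, data):
        self.cache[path] = data
        self.disk[path] = data

    def read(self, path):
        # Cached data wins; the guest does not go back to the disk.
        if path in self.cache:
            return self.cache[path]
        return self.disk[path]

    def checkpoint(self):
        # Only the memory is saved, not the disk image.
        return copy.deepcopy(self.cache)

    @classmethod
    def restore(cls, disk, saved_cache):
        guest = cls(disk)
        guest.cache = copy.deepcopy(saved_cache)
        return guest


disk = {}
vm = Guest(disk)
vm.write("/etc/motd", "v1")
state = vm.checkpoint()          # memory-only checkpoint
vm.write("/etc/motd", "v2")      # the VM keeps running and modifies the disk

restored = Guest.restore(disk, state)
stale = restored.read("/etc/motd")   # what the restored guest believes
on_disk = disk["/etc/motd"]          # what is really in the image
```

Here the restored guest still believes the file holds the old contents
while the image holds the new ones.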

> c) I thought of stop, save and start option. But once VM is bound to  
> 1 physical server, it always starts up that server only and then it  
> goes to fail mode. There is no way where I can say that physical  
> server went down, start up the image (in whatever last good state)  
> on another physical server. How do I achieve this through Opennebula?
> I am thinking your save image option + ability to start VM on  
> another physical node may just do it. How do I do that?
> I can possibly do at xen level xm create and use exact same LVM  
> partition but then that does not make much sense as I need to have a  
> centralized management.

That functionality should be provided by "onevm stop"/"onevm resume".
The stop command stops the machine and saves its state; the resume
command puts the VM back in pending so the scheduler can select a
physical host on which to continue its execution.
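
If you want to drive this from a script, something along these lines
could work. It is a sketch under assumptions: the helper names are
hypothetical, it assumes "onevm" is in the PATH and takes the VM id as
its argument, and it does not check that the stop has finished before
asking for the resume.

```python
import subprocess


def onevm_command(action, vm_id):
    """Build the argv for the two onevm actions discussed above."""
    if action not in ("stop", "resume"):
        raise ValueError("unsupported action: %s" % action)
    return ["onevm", action, str(vm_id)]


def stop_and_resume(vm_id, run=subprocess.check_call):
    """Stop a VM (saving its state) and hand it back to the scheduler.

    Hypothetical helper: `run` is injectable so the sequence can be
    checked without an OpenNebula installation.
    """
    run(onevm_command("stop", vm_id))
    # Assumption: a real script would wait here until the VM is
    # reported as stopped before asking for the resume.
    run(onevm_command("resume", vm_id))
```

Which host the VM ends up on is then up to the scheduler.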

We have not yet started looking at LVM functionality, but if you are
stopping and saving the machine there will probably be no problems:
the machine will not continue to run, so it will not modify the disk
image.

There is an interesting project that addresses some of the HA problems
for storage, http://www.drbd.org/. We are going to look into it in the
future to see how we can integrate it, as it seems a good solution and
there are already examples of how to integrate it with Xen
(http://www.drbd.org/users-guide-emb/ch-xen.html).

Bye

-- 
Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org



