[one-users] Highly available ONE?

Keith Hudgins keith at cloudscaling.com
Thu Mar 25 14:39:49 PDT 2010


I think the shared $ONE_LOCATION in a high-availability setup should
be some type of network share (NFS, etc.) exported from a separate system.
(Getting a little hardware geeky here, I apologize.) Gluster, ZFS, or
similar can be used on top of a drive array for both reliability and
scaling. Even this share should have some HA capability. Nexenta can do it
if you're willing to use closed source. I'm not sure of an open source
method for an HA NFS share. I'd be very interested to know of a way.

Likewise, the scheduler should have some HA capacity. You would have a
master node on which the scheduler is operational, and a slave node
or two which would be promoted to master in the event of failure.
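
One way the promotion could work, as a rough sketch only: each scheduler
node tries to take or renew a time-limited lease stored in the central
database, and only the lease holder runs. Nothing here is existing
OpenNebula code. The scheduler_lease table, LEASE_SECONDS and
try_acquire() are names I made up, and sqlite3 is used only so the
sketch runs standalone (a real setup would use the shared client/server
database).

```python
import sqlite3

LEASE_SECONDS = 30  # assumed lease length, not an OpenNebula setting


def init_lease(db):
    """Create the hypothetical single-row lease table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS scheduler_lease ("
        " id INTEGER PRIMARY KEY CHECK (id = 1),"
        " holder TEXT NOT NULL,"
        " expires REAL NOT NULL)"
    )
    db.commit()


def try_acquire(db, node, now):
    """Return True if `node` holds the scheduler lease after this call."""
    with db:  # commits on success, rolls back on error
        # Renew our own lease, or take over one that has expired.
        cur = db.execute(
            "UPDATE scheduler_lease SET holder = ?, expires = ?"
            " WHERE id = 1 AND (holder = ? OR expires <= ?)",
            (node, now + LEASE_SECONDS, node, now),
        )
        if cur.rowcount == 1:
            return True
        # No row yet: the first node to insert becomes master.
        cur = db.execute(
            "INSERT OR IGNORE INTO scheduler_lease (id, holder, expires)"
            " VALUES (1, ?, ?)",
            (node, now + LEASE_SECONDS),
        )
        return cur.rowcount == 1
```

Each node would call try_acquire() on a timer with the current time and
run the scheduler only while it returns True. A slave gets promoted
once the master stops renewing and the lease runs out.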

Callbacks in this case should be handled in a queue-like fashion. The
central database can be used for this type of messaging, to reduce
extra components. The driver messaging should write to the queue or
messaging table. Oned can periodically read from this table and run
callbacks based upon status notices in an asynchronous manner.
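
To make the idea concrete, here is a minimal sketch of such a messaging
table. It is an illustration under my own assumptions, not how oned or
the drivers work today: the driver_callbacks table and the
post_callback() and drain_callbacks() helpers are hypothetical, and
sqlite3 stands in for the central database so the example is
self-contained.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create the (hypothetical) callback table if it does not exist."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS driver_callbacks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " vm_id INTEGER NOT NULL,"
        " action TEXT NOT NULL,"
        " result TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )
    db.commit()
    return db


def post_callback(db, vm_id, action, result):
    """Driver side: record a status notice in the table."""
    db.execute(
        "INSERT INTO driver_callbacks (vm_id, action, result)"
        " VALUES (?, ?, ?)",
        (vm_id, action, result),
    )
    db.commit()


def drain_callbacks(db, handler):
    """oned side: run the handler for each pending notice, oldest first."""
    rows = db.execute(
        "SELECT id, vm_id, action, result FROM driver_callbacks"
        " WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for row_id, vm_id, action, result in rows:
        handler(vm_id, action, result)
        # Mark the notice done only after the handler has returned.
        db.execute(
            "UPDATE driver_callbacks SET processed = 1 WHERE id = ?",
            (row_id,),
        )
        db.commit()
    return len(rows)
```

Because the notices sit in the table until they are marked processed, a
newly promoted oned could call drain_callbacks() at startup and pick up
results that arrived while the old master was down.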

On Thu, Mar 25, 2010 at 4:59 PM, Ruben S. Montero <rubensm at dacya.ucm.es> wrote:
> Hi
>
> I think that the key modification is to abstract the DB engine in the
> OpenNebula core. This has been previously proposed in the list and
> there is now an open issue in the dev portal [1].
>
> In this way, we can have a master oned process and several shadow
> daemons. These daemons can listen on different hosts/ports, so requests
> are sent to the master oned and client tools can fall back to a different
> URL when the master oned does not respond. If you are using the OCCI
> or EC2 interfaces, then any load balancer or HTTP proxy could
> redirect the connections to one of the shadows in case of a timeout.
>
> In this scenario we would need:
>
> 1. shared $ONE_LOCATION/var among the master and shadow oned's
>
> 2. DB following a client/server model like MySQL
>
> 3. OpenNebula Cloud API needs to handle a list of server URLs
> (OCCI/EC2 interface could just work with an http proxy or load
> balancer like nginx)
>
> There are other issues (I do not have a clear solution for these ones):
>
> 1. Monitoring (hosts, VMs) should be disabled for the shadows (maybe
> the daemons can start in a stand-by mode and switch to fully
> operational after a given number of requests)
>
> 2. Scheduler, same considerations apply for the scheduler. Only one
> scheduler should be operational.
>
> 3. Missing callbacks. If oned dies we are going to miss any
> pending notification from the drivers (e.g. if you start a BOOT
> operation and oned crashes, the shadow is not going to receive the
> result of that BOOT operation. The VM will be stuck in the boot state
> for OpenNebula and probably running on the target host)
>
> Cheers
>
> Ruben
>
> [1] http://dev.opennebula.org/issues/206
>
> On Thu, Mar 25, 2010 at 8:30 PM, Claude Noshpitz
> <cnoshpitz at attinteractive.com> wrote:
>> Hello Nebulans,
>>
>> Wondering if anyone has been thinking about how to make ONE highly available
>> by running multiple masters cooperatively.
>>
>> Among other things, this might mean abstracting out the current Sqlite
>> dependencies to rely on a more "distributable" DB solution (one which could
>> itself be independently master-slaved or otherwise made HA).  It seems
>> unlikely that one could finesse HA for Sqlite by e.g. storing its data in a
>> distributed/redundant filesystem (fancy NFS, Gluster) unless there's really
>> strong record locking.
>>
>> Another consideration would involve the consequences of multiple masters
>> sharing a pool of worker nodes -- perhaps this would "just work" if the
>> underlying DB was shared, or not.  There's the question of how to manage the
>> shared VM state in $ONE_LOCATION/var too, but that could be handled with a
>> robust shared filesystem.
>>
>> Ideas?  Opinions?
>>
>> Thanks!
>>
>> --Claude
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>
>
>
> --
> Dr. Ruben Santiago Montero
> Associate Professor (Profesor Titular), Complutense University of Madrid
>
> URL: http://dsa-research.org/doku.php?id=people:ruben
> Weblog: http://blog.dsa-research.org/?author=7


