[one-users] Shared File System HA

Ranga Chakravarthula rbabu at hexagrid.com
Wed Mar 14 14:08:29 PDT 2012


This is plain NFS client-to-server behavior: the hypervisor is simply
acting as an NFS client. The OS inside the VM caches writes in memory and
periodically flushes them to disk. During the failover the NFS client will
keep retrying its writes, but they will fail if it cannot reconnect to the
NFS server before the timeout expires. If the connection is re-established
in time, all the writes will go through.

You need to look at these NFS mount options:

  timeo
  retrans
  retry
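
As a minimal sketch (the server name, export path, and option values are
illustrative assumptions, not from this thread), a hard NFS mount on a
compute node might look like this in /etc/fstab:

  # /etc/fstab on a compute node -- example values only
  # hard    : block and keep retrying instead of returning I/O errors to the VMs
  # timeo   : initial RPC timeout, in tenths of a second (600 = 60 seconds)
  # retrans : retransmissions before the client reports "server not responding"
  nfs-vip:/var/lib/one  /var/lib/one  nfs  hard,intr,timeo=600,retrans=2  0 0

With hard, guest I/O simply stalls during the failover window and resumes
once the server answers again; with soft, the client would start returning
I/O errors to the VMs after roughly timeo x retrans.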



On Wed, Mar 14, 2012 at 12:49 PM, Marshall Grillos <mgrillos at optimalpath.com> wrote:

>  In my design I’m looking at having the shared storage attached to the
> front-end server, providing full redundancy for both the front-end and the
> image repository.  This would then be shared to each compute node via NFS.
>
> StorageArray1 ---DAS---> FrontEnd1 ---10Gb Eth----> BladeChassis1
>                              |
>                              |
>                DRBD/Heartbeat/Pacemaker (between FrontEnd nodes)
>                              |
>                              |
> StorageArray2 ---DAS---> FrontEnd2 ---10Gb Eth----> BladeChassis1
>
> I planned on setting up an active/passive cluster for two front-end
> servers.  These would have completely separate storage arrays (potentially
> in separate data centers).  Using DRBD (I’m open to other solutions if they
> provide faster failover) the image repository would be mirrored between the
> storage devices.  In the event of any hardware failure
> (NIC/controller/power, etc.) a full failover would occur from Frontend1 to
> Frontend2, propagating the cluster IP address.
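>
> A hedged Pacemaker sketch of that stack, in crm shell syntax (the resource
> names, DRBD device, mount directory, and VIP are assumptions for
> illustration, not taken from this thread):
>
>   # crm configure -- example only; adapt names and parameters to your setup
>   primitive p_drbd ocf:linbit:drbd params drbd_resource=r0 \
>       op monitor interval=29s role=Master op monitor interval=31s role=Slave
>   ms ms_drbd p_drbd meta master-max=1 clone-max=2 notify=true
>   primitive p_fs ocf:heartbeat:Filesystem \
>       params device=/dev/drbd0 directory=/srv/images fstype=ext4
>   primitive p_vip ocf:heartbeat:IPaddr2 params ip=192.168.1.100 cidr_netmask=24
>   primitive p_nfs ocf:heartbeat:nfsserver
>   group g_nfs p_fs p_vip p_nfs
>   colocation c_nfs_on_master inf: g_nfs ms_drbd:Master
>   order o_drbd_before_nfs inf: ms_drbd:promote g_nfs:start
>
> The colocation and order constraints keep the filesystem, the cluster IP,
> and the NFS server on whichever node holds the DRBD Primary, so the IP
> moves with the data on failover.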
>
>
> With this setup, there would be a lag while heartbeat/pacemaker detects
> the failure and performs the failover (possibly upwards of 30 seconds).
> What will happen to the running VMs while the failover is in progress?
> Is the compute node’s hypervisor “smart” enough to handle a
> several-second NFS outage?
>
>
> I’m definitely open to other solutions (GlusterFS, etc.) if they provide a
> smoother failover transition, given my existing hardware configuration.
>
> Thanks,
>
> Marshall
>
> From: Ranga Chakravarthula [mailto:rbabu at hexagrid.com]
> Sent: Wednesday, March 14, 2012 10:57 AM
> To: Marshall Grillos
> Cc: users at lists.opennebula.org
> Subject: Re: [one-users] Shared File System HA
>
> If you are looking at HA at the storage level, it would be better to run
> Heartbeat/failover on the NFS resource itself than to fail over to the
> secondary front-end server. Your NFS share is mounted on the compute nodes
> anyway, so if one storage node goes down, heartbeat will fail over to the
> other. Your front-end doesn't have to be part of this.
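>
> A small sketch of that arrangement (the VIP, export path, and network below
> are assumptions for illustration): the compute nodes mount the share via a
> floating IP owned by the storage pair, so the front-end stays out of the
> data path entirely.
>
>   # /etc/exports on whichever storage node currently holds the VIP (example)
>   /srv/images  192.168.1.0/24(rw,sync,no_root_squash)
>
>   # on each compute node: mount via the floating IP, not a physical hostname
>   mount -t nfs -o hard,timeo=600,retrans=2 192.168.1.100:/srv/images /var/lib/one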
>
> On Wed, Mar 14, 2012 at 10:26 AM, Marshall Grillos <mgrillos at optimalpath.com> wrote:
>
> I am debating the differences between shared and non-shared file systems
> for an OpenNebula deployment.
>
> One concern with the shared file system is high availability.  I am
> setting up the OpenNebula front-end with connectivity to a storage device.
> To protect against a storage device failure (RAID controller, power,
> etc.), I am looking into setting up a secondary front-end server with
> attached storage.  I would use NFS to share the storage with each VM host
> and set up DRBD for block-level replication between the cluster nodes.  In
> the event of a storage failure, heartbeat/pacemaker would trigger a
> failover to the secondary front-end server.
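>
> A minimal sketch of the DRBD resource behind that setup (hostnames, backing
> devices, and addresses are assumptions for illustration):
>
>   # /etc/drbd.d/r0.res -- example only
>   resource r0 {
>       protocol C;              # synchronous: writes acknowledged on both nodes
>       device    /dev/drbd0;
>       disk      /dev/sdb1;     # backing device on each front-end
>       meta-disk internal;
>       on frontend1 { address 10.0.0.1:7788; }
>       on frontend2 { address 10.0.0.2:7788; }
>   }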
>
>
> If anyone has tested a similar setup, how do the VMs handle the brief
> outage while the failover occurs (the several seconds required to fail over
> to the secondary front-end)?  Wouldn’t the NFS mount be unavailable for
> that duration because of the failover mechanism?
>
>
> Thanks,
>
> Marshall
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org