[one-users] What is the best way to reboot a storage server in order to cause the least amount of disruption to running VMs?

Tue May 6 04:38:09 PDT 2014

Hi Shankhadeep,

   Thanks for the reply. I'm just wondering about the second part of 
step 4. I can see why it would make sense as it would avoid NFS errors 
but will that not break the hypervisor KVM processes?

         Regards,
           Gerry
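
The pause, restart and resume sequence quoted below could be sketched as
a small planner that only builds the command lines, so they can be
reviewed before anything is run. This is a rough sketch, not a tested
procedure: it assumes libvirt's virsh is available on each hypervisor and
that /datastores has an fstab entry, and the function names (pause_plan,
resume_plan, run) and the "infrastructure" set are made up for
illustration.

```python
import subprocess

DATASTORE_MOUNT = "/datastores"  # mount point mentioned in the original post


def running_domains():
    """Return the names of running libvirt domains on this hypervisor."""
    out = subprocess.run(
        ["virsh", "list", "--name", "--state-running"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def pause_plan(domains, unmount=False):
    """Commands for step 2, plus the optional unmount from step 4."""
    plan = [["virsh", "suspend", name] for name in domains]
    if unmount:
        # Note: a plain umount normally refuses with "target is busy" while
        # any process (including a paused QEMU) still has files open there.
        plan.append(["umount", DATASTORE_MOUNT])
    return plan


def resume_plan(domains, infrastructure=(), remount=False):
    """Commands for step 5: infrastructure VMs first, then the rest."""
    plan = [["mount", DATASTORE_MOUNT]] if remount else []
    first = [name for name in domains if name in infrastructure]
    rest = [name for name in domains if name not in infrastructure]
    plan += [["virsh", "resume", name] for name in first + rest]
    return plan


def run(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

For example, run(pause_plan(running_domains())) only prints the virsh
suspend commands; passing dry_run=False would execute them. The list
of domains should be saved before pausing so that the same list can be
handed to resume_plan once the storage server is back.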

On 02/05/2014 03:41, Shankhadeep Shome wrote:
> I think suspend/resume would work fine. I would do the following:
>
> 1. Disable network connectivity to the system through some firewall rule so
> you don't have active users on the system.
> 2. Pause all the VMs; this should keep your VMs safe.
> 3. Safely shut down your OpenNebula controller and database.
> 4. Reconfigure your NFS server and restart it. You may also want to
> unmount the NFS mounts on your hypervisors.
> 5. Start up the VMs in such a way that the virtual infrastructure comes up
> before your application VMs. Start iSCSI target VMs before your initiators
> try to connect to them, etc.
>
> I would take a close look at HA NFS services; GlusterFS is another option
> you might like to try. You don't need anything special for glusterfs.fuse
> support in OpenNebula.
>
>
>
>
> On Tue, Apr 29, 2014 at 2:53 PM, Michael <michael at onlinefusion.co.uk> wrote:
>
>> Hi Gerry,
>>
>> While I don't have any recommendations on the reboot process, if you're
>> using KVM I'd certainly recommend Ceph as a solution for high availability
>> storage. We've gone as far as rolling Linux distribution upgrades to our
>> storage cluster with no loss of VM connectivity.
>>
>> -Michael
>>
>>
>> On 29/04/2014 16:50, Gerry O'Brien wrote:
>>
>>> Hi,
>>>
>>>      Our OpenNebula Debian Wheezy storage server has been running for 466
>>> days and needs to be rebooted as there have been many kernel patches during
>>> this time. It exports the datastores to the hosts using NFS.
>>>
>>>      What is the best way to reboot the server in order to cause the least
>>> amount of disruption to running VMs? I've thought about suspending all
>>> running VMs, rebooting the storage server and resuming the VMs after the
>>> storage server is back. Would this work? Obviously, any VMs that have to
>>> maintain real time connections would be in trouble, e.g. iSCSI mounts, but
>>> would an ordinary machine resume successfully after the storage server
>>> reboots? The host NFS mounts are as here: /datastores
>>> nfs vers=4,bg,rw,_netdev,fsc,rsize=32768,wsize=32768,intr,noatime
>>>
>>>      In general, is there a recommended way of providing HA storage to
>>> hosts?
>>>
>>>              Regards,
>>>                  Gerry
>>>
>>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>

-- 
Gerry O'Brien

Systems Manager
School of Computer Science and Statistics
Trinity College Dublin
Dublin 2
IRELAND

00 353 1 896 1341
