[one-users] Shutting down a VM from within the VM

Simon Boulet simon at nostalgeek.com
Tue Oct 29 10:04:38 PDT 2013


On Tue, Oct 29, 2013 at 12:26 PM, Carlos Martín Sánchez
<cmartin at opennebula.org> wrote:
> Hi,
>
> On Tue, Oct 29, 2013 at 4:43 PM, Simon Boulet <simon at nostalgeek.com> wrote:
>>
>> The libvirt "paused" method I
>> suggested is a hack that works with OpenNebula and turns VMs that
>> are internally shut down to "SUSPENDED" in OpenNebula.
>
>
> Rubén could not retrieve that 'paused' state from libvirt; no matter how
> the VM was destroyed, he always got 'stopped'. Are we missing something?

It depends on the libvirt backend you're using and how it detects the
state change. The paused state in libvirt is supposed to be reported
when the VM is paused (and its state, memory, etc. are preserved so it
can be resumed later). You need to trick the hypervisor into thinking
the VM has been paused when the shutdown is initiated from inside the
VM. It's a hack; it won't work out of the box with the stock libvirt
backends.
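
For illustration, this is roughly what a monitoring probe would observe
through the libvirt Python bindings once such a hack is in place. The
hack itself has to live in the hypervisor / libvirt backend; the domain
name "one-42" here is hypothetical:

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("one-42")  # hypothetical OpenNebula VM name

    state, _reason = dom.state()  # dom.state() returns (state, reason)
    if state == libvirt.VIR_DOMAIN_PAUSED:
        # With the hack, an in-VM shutdown ends up here, and
        # OpenNebula's monitoring maps "paused" to SUSPENDED
        pass
    elif state == libvirt.VIR_DOMAIN_SHUTOFF:
        # Without it, the stock drivers simply see the VM as gone
        pass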

>
>> One comment though, perhaps the extra attribute in the VM template
>> could be managed outside the core, and have this managed by a hook.
>> Ex. if someone wanted to have the Amazon
>> "instance-initiated-shutdown-behavior":
>>
>>
>>
>> - Set the oned default when a VM disappears to POWEROFF.
>> - Have a state change hook that picks up the POWEROFF state change and
>> parses the VM template to see if an INITIATED_SHUTDOWN_BEHAVIOR user
>> attribute is set. If so, parse the attribute; if it's set to e.g.
>> TERMINATE, cancel / delete the VM.
>
>
> I don't see any advantage to this, honestly.


Generally I think the Core should be more lightweight and make better
use of external drivers, hooks, etc., limiting the Core to state
changes, consistency, scheduling events, and so on. Spreading out the
workflow / drivers as much as possible makes it much easier to
customize OpenNebula to each environment. Keeping the Core lightweight
also makes it a lot easier to maintain and optimize. That's why I'm
generally in favour of implementing as much as we can outside the
Core, when it's possible.
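
To make the hook idea above concrete, here is a minimal sketch in
Python. It assumes oned.conf registers it on the POWEROFF state change
with "$ID $TEMPLATE" as arguments ($TEMPLATE being the base64-encoded
VM XML), and INITIATED_SHUTDOWN_BEHAVIOR is the hypothetical user
attribute from my example:

    import base64
    import subprocess
    import sys
    import xml.etree.ElementTree as ET

    vm_id = sys.argv[1]
    vm_xml = base64.b64decode(sys.argv[2])  # $TEMPLATE, base64-encoded
    root = ET.fromstring(vm_xml)

    # Look up the user-supplied attribute in the VM's user template
    behavior = root.findtext("USER_TEMPLATE/INITIATED_SHUTDOWN_BEHAVIOR")

    if behavior and behavior.upper() == "TERMINATE":
        # Mirror Amazon's terminate behaviour: remove the VM entirely
        subprocess.check_call(["onevm", "delete", vm_id])
    # Otherwise leave the VM in POWEROFF so it can be resumed later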

> If you set the default
> behaviour to DONE, you can't undo that with a hook and set the VM back to
> poweroff...


Yes, of course, it wouldn't work with a default of DONE, because once
the VM has entered the DONE state it can't be recovered. But it would
work for other defaults; for example, a VM in the POWEROFF state can
be resumed (although a VM in POWEROFF can't be cancelled, it can only
be deleted...).


> Plus I think it's much safer to do it in the core. For example, when a Host
> returns a monitor failure, all the VMs are set to UNKNOWN. But this doesn't
> mean that the VM disappeared from the hypervisor, just that the VM could not
> be monitored.
>


Oh, yes, I get your point. The Core uses "disappear" for setting the
VM to UNKNOWN. I think we need to keep "disappear" as it is, or at
least keep the current UNKNOWN behaviour. If the VM can't be monitored
for some reason (the host is down, network issues, timeout, etc.), it
enters the UNKNOWN state, and the Core keeps monitoring it every
interval until it is reported as RUNNING (or STOPPED, or whatever
other state change).

What we need is a way to let the Core know that the VM was
"successfully" monitored, but that the hypervisor reported the VM is
not running.

Have you investigated libvirt's "defined" VM list? Libvirt maintains
two different lists of VMs: the "active" VMs and the "defined" VMs.
I'm thinking a VM that is NOT active but that is defined is a VM that
was shut down... If OpenNebula finds a VM that is "defined" but
inactive, and it expected the VM to be active, then it knows the VM
was unexpectedly shut down (by the user from inside the VM, or by some
admin accessing the hypervisor directly, not through OpenNebula).
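
A quick sketch of that check with the libvirt Python bindings,
assuming OpenNebula's usual "one-<vmid>" domain naming:

    import libvirt

    conn = libvirt.open("qemu:///system")

    # listDefinedDomains() returns the names of domains that have a
    # persistent definition but are NOT currently active
    for name in conn.listDefinedDomains():
        if name.startswith("one-"):  # OpenNebula's "one-<vmid>" naming
            # If OpenNebula expected this VM to be RUNNING, it was shut
            # down behind its back (from inside the VM, or by an admin
            # on the hypervisor), not through OpenNebula
            print("defined but inactive: %s" % name)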

One thing to keep in mind as well when implementing this: when a Host
is rebooted, it may take some time for the hypervisor to restart all
the VMs. During that time libvirt may report a VM as "defined" but not
"active". I am not sure if that's an issue or not; perhaps it depends
on your hypervisor and the order in which services are started at
boot (are the VMs restarted before libvirtd is started, etc.).

Simon


