[one-users] Unknown state

Rich Wellner rkw at objenv.com
Wed Oct 27 08:06:14 PDT 2010


Yeah, even better.  I like this idea if there has to be a timeout.

Though the more I think about it, the less I'm sure I understand why the 
timeout needs to exist, or why the state reverts to Running instead of 
Unknown once it triggers.  Seems like maybe the state model needs 
another node, "Shutdown Failed" or something, for when the guest fails to 
disappear (for example, if acpid isn't installed).  Otherwise an 
administrator looking at 'onevm list' doesn't get a complete picture of 
how the current state differs from the desired state.
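
Roughly what I have in mind, as a sketch in Python rather than the actual 
driver script.  Every name here (VmState, SHUTDOWN_FAILED, wait_for_shutdown, 
is_running) is made up for illustration and none of it is OpenNebula code; 
the 20 x 2 second default is just the one quoted further down the thread.

```python
import time
from enum import Enum
from typing import Callable


class VmState(Enum):
    RUNNING = "running"
    SHUTDOWN = "shutdown"
    DONE = "done"
    SHUTDOWN_FAILED = "shutdown_failed"  # hypothetical extra node


def wait_for_shutdown(
    is_running: Callable[[], bool],      # hypothetical probe of the hypervisor
    timeout_s: float = 40.0,             # 20 polls x 2 s, the quoted default
    poll_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> VmState:
    """Poll until the guest is gone or the timeout is used up."""
    for _ in range(int(timeout_s // poll_s)):
        if not is_running():
            return VmState.DONE
        sleep(poll_s)
    # One last look after the final sleep; if the guest is still there,
    # report the mismatch explicitly instead of reverting to RUNNING.
    return VmState.SHUTDOWN_FAILED if is_running() else VmState.DONE
```

If the timeout came from the VM template, as suggested below, a slow image 
could be handled with something like wait_for_shutdown(probe, timeout_s=300), 
and 'onevm list' would have a distinct state to show when the wait runs out.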

So, what was the use case for having the timeout in the first place?

rw2

On 10/27/10 9:50 AM, Igor Rosenberg wrote:
> Humble opinion: shutdown time is VM-specific. I may have, running concurrently, one image with a shutdown time of ~10s (a tiny Linux) and another with a shutdown time of ~5 minutes or more (a fat J2EE container with remote DB dependencies).
>
> The shutdown time would typically be known by the person who originally created the VM image. So can this information be embedded in the image itself? If not, the VM template would be another likely place. But having a single maximum value for all possible images may create problems as VMs grow in embedded service complexity.
>
> -----Original Message-----
> From: users-bounces at lists.opennebula.org [mailto:users-bounces at lists.opennebula.org] On Behalf Of Rich Wellner
> Sent: Wednesday, October 27, 2010 16:19
> To: Tino Vazquez
> Cc: users at lists.opennebula.org
> Subject: Re: [one-users] Unknown state
>
> Ok, I'll check that out.  FYI: my RHEL 5.5 machines on reasonably
> capable hardware take longer than that default.  Might be worth
> considering a longer default.
>
> rw2
>
> On 10/27/10 8:33 AM, Tino Vazquez wrote:
>> Hi Rich,
>>
>> OpenNebula ceases its monitoring when the VM enters the shutdown
>> state. What is probably happening is that the VM takes more time to
>> shut down than the default timeout, which is 40 seconds (20 iterations
>> of a 2-second sleep), so to OpenNebula it looks as if the shutdown
>> failed. This default timeout can be adjusted in
>> $ONE_LOCATION/bin/remotes/vmm/kvm/shutdown.
>>
>> Best regards,
>>
>> -Tino
>>
>> --
>> Constantino Vázquez Blanco | dsa-research.org/tinova
>> Virtualization Technology Engineer / Researcher
>> OpenNebula Toolkit | opennebula.org
>>
>>
>>
>> On Wed, Oct 27, 2010 at 1:08 AM, Rich Wellner <rkw at objenv.com> wrote:
>>> Hey guys,
>>>
>>> I have monitoring turned down to a minute so that I don't have much latency
>>> on my management while we're doing testing.  As a result, when I do a
>>> shutdown on a VM, sometimes the shutdown isn't complete before the next
>>> monitoring update.  What ends up happening is that the state of the machine
>>> goes from running to shutdown, then a bit later to running again.  Finally,
>>> when the guest shutdown actually completes, the state goes to unknown because
>>> One doesn't know why the machine disappeared.
>>>
>>> It would be better if this race condition were handled more elegantly and
>>> One could tolerate that the machine took a while to shut down.  As it is, a
>>> manual clean-up has to happen.  I have also confirmed that my one-minute
>>> monitor cycle only makes the problem more likely.  If, by coincidence,
>>> someone asks One to shut down a VM slightly before the monitor thread kicks
>>> off, this issue shows up.  So it seems any machine that is shut down where
>>> timeToShutdown > timeUntilMonitorRefresh will end up in an unknown state.
>>>
>>> rw2
>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.opennebula.org
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>>
