[one-users] VMs stuck in UNKNOWN State

Duverne, Cyrille cyrille.duverne at euranova.eu
Wed Apr 3 07:18:34 PDT 2013


Ok ok, that's indeed fun :

ruby -wd /var/tmp/one/vmm/kvm/poll one-294
STATE=a NETTX=19039830 USEDCPU=0.1 USEDMEMORY=1121828 NETRX=416126660

So the polling itself seems to work correctly.
Could the old state still be cached, or stored in the DB without
being updated?
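
For anyone comparing the probe output with what oned logs: the line printed by the poll script is a space-separated list of KEY=VALUE pairs. A minimal sketch of splitting it follows. The function name `parse_poll` is hypothetical, and the state letters listed in the comment are my reading of the OpenNebula driver documentation, not something confirmed in this thread.

```python
def parse_poll(line):
    """Split a poll line such as 'STATE=a NETTX=1 ...' into a dict of strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            fields[key] = value
    return fields


# Output copied from the probe run above.
sample = "STATE=a NETTX=19039830 USEDCPU=0.1 USEDMEMORY=1121828 NETRX=416126660"
info = parse_poll(sample)

# Assumed meaning of the letters: a = active, p = paused, e = error,
# d = deleted / not found on the hypervisor.
print(info["STATE"])
```

Here the probe run by hand reports `a`, while oned.log was still showing `d` for the same VM.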

Cheers
Cyrille

On Wednesday, 03/04/2013 at 15:15 Ruben S. Montero wrote:

Could you execute the vmm probe on the host

/var/tmp/one/vmm/kvm/poll one-294

and check for errors, or try to debug the script (maybe by running it
with ruby -wd)?

Ruben

On Wed, Apr 3, 2013 at 10:42 AM, Duverne, Cyrille  wrote:

Hello,

Indeed, the state is still "d", as you can see here:

Wed Apr  3 10:34:13 2013 [VMM][I]: Monitoring VM 294.
Wed Apr  3 10:34:13 2013 [VMM][D]: Message received: LOG I 294 ExitCode: 0
Wed Apr  3 10:34:13 2013 [VMM][D]: Message received: POLL SUCCESS 294 STATE=d

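To check what oned last recorded for each VM, the POLL SUCCESS messages can be filtered out of the log. This is a rough sketch that assumes every such message sits on one line in the format quoted above; `last_poll_states` and `POLL_RE` are hypothetical names.

```python
import re

# Matches e.g. "... Message received: POLL SUCCESS 294 STATE=d"
POLL_RE = re.compile(r"POLL SUCCESS\s+(\d+)\s.*?\bSTATE=(\S+)")


def last_poll_states(lines):
    """Return {vm_id: state}, keeping the most recent state seen per VM."""
    states = {}
    for line in lines:
        match = POLL_RE.search(line)
        if match:
            states[int(match.group(1))] = match.group(2)
    return states


sample_log = [
    "Wed Apr  3 10:34:13 2013 [VMM][I]: Monitoring VM 294.",
    "Wed Apr  3 10:34:13 2013 [VMM][D]: Message received: LOG I 294 ExitCode: 0",
    "Wed Apr  3 10:34:13 2013 [VMM][D]: Message received: POLL SUCCESS 294 STATE=d",
]
print(last_poll_states(sample_log))
```

To run it against a real log, pass an open file object instead of the list. The path to oned.log depends on the installation, so it is left out here.
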
Any thoughts?
To be thorough, I verified that the users etc. are still correct
on all machines, and that oneadmin is able to ssh directly, etc.

Thanks in advance
Cyrille

On Tuesday, 02/04/2013 at 22:31 Ruben S. Montero wrote:

 So the VMs are now running, and correctly reported by libvirt, but
OpenNebula does not move them from UNKNOWN to RUNNING? Are the
messages in oned.log still reporting STATE=d for these VMs?

 Ruben

On Tue, Apr 2, 2013 at 3:57 PM, Duverne, Cyrille  wrote:

Hello,

Anything new on this ?

Seems really weird to me....

Thanks in advance
Cyrille 

On Friday, 29/03/2013 at 10:06 Duverne, Cyrille wrote:

Hello Ruben !

Thanks for this feedback.

I tried to restart libvirt, which succeeded (WOW ! :p)

But the VMs are still stuck in UNKNOWN state.

'virsh list' correctly shows the domains, which are running:

virsh list
 Id Name                 State
----------------------------------
  1 one-294              running
  2 one-304              running
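
To compare this listing with the monitoring state by script rather than by eye, the table can be parsed into a name-to-state mapping. This is a sketch only: it assumes the three-column layout shown above, and `parse_virsh_list` is a hypothetical helper, not part of libvirt or OpenNebula.

```python
def parse_virsh_list(text):
    """Map domain name -> state from the text output of 'virsh list'."""
    domains = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Data rows start with a numeric Id ('-' is accepted too, on the
        # assumption that inactive domains are listed that way).
        if len(parts) == 3 and (parts[0].isdigit() or parts[0] == "-"):
            domains[parts[1]] = parts[2].strip()
    return domains


sample = """\
 Id Name                 State
----------------------------------
  1 one-294              running
  2 one-304              running
"""
print(parse_virsh_list(sample))
```

A domain that is "running" here while OpenNebula still shows the VM as UNKNOWN is exactly the mismatch described in this thread.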

Any other thoughts? I'm a bit confused by this behaviour and by the
workflow used to monitor the VMs. It would be interesting to have a
'refresh monitoring' button or similar in Sunstone to fetch fresh
monitoring information.

Thanks in advance
Cyrille

 "Always do right. This will gratify some people and astonish the
rest."
Mark Twain

On Thursday, 28/03/2013 at 0:56 Ruben S. Montero wrote:

Ok 

So this is strange... 

On one hand, you try to restart the VM and virsh says it is already
defined (vm.log: main 'one-294' already exists). On the other hand,
when you monitor the VM, virsh list does not show it (oned.log:
POLL SUCCESS 294 STATE=d).

Is the domain really defined on the host (virsh list)? Could this be a
libvirt issue? Any chance you could restart libvirt and try again?

Cheers

Ruben

On Tue, Mar 26, 2013 at 10:37 PM, Duverne, Cyrille  wrote:

Hello Ruben,

Indeed, this happens for some of them, but some others are
still in UNKNOWN state.
Here is an extract of the VM log:

Thu Mar 21 11:55:56 2013 [LCM][I]: New VM state is SAVE_SUSPEND
Thu Mar 21 11:57:49 2013 [VMM][I]: ExitCode: 0
Thu Mar 21 11:57:49 2013 [VMM][I]: Successfully execute virtualization driver operation: save.
Thu Mar 21 11:57:50 2013 [VMM][I]: ExitCode: 0
Thu Mar 21 11:57:50 2013 [VMM][I]: Successfully execute network driver operation: clean.
Thu Mar 21 11:57:50 2013 [DiM][I]: New VM state is SUSPENDED
Tue Mar 26 17:27:48 2013 [DiM][I]: New VM state is ACTIVE.
Tue Mar 26 17:27:48 2013 [LCM][I]: Restoring VM
Tue Mar 26 17:27:48 2013 [LCM][I]: New state is BOOT_SUSPENDED
Tue Mar 26 17:27:49 2013 [VMM][I]: ExitCode: 0
Tue Mar 26 17:27:49 2013 [VMM][I]: Successfully execute network driver operation: pre.
Tue Mar 26 17:28:37 2013 [VMM][I]: ExitCode: 0
Tue Mar 26 17:28:37 2013 [VMM][I]: Successfully execute virtualization driver operation: restore.
Tue Mar 26 17:28:37 2013 [VMM][I]: ExitCode: 0
Tue Mar 26 17:28:37 2013 [VMM][I]: Successfully execute network driver operation: post.
Tue Mar 26 17:28:38 2013 [LCM][I]: New VM state is RUNNING
Tue Mar 26 17:28:38 2013 [VMM][I]: ExitCode: 0
Tue Mar 26 17:28:39 2013 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
Tue Mar 26 17:28:39 2013 [LCM][I]: New VM state is UNKNOWN
Tue Mar 26 17:36:48 2013 [LCM][I]: New VM state is BOOT_UNKNOWN
Tue Mar 26 17:36:48 2013 [VMM][I]: Generating deployment file: /var/lib/one/294/deployment.1
Tue Mar 26 17:36:52 2013 [VMM][I]: ExitCode: 0
Tue Mar 26 17:36:52 2013 [VMM][I]: Successfully execute network driver operation: pre.
Tue Mar 26 17:36:52 2013 [VMM][I]: Command execution fail: cat
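
The sequence of state changes in this extract is easier to follow once the "New VM state is ..." and "New state is ..." messages are pulled out on their own. The sketch below assumes one log entry per line; `state_transitions` and `STATE_RE` are hypothetical names.

```python
import re

# Both wordings occur in the extract: "New VM state is X" and "New state is X".
STATE_RE = re.compile(r"New (?:VM )?state is (\w+)")


def state_transitions(lines):
    """Return the states named in the log, in the order they were entered."""
    found = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            found.append(match.group(1))
    return found


sample_log = [
    "Tue Mar 26 17:27:48 2013 [DiM][I]: New VM state is ACTIVE.",
    "Tue Mar 26 17:27:48 2013 [LCM][I]: New state is BOOT_SUSPENDED",
    "Tue Mar 26 17:28:37 2013 [VMM][I]: ExitCode: 0",
    "Tue Mar 26 17:28:38 2013 [LCM][I]: New VM state is RUNNING",
    "Tue Mar 26 17:28:39 2013 [LCM][I]: New VM state is UNKNOWN",
    "Tue Mar 26 17:36:48 2013 [LCM][I]: New VM state is BOOT_UNKNOWN",
]
print(state_transitions(sample_log))
```

In these sample lines, the VM reaches RUNNING at 17:28:38 and drops to UNKNOWN one second later.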
