[one-users] VMs stuck in UNKNOWN State

Ruben S. Montero rsmontero at opennebula.org
Tue Apr 2 13:31:29 PDT 2013


So the VMs are now running and correctly reported by libvirt, but
OpenNebula does not move them from UNKNOWN to RUNNING? Are the messages
in oned.log still reporting STATE=d for these VMs?
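
One way to check this is to scan oned.log for the most recent poll result
of each VM. The following is only a minimal sketch, assuming the message
layout quoted further down in this thread ("Message received: POLL SUCCESS
<vm id>", with STATE=... either on the same line or wrapped onto the next
one). The function name latest_poll_states is hypothetical and not part of
OpenNebula.

```python
import re

# Layout assumed from the oned.log extract quoted in this thread:
#   "... [VMM][D]: Message received: POLL SUCCESS 294"
#   "STATE=d"            (possibly on the same line instead)
POLL_RE = re.compile(r"POLL (\w+) (\d+)(.*)")
STATE_RE = re.compile(r"\bSTATE=(\S+)")


def latest_poll_states(lines):
    """Return {vm_id: last STATE value seen in a successful POLL message}."""
    states = {}
    pending = None  # VM id whose STATE may be wrapped onto the next line
    for line in lines:
        poll = POLL_RE.search(line)
        if poll:
            pending = None
            if poll.group(1) != "SUCCESS":
                continue  # only successful polls are considered here
            vm_id = int(poll.group(2))
            state = STATE_RE.search(poll.group(3))
            if state:
                states[vm_id] = state.group(1)
            else:
                pending = vm_id
        elif pending is not None:
            # Accept a STATE=... field only on the line right after the POLL.
            state = STATE_RE.search(line)
            if state:
                states[pending] = state.group(1)
            pending = None
    return states


if __name__ == "__main__":
    sample = [
        "Tue Mar 26 22:18:45 2013 [VMM][I]: Monitoring VM 294.",
        "Tue Mar 26 22:18:45 2013 [VMM][D]: Message received: LOG I 294 ExitCode: 0",
        "Tue Mar 26 22:18:45 2013 [VMM][D]: Message received: POLL SUCCESS 294",
        "STATE=d",
    ]
    print(latest_poll_states(sample))  # {294: 'd'}
```

To run it against a real log, pass an open file object instead of the
sample list. The location of oned.log depends on the installation. Any VM
still listed with STATE=d after libvirt has been restarted is one the
monitoring probe is not finding on the host.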

Ruben


On Tue, Apr 2, 2013 at 3:57 PM, Duverne, Cyrille <
cyrille.duverne at euranova.eu> wrote:

> Hello,
>
> Anything new on this?
>
> Seems really weird to me...
>
> Thanks in advance
> Cyrille
>
>
>
>
> At Friday, 29/03/2013 on 10:06 Duverne, Cyrille wrote:
>
> Hello Ruben !
>
> Thanks for this feedback.
>
> I tried restarting libvirt, which succeeded (WOW! :p)
>
>
> But the VMs are still stuck in the UNKNOWN state.
>
> 'virsh list' correctly shows the domains, which are running:
>
> virsh list
>  Id Name                 State
> ----------------------------------
>   1 one-294              running
>   2 one-304              running
>
> Any other thoughts? I'm a bit confused by this behaviour and by the VM
> monitoring workflow. It would be useful to have a 'refresh monitoring'
> button or similar in Sunstone to fetch fresh monitoring information.
>
> Thanks in advance
> Cyrille
>
> "Always do right. This will gratify some people and astonish the rest."
> Mark Twain
>
>
>
> At Thursday, 28/03/2013 on 0:56 Ruben S. Montero wrote:
>
> Ok
>
> So this is strange...
>
> On one hand, you try to restart the VM and virsh says it is already defined
> (vm.log: domain 'one-294' already exists). On the other hand, when you
> monitor the VM, virsh list does not show it (oned.log: POLL SUCCESS 294
> STATE=d).
>
> Is the domain really defined at the host (virsh list)? Could this be a
> libvirt issue? Any chance you could restart libvirt and try again?
>
>
> Cheers
>
> Ruben
>
>
>
> On Tue, Mar 26, 2013 at 10:37 PM, Duverne, Cyrille <
> cyrille.duverne at euranova.eu> wrote:
>
>> Hello Ruben,
>>
>> Indeed, this happens for some of them, but some others are still
>> in the UNKNOWN state.
>> Here is an extract of the VM log:
>>
>> "Thu Mar 21 11:55:56 2013 [LCM][I]: New VM state is SAVE_SUSPEND
>>
>> Thu Mar 21 11:57:49 2013 [VMM][I]: ExitCode: 0
>> Thu Mar 21 11:57:49 2013 [VMM][I]: Successfully execute virtualization driver operation: save.
>> Thu Mar 21 11:57:50 2013 [VMM][I]: ExitCode: 0
>> Thu Mar 21 11:57:50 2013 [VMM][I]: Successfully execute network driver operation: clean.
>> Thu Mar 21 11:57:50 2013 [DiM][I]: New VM state is SUSPENDED
>> Tue Mar 26 17:27:48 2013 [DiM][I]: New VM state is ACTIVE.
>> Tue Mar 26 17:27:48 2013 [LCM][I]: Restoring VM
>> Tue Mar 26 17:27:48 2013 [LCM][I]: New state is BOOT_SUSPENDED
>> Tue Mar 26 17:27:49 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:27:49 2013 [VMM][I]: Successfully execute network driver operation: pre.
>> Tue Mar 26 17:28:37 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:28:37 2013 [VMM][I]: Successfully execute virtualization driver operation: restore.
>> Tue Mar 26 17:28:37 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:28:37 2013 [VMM][I]: Successfully execute network driver operation: post.
>> Tue Mar 26 17:28:38 2013 [LCM][I]: New VM state is RUNNING
>> Tue Mar 26 17:28:38 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:28:39 2013 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
>> Tue Mar 26 17:28:39 2013 [LCM][I]: New VM state is UNKNOWN
>> Tue Mar 26 17:36:48 2013 [LCM][I]: New VM state is BOOT_UNKNOWN
>> Tue Mar 26 17:36:48 2013 [VMM][I]: Generating deployment file: /var/lib/one/294/deployment.1
>> Tue Mar 26 17:36:52 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:36:52 2013 [VMM][I]: Successfully execute network driver operation: pre.
>> Tue Mar 26 17:36:52 2013 [VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy /var/lib/one/datastores/0/294/deployment.1 whitefall.local 294 whitefall.local
>> Tue Mar 26 17:36:52 2013 [VMM][I]: error: Failed to create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:36:52 2013 [VMM][I]: error: operation failed: domain 'one-294' already exists with uuid 326bc42b-1f8a-8984-e610-4c35f0bdd56f
>> Tue Mar 26 17:36:52 2013 [VMM][E]: Could not create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:36:52 2013 [VMM][I]: ExitCode: 255
>> Tue Mar 26 17:36:52 2013 [VMM][I]: Failed to execute virtualization driver operation: deploy.
>> Tue Mar 26 17:36:52 2013 [VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:36:52 2013 [LCM][I]: Fail to boot VM. New VM state is UNKNOWN
>> Tue Mar 26 17:37:21 2013 [LCM][I]: New VM state is BOOT_UNKNOWN
>> Tue Mar 26 17:37:21 2013 [VMM][I]: Generating deployment file: /var/lib/one/294/deployment.1
>> Tue Mar 26 17:37:22 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:37:22 2013 [VMM][I]: Successfully execute network driver operation: pre.
>> Tue Mar 26 17:37:22 2013 [VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy /var/lib/one/datastores/0/294/deployment.1 whitefall.local 294 whitefall.local
>> Tue Mar 26 17:37:22 2013 [VMM][I]: error: Failed to create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:37:22 2013 [VMM][I]: error: operation failed: domain 'one-294' already exists with uuid 326bc42b-1f8a-8984-e610-4c35f0bdd56f
>> Tue Mar 26 17:37:22 2013 [VMM][E]: Could not create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:37:22 2013 [VMM][I]: ExitCode: 255
>> Tue Mar 26 17:37:22 2013 [VMM][I]: Failed to execute virtualization driver operation: deploy.
>> Tue Mar 26 17:37:22 2013 [VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one/datastores/0/294/deployment.1
>> Tue Mar 26 17:37:23 2013 [LCM][I]: Fail to boot VM. New VM state is UNKNOWN
>> Tue Mar 26 17:38:39 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:38:41 2013 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
>> Tue Mar 26 17:48:45 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:48:45 2013 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
>> Tue Mar 26 17:58:45 2013 [VMM][I]: ExitCode: 0
>> Tue Mar 26 17:58:45 2013 [VMM][I]: VM running but it was not found. Restart and delete actions available or try to recover it manually
>>
>> Tue Mar 26 18:08:45 2013 [VMM][I]: ExitCode: 0"
>>
>> The RESTART didn't do anything.
>>
>> Here is the oned.log extract for the same VM:
>>
>> "Tue Mar 26 22:18:45 2013 [VMM][I]: Monitoring VM 294.
>> Tue Mar 26 22:18:45 2013 [VMM][D]: Message received: LOG I 294 ExitCode: 0
>> Tue Mar 26 22:18:45 2013 [VMM][D]: Message received: POLL SUCCESS 294
>> STATE=d"
>>
>> The VMs that are in the UNKNOWN state are located on 2 different hosts.
>> All hosts are configured in the same way.
>>
>> Thanks in advance
>> Cyrille
>>
>>
>> At Tuesday, 26/03/2013 on 18:53 Ruben S. Montero wrote:
>>
>> They should appear after a while, when the VM is monitored... Look for
>> "Monitoring VM..." messages in oned.log.
>>
>> Cheers
>>
>> Ruben
>>
>>
>> On Tue, Mar 26, 2013 at 5:39 PM, Duverne, Cyrille <
>> cyrille.duverne at euranova.eu> wrote:
>>
>>> Hello,
>>>
>>> I just finished rebooting our lab after an electrical shutdown, and
>>> everything went fine.
>>>
>>> But some of the VMs are stuck in the UNKNOWN state after resuming them.
>>> I tried to restart them, but they are actually running on the
>>> hypervisors; it's just that Sunstone is displaying UNKNOWN.
>>>
>>> Any thoughts on how to solve this?
>>>
>>> Thanks in advance
>>> Cyrille
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Ruben S. Montero, PhD
>> Project co-Lead and Chief Architect
>> OpenNebula - The Open Source Solution for Data Center Virtualization
>> www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula
>>
>>
>
>
>
>


-- 
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula