[one-users] VMs stuck in PEND state

Wed Nov 10 08:39:38 PST 2010

Hello,

After applying this patch and adding the host again, I hit a couple of different errors (such as Ruby not being installed on the node), which I was able to fix. But now I am stuck again during VM startup; the log shows the following error:

Wed Nov 10 17:22:38 2010 [LCM][I]: New VM state is BOOT
Wed Nov 10 17:22:38 2010 [VMM][I]: Generating deployment file: /var/lib/one/5/deployment.0
Wed Nov 10 17:22:38 2010 [VMM][E]: No kernel or bootloader defined and no default provided.
Wed Nov 10 17:22:38 2010 [VMM][E]: deploy_action, error generating deployment file: /var/lib/one/5/deployment.0
Wed Nov 10 17:22:38 2010 [DiM][I]: New VM state is FAILED
Wed Nov 10 17:22:38 2010 [TM][W]: Ignored: LOG - 5 tm_delete.sh: Deleting /var/lib/one//5/images
Wed Nov 10 17:22:38 2010 [TM][W]: Ignored: LOG - 5 tm_delete.sh: Executed "rm -rf /var/lib/one//5/images".
Wed Nov 10 17:22:38 2010 [TM][W]: Ignored: TRANSFER SUCCESS 5 -

The thread I found on this list about this error doesn't contain a solution, so I'm not sure how to proceed.
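
My only guess so far, going by the message itself, is that the Xen driver wants a kernel or a bootloader in the OS section of the VM template, and my ttylinux template has no OS section at all. A minimal, untested sketch of what I think would have to be added (every path and the root device below are placeholders I made up, not values from my setup):

```
# Option A: boot a Xen-capable kernel stored on the node (hypothetical paths)
OS = [
  KERNEL = "/path/to/vmlinuz-xen",
  INITRD = "/path/to/initrd-xen.img",
  ROOT   = "hda1" ]

# Option B (instead of A): let a bootloader such as pygrub read the kernel from the image
# OS = [ BOOTLOADER = "/usr/bin/pygrub" ]
```

I have not confirmed that either variant works with the ttylinux image, so please correct me if this is the wrong direction.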

Regards,

Fernando.

On 10/11/2010, at 11:49, opennebula at nerling.ch wrote:

> Yes, it is related to the bug http://dev.opennebula.org/issues/385.
> I attached a patch for /usr/lib/one/mads/one_im_ssh.rb.
> ## Save it to /tmp
> #cd /usr/lib/one/mads/
> #patch -p0 < /tmp/one_im_ssh.rb.patch
> 
> Quoting Fernando Morgenstern <fernando at consultorpc.com>:
> 
>> Hi,
>> 
>> The tmp folder in my host is empty.
>> 
>> Here is the output of the commands:
>> 
>> $ onehost show 1
>> HOST 1 INFORMATION
>> ID                    : 1
>> NAME                  : node01
>> CLUSTER               : default
>> STATE                 : ERROR
>> IM_MAD                : im_xen
>> VM_MAD                : vmm_xen
>> TM_MAD                : tm_nfs
>> 
>> HOST SHARES
>> MAX MEM               : 0
>> USED MEM (REAL)       : 0
>> USED MEM (ALLOCATED)  : 0
>> MAX CPU               : 0
>> USED CPU (REAL)       : 0
>> USED CPU (ALLOCATED)  : 0
>> RUNNING VMS           : 0
>> 
>> MONITORING INFORMATION
>> 
>> $ onehost show -x 1
>> <HOST>
>>  <ID>1</ID>
>>  <NAME>node01</NAME>
>>  <STATE>3</STATE>
>>  <IM_MAD>im_xen</IM_MAD>
>>  <VM_MAD>vmm_xen</VM_MAD>
>>  <TM_MAD>tm_nfs</TM_MAD>
>>  <LAST_MON_TIME>1289394482</LAST_MON_TIME>
>>  <CLUSTER>default</CLUSTER>
>>  <HOST_SHARE>
>>    <HID>1</HID>
>>    <DISK_USAGE>0</DISK_USAGE>
>>    <MEM_USAGE>0</MEM_USAGE>
>>    <CPU_USAGE>0</CPU_USAGE>
>>    <MAX_DISK>0</MAX_DISK>
>>    <MAX_MEM>0</MAX_MEM>
>>    <MAX_CPU>0</MAX_CPU>
>>    <FREE_DISK>0</FREE_DISK>
>>    <FREE_MEM>0</FREE_MEM>
>>    <FREE_CPU>0</FREE_CPU>
>>    <USED_DISK>0</USED_DISK>
>>    <USED_MEM>0</USED_MEM>
>>    <USED_CPU>0</USED_CPU>
>>    <RUNNING_VMS>0</RUNNING_VMS>
>>  </HOST_SHARE>
>>  <TEMPLATE/>
>> </HOST>
>> 
>> Thanks again for your answers.
>> 
>> 
>> On 10/11/2010, at 11:22, opennebula at nerling.ch wrote:
>> 
>>> Hello Fernando.
>>> Try to log in to the host and check whether there is a folder named /tmp/one.
>>> If not, it could be related to this bug: http://dev.opennebula.org/issues/385
>>> 
>>> Please post the output of:
>>> #onehost show 1
>>> #onehost show -x 1
>>> 
>>> I thought earlier that your host had an ID of 0.
>>> 
>>> Marlon Nerling
>>> 
>>> Quoting Fernando Morgenstern <fernando at consultorpc.com>:
>>> 
>>>> Hello,
>>>> 
>>>> Thanks for the answer.
>>>> 
>>>> You are right, the host is showing an error state and I hadn't checked it. How can I find out what is causing the error on the host?
>>>> 
>>>> $ onehost list
>>>> ID NAME              CLUSTER  RVM   TCPU   FCPU   ACPU    TMEM    FMEM STAT
>>>>  1 node01            default    0      0      0    100      0K      0K  err
>>>> 
>>>> $ onevm show 0
>>>> VIRTUAL MACHINE 0 INFORMATION
>>>> ID             : 0
>>>> NAME           : ttylinux
>>>> STATE          : DONE
>>>> LCM_STATE      : LCM_INIT
>>>> START TIME     : 11/09 19:06:37
>>>> END TIME       : 11/09 19:11:09
>>>> DEPLOY ID:     : -
>>>> 
>>>> VIRTUAL MACHINE MONITORING
>>>> NET_RX         : 0
>>>> USED MEMORY    : 0
>>>> USED CPU       : 0
>>>> NET_TX         : 0
>>>> 
>>>> VIRTUAL MACHINE TEMPLATE
>>>> CPU=0.1
>>>> DISK=[
>>>> DISK_ID=0,
>>>> READONLY=no,
>>>> SOURCE=/home/oneadmin/ttylinux.img,
>>>> TARGET=hda ]
>>>> FEATURES=[
>>>> ACPI=no ]
>>>> MEMORY=64
>>>> NAME=ttylinux
>>>> NIC=[
>>>> BRIDGE=br0,
>>>> IP=*****,
>>>> MAC=02:00:5d:9f:d0:68,
>>>> NETWORK=Small network,
>>>> NETWORK_ID=0 ]
>>>> VMID=0
>>>> 
>>>> $ onehost show 0
>>>> Error: [HostInfo] Error getting HOST [0].
>>>> 
>>>> Thanks!
>>>> 
>>>> On 10/11/2010, at 06:45, opennebula at nerling.ch wrote:
>>>> 
>>>>> Hello Fernando.
>>>>> Could you please post the output of:
>>>>> #onehost list
>>>>> #onevm show 0
>>>>> #onehost show 0
>>>>> 
>>>>> It seems that none of your Hosts are enabled!
>>>>> Tue Nov  9 20:31:18 2010 [HOST][D]: Discovered Hosts (enabled):
>>>>> 
>>>>> best regards
>>>>> 
>>>>> Marlon Nerling
>>>>> 
>>>>> Quoting Fernando Morgenstern <fernando at consultorpc.com>:
>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> This is the first time I'm using OpenNebula, so I installed it with the express script, which ran fine. I'm using CentOS 5.5 with Xen.
>>>>>> 
>>>>>> The first thing I'm trying to do is get the following VM running:
>>>>>> 
>>>>>> NAME   = ttylinux
>>>>>> CPU    = 0.1
>>>>>> MEMORY = 64
>>>>>> 
>>>>>> DISK   = [
>>>>>> source   = "/var/lib/one/images/ttylinux.img",
>>>>>> target   = "hda",
>>>>>> readonly = "no" ]
>>>>>> 
>>>>>> NIC    = [ NETWORK = "Small network" ]
>>>>>> 
>>>>>> FEATURES=[ acpi="no" ]
>>>>>> 
>>>>>> 
>>>>>> So I ran:
>>>>>> 
>>>>>> onevm create ttylinux.one
>>>>>> 
>>>>>> The issue is that it stays stuck in the PEND state:
>>>>>> 
>>>>>> onevm list
>>>>>> ID     USER     NAME STAT CPU     MEM        HOSTNAME        TIME
>>>>>>  3 oneadmin ttylinux pend   0      0K                 00 00:20:03
>>>>>> 
>>>>>> 
>>>>>> In the oned log, the following messages keep repeating:
>>>>>> 
>>>>>> Tue Nov  9 20:29:48 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>>>>>> Tue Nov  9 20:29:59 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>>>>>> Tue Nov  9 20:30:18 2010 [ReM][D]: HostPoolInfo method invoked
>>>>>> Tue Nov  9 20:30:18 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>>>>>> 
>>>>>> And in sched.log, I see this:
>>>>>> 
>>>>>> Tue Nov  9 20:31:18 2010 [HOST][D]: Discovered Hosts (enabled):
>>>>>> Tue Nov  9 20:31:18 2010 [VM][D]: Pending virtual machines : 3
>>>>>> Tue Nov  9 20:31:18 2010 [RANK][W]: No rank defined for VM
>>>>>> Tue Nov  9 20:31:18 2010 [SCHED][I]: Select hosts
>>>>>> 	PRI	HID
>>>>>> 	-------------------
>>>>>> Virtual Machine: 3
>>>>>> 
>>>>>> 
>>>>>> Any idea what might be happening?
>>>>>> 
>>>>>> I saw other threads on this list with similar problems, but their solutions didn't apply to my case.
>>>>>> 
>>>>>> Best Regards,
>>>>>> 
>>>>>> Fernando.
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at lists.opennebula.org
>>>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>>>> 
>>>>> 
>>>>> 
> <one_im_ssh.rb.patch>