[one-users] VM stuck in boot state
Karthik Mallavarapu
karthik.mallavarapu at gmail.com
Thu May 12 05:26:05 PDT 2011
Hello Javier,
The issue comes up when we try to migrate a VM onto this particular node. I
was able to deploy a VM successfully this time because I had rebooted the
problematic node earlier. But when I then tried to migrate a VM from another
node to this particular node, the VM got stuck in the BOOT state. The
following is the log of that particular VM:
2011-05-12 15:16:38.216: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
/usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 600 -smp
1,sockets=1,cores=1,threads=1 -name one-75 -uuid
71849f73-851b-53f9-0176-a26de9466338 -nographic -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-75.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=readline -rtc base=utc -boot c
-device lsi,id=scsi0,bus=pci.0,addr=0x3 -drive
file=/srv/cloud/one/var//75/images/disk.0,if=none,id=drive-scsi0-0-0,boot=on,format=qcow2
-device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0
-device rtl8139,vlan=0,id=net0,mac=02:00:c0:a8:02:0b,bus=pci.0,addr=0x2 -net
tap,fd=19,vlan=0,name=hostnet0 -usb -incoming fd:15 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
qemu: warning: error while loading state for instance 0x0 of device 'ram'
load of migration failed
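From what I have read, that 'ram' state error on an incoming migration
usually means the destination qemu/kvm could not restore the memory state
sent by the source, which can happen when the qemu or libvirt versions (or
the machine type) differ between the two nodes. I have not confirmed that
this is our case yet, but I assume comparing the versions on both nodes
with the standard tools would be something like:

  dpkg -l | grep -E 'qemu|kvm|libvirt'
  kvm -version
  virsh version

Please correct me if that is the wrong direction.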
After this migration issue I then disabled all the other nodes except the
problematic one and deployed a new VM. The VM again gets stuck in the BOOT
state, and strangely there is no log file created for the new VM at all. It
is now very clear that there is an issue with migration and that it is
causing even the other VMs to fail.
Any idea why this might happen, or what the error means?
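If it helps to narrow it down, I can also check on that node whether
libvirtd itself is still responsive and what it logs when the new VM is
defined; as far as I know the standard way would be something like:

  virsh -c qemu:///system list --all
  grep libvirtd /var/log/syslog | tail -n 50

(or /var/log/libvirt/libvirtd.log, if libvirtd is configured to log to a
file). I am not sure whether that is the right place to look, though.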
I really appreciate your time and help. Thanks a lot.
Regards,
Karthik
On Thu, May 12, 2011 at 2:32 PM, Javier Fontan <jfontan at gmail.com> wrote:
> Can you check /var/log/libvirt/qemu/one-54.log? There should be some info
> there.
>
> On Tue, May 10, 2011 at 5:26 PM, Karthik Mallavarapu
> <karthik.mallavarapu at gmail.com> wrote:
> > Hello All,
> > I have an OpenNebula installation with 1 frontend and two nodes. The base
> > OS is Ubuntu 10.10, 64-bit edition. When I try to deploy a VM using the
> > command onevm create ubuntu.template, the VM gets deployed successfully on
> > one node, but it gets stuck in BOOT state on the other node. I am enclosing
> > the log file and deployment.0 file of that VM and a transcript of the log
> > from $ONE_LOCATION/var/oned.log.
> >
> > vm.log
> > Tue May 10 18:01:54 2011 [DiM][I]: New VM state is ACTIVE.
> > Tue May 10 18:01:54 2011 [LCM][I]: New VM state is PROLOG.
> > Tue May 10 18:01:54 2011 [VM][I]: Virtual Machine has no context
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh:
> > cloud-3:/srv/cloud/images/ubuntu/ubu64.img
> > 192.168.2.5:/srv/cloud/one/var//54/images/disk.0
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: DST:
> > /srv/cloud/one/var//54/images/disk.0
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Creating directory
> > /srv/cloud/one/var//54/images
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "mkdir -p
> > /srv/cloud/one/var//54/images".
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "chmod a+w
> > /srv/cloud/one/var//54/images".
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Cloning
> > /srv/cloud/images/ubuntu/ubu64.img
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "cp -r
> > /srv/cloud/images/ubuntu/ubu64.img /srv/cloud/one/var//54/images/disk.0".
> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "chmod a+rw
> > /srv/cloud/one/var//54/images/disk.0".
> > Tue May 10 18:01:57 2011 [LCM][I]: New VM state is BOOT
> > Tue May 10 18:01:57 2011 [VMM][I]: Generating deployment file:
> > /srv/cloud/one/var/54/deployment.0
> > deployment.0
> > <domain type='kvm'>
> > <name>one-54</name>
> > <memory>614400</memory>
> > <os>
> > <type arch='x86_64'>hvm</type>
> > <boot dev='hd'/>
> > </os>
> > <devices>
> > <emulator>/usr/bin/kvm</emulator>
> > <disk type='file' device='disk'>
> > <source
> > file='/srv/cloud/one/var//54/images/disk.0'/>
> > <target dev='sda'/>
> > <driver name='qemu' type='qcow2'/>
> > </disk>
> > <interface type='bridge'>
> > <source bridge='br0'/>
> > <mac address='02:00:c0:a8:02:03'/>
> > </interface>
> > </devices>
> > <features>
> > <acpi/>
> > </features>
> > </domain>
> > /srv/cloud/one/var/oned.log
> > Tue May 10 18:01:54 2011 [DiM][D]: Deploying VM 54
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > cloud-3:/srv/cloud/images/ubuntu/ubu64.img
> > 192.168.2.5:/srv/cloud/one/var//54/images/disk.0
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > DST: /srv/cloud/one/var//54/images/disk.0
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Creating directory /srv/cloud/one/var//54/images
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Executed "mkdir -p /srv/cloud/one/var//54/images".
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Executed "chmod a+w /srv/cloud/one/var//54/images".
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Cloning /srv/cloud/images/ubuntu/ubu64.img
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Executed "cp -r /srv/cloud/images/ubuntu/ubu64.img
> > /srv/cloud/one/var//54/images/disk.0".
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh:
> > Executed "chmod a+rw /srv/cloud/one/var//54/images/disk.0".
> > Tue May 10 18:01:55 2011 [TM][D]: Message received: TRANSFER SUCCESS 54 -
> > onevm show command for that particular VM gives the following output.
> > VIRTUAL MACHINE 54 INFORMATION
> >
> > ID : 54
> > NAME : ubuntu
> > STATE : ACTIVE
> > LCM_STATE : BOOT
> > START TIME : 05/10 18:01:39
> > END TIME : -
> > DEPLOY ID: : -
> > The strange thing with the issue at hand is that we had successfully
> > deployed VMs on this particular node before. But recently we upgraded the
> > libvirt version to 0.9 on both nodes. This particular node was still
> > operational after its libvirt upgrade; deployment on it only stopped
> > working from the time we upgraded the libvirt version of the 2nd node. By
> > the way, host monitoring seems to be working, as is evident from the
> > oned.log file.
> > Could someone please throw some light on this issue? I have tried to dig
> > up the old threads, but the suggested fixes did not really work in my
> > case.
> > Thanks a lot for your time.
> > Regards,
> > Karthik
> > _______________________________________________
> > Users mailing list
> > Users at lists.opennebula.org
> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
> >
> >
>
>
>
> --
> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
> DSA Research Group: http://dsa-research.org
> Globus GridWay Metascheduler: http://www.GridWay.org
> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>