Hi Javier,

There is no kvm version mismatch. I had to downgrade the libvirt installation from 0.9 to 0.8 to fix the migration issue.
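In case it is useful to anyone else who hits this, a downgrade like that can be done roughly as below, assuming the stock Ubuntu package is the 0.8 target; the exact version string is only a placeholder, check what apt-cache policy reports on the node:

# see which libvirt-bin versions apt knows about on this node
apt-cache policy libvirt-bin
# reinstall the older 0.8.x build (version string below is a placeholder, not the exact one)
sudo apt-get install libvirt-bin=0.8.3-1ubuntu19
sudo service libvirt-bin restart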
Thanks a lot.

Regards,
Karthik

On Wed, May 18, 2011 at 6:53 PM, Javier Fontan <jfontan@gmail.com> wrote:
Could this be the problem?

https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/244467

Also, is kvm the same version on both hosts? A kvm version
mismatch could cause that kind of problem.
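
Something quick along these lines (node1/node2 are placeholders for your two hosts) would show whether the versions match:

for h in node1 node2; do
    echo "== $h =="
    ssh "$h" 'kvm --version; libvirtd --version'
done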

On Thu, May 12, 2011 at 2:26 PM, Karthik Mallavarapu <karthik.mallavarapu@gmail.com> wrote:
> Hello Javier,
>
> The issue appears when we try to migrate a VM onto this particular node. I
> can deploy a VM on it successfully now, because I had rebooted the
> problematic node earlier. But when I then tried to migrate a VM from another
> node to this node, the VM got stuck in the BOOT state. The following is the
> log of that VM:
>
> 2011-05-12 15:16:38.216: starting up
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
> /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 600 -smp
> 1,sockets=1,cores=1,threads=1 -name one-75 -uuid
> 71849f73-851b-53f9-0176-a26de9466338 -nographic -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-75.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=readline -rtc base=utc -boot c
> -device lsi,id=scsi0,bus=pci.0,addr=0x3 -drive
> file=/srv/cloud/one/var//75/images/disk.0,if=none,id=drive-scsi0-0-0,boot=on,format=qcow2
> -device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0
> -device rtl8139,vlan=0,id=net0,mac=02:00:c0:a8:02:0b,bus=pci.0,addr=0x2 -net
> tap,fd=19,vlan=0,name=hostnet0 -usb -incoming fd:15 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
>
> qemu: warning: error while loading state for instance 0x0 of device 'ram'
> load of migration failed
>
> After this migration issue, I disabled all the other nodes except the
> problematic one and deployed a new VM. That VM again gets stuck in the BOOT
> state and, strangely, no log file is created for it. It is now very clear
> that there is an issue with migration and that it is causing even the other
> VMs to fail.
>
> Any idea why this might happen or what the error means?
>
> I really appreciate your time and help. Thanks a lot.
>
> Regards,
> Karthik
>
> On Thu, May 12, 2011 at 2:32 PM, Javier Fontan <jfontan@gmail.com> wrote:
>>
>> Can you check /var/log/libvirt/qemu/one-54.log? There should be some info there.
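>>
>> For example, something like this run on the node would show the relevant
>> part (the path assumes the default libvirt qemu log location):
>>
>> tail -n 50 /var/log/libvirt/qemu/one-54.log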
>>
>> On Tue, May 10, 2011 at 5:26 PM, Karthik Mallavarapu <karthik.mallavarapu@gmail.com> wrote:
>> > Hello All,
>> >
>> > I have an OpenNebula installation with one frontend and two nodes. The
>> > base OS is Ubuntu 10.10, 64-bit edition. When I deploy a VM with the
>> > command "onevm create ubuntu.template", the VM gets deployed successfully
>> > on one node but gets stuck in the BOOT state on the other node. I am
>> > enclosing the log file and the deployment.0 file of that VM, and a
>> > transcript of the log from $ONE_LOCATION/var/oned.log.
>> >
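>> > For reference, ubuntu.template is roughly along these lines; this is
>> > reconstructed from the deployment file below, so the exact values and the
>> > network name are illustrative rather than a verbatim copy:
>> >
>> > NAME   = ubuntu
>> > MEMORY = 600
>> > CPU    = 1
>> > OS     = [ ARCH = "x86_64", BOOT = "hd" ]
>> > DISK   = [ SOURCE = "/srv/cloud/images/ubuntu/ubu64.img",
>> >            TARGET = "sda",
>> >            DRIVER = "qcow2",
>> >            CLONE  = "yes" ]
>> > NIC    = [ NETWORK = "small_net" ]   # hypothetical network name
>> > FEATURES = [ ACPI = "yes" ]
>> >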
>> > vm.log:
>> >
>> > Tue May 10 18:01:54 2011 [DiM][I]: New VM state is ACTIVE.
>> > Tue May 10 18:01:54 2011 [LCM][I]: New VM state is PROLOG.
>> > Tue May 10 18:01:54 2011 [VM][I]: Virtual Machine has no context
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: cloud-3:/srv/cloud/images/ubuntu/ubu64.img 192.168.2.5:/srv/cloud/one/var//54/images/disk.0
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: DST: /srv/cloud/one/var//54/images/disk.0
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Creating directory /srv/cloud/one/var//54/images
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "mkdir -p /srv/cloud/one/var//54/images".
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "chmod a+w /srv/cloud/one/var//54/images".
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Cloning /srv/cloud/images/ubuntu/ubu64.img
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "cp -r /srv/cloud/images/ubuntu/ubu64.img /srv/cloud/one/var//54/images/disk.0".
>> > Tue May 10 18:01:55 2011 [TM][I]: tm_clone.sh: Executed "chmod a+rw /srv/cloud/one/var//54/images/disk.0".
>> > Tue May 10 18:01:57 2011 [LCM][I]: New VM state is BOOT
>> > Tue May 10 18:01:57 2011 [VMM][I]: Generating deployment file: /srv/cloud/one/var/54/deployment.0
>> >
>> > deployment.0:
>> >
>> > <domain type='kvm'>
>> >     <name>one-54</name>
>> >     <memory>614400</memory>
>> >     <os>
>> >         <type arch='x86_64'>hvm</type>
>> >         <boot dev='hd'/>
>> >     </os>
>> >     <devices>
>> >         <emulator>/usr/bin/kvm</emulator>
>> >         <disk type='file' device='disk'>
>> >             <source file='/srv/cloud/one/var//54/images/disk.0'/>
>> >             <target dev='sda'/>
>> >             <driver name='qemu' type='qcow2'/>
>> >         </disk>
>> >         <interface type='bridge'>
>> >             <source bridge='br0'/>
>> >             <mac address='02:00:c0:a8:02:03'/>
>> >         </interface>
>> >     </devices>
>> >     <features>
>> >         <acpi/>
>> >     </features>
>> > </domain>
>> >
>> > /srv/cloud/one/var/oned.log:
>> >
>> > Tue May 10 18:01:54 2011 [DiM][D]: Deploying VM 54
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: cloud-3:/srv/cloud/images/ubuntu/ubu64.img 192.168.2.5:/srv/cloud/one/var//54/images/disk.0
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: DST: /srv/cloud/one/var//54/images/disk.0
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Creating directory /srv/cloud/one/var//54/images
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Executed "mkdir -p /srv/cloud/one/var//54/images".
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Executed "chmod a+w /srv/cloud/one/var//54/images".
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Cloning /srv/cloud/images/ubuntu/ubu64.img
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Executed "cp -r /srv/cloud/images/ubuntu/ubu64.img /srv/cloud/one/var//54/images/disk.0".
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: LOG - 54 tm_clone.sh: Executed "chmod a+rw /srv/cloud/one/var//54/images/disk.0".
>> > Tue May 10 18:01:55 2011 [TM][D]: Message received: TRANSFER SUCCESS 54 -
>> >
>> > The onevm show command for that particular VM gives the following output:
>> >
>> > VIRTUAL MACHINE 54 INFORMATION
>> > ID         : 54
>> > NAME       : ubuntu
>> > STATE      : ACTIVE
>> > LCM_STATE  : BOOT
>> > START TIME : 05/10 18:01:39
>> > END TIME   : -
>> > DEPLOY ID  : -
>> >
>> > The strange thing about this issue is that we had successfully deployed
>> > VMs on this particular node before. Recently we upgraded the libvirt
>> > version to 0.9 on both nodes. This node remained operational after its own
>> > libvirt upgrade; deployment on it stopped working from the time we
>> > upgraded the libvirt version on the second node. By the way, host
>> > monitoring seems to be working, as is evident from the oned.log file.
>> >
>> > Could someone please shed some light on this issue? I have tried to dig up
>> > old threads, but the suggested fixes did not really work in my case.
>> >
>> > Thanks a lot for your time.
>> >
>> > Regards,
>> > Karthik
>> >
>> > _______________________________________________
>> > Users mailing list
>> > Users@lists.opennebula.org
>> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>> --
>> Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
>> DSA Research Group: http://dsa-research.org
>> Globus GridWay Metascheduler: http://www.GridWay.org
>> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>

--
Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org