[one-users] Virtual machine migration failure!
Jhon Masschelein
jhon.masschelein at sara.nl
Wed Jul 4 23:32:55 PDT 2012
Hi,
Sounds like libvirt/qemu have a problem restarting the VM. Are the
config files for those daemons under /etc/libvirt set up identically on
all your hosts? Did you make sure that the user that starts the kvm
process is able to read and write to the files? (In my install I added
the user "qemu" to the "oneadmin" group to make that work.)
You should have a log file for this particular VM in
/var/log/libvirt/qemu/one-[vm-id].log. What does it say?
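For example (with the VM id 383 from your log):

  tail -n 50 /var/log/libvirt/qemu/one-383.log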
You say it happens occasionally, so not always? Does it always fail when
you migrate to this particular host, or only occasionally there? If it is
occasional, you have to look for things that change from time to time.
Do you have config files maintained by chef/puppet/cfengine? Do you have
user accounts maintained by LDAP/NIS? If this host always fails, it is most
likely a bad configuration on that host; compare it to the other hosts that do work.
Where does the "unable to read from monitor" come from? Is it
OpenNebula? That is kind of normal: it tries to read the status of the
VM, but since the VM is not up, it fails. But actually, that should not give
you a "connection reset"...
What you can do to test your setup when it fails:
Go to the directory with the files and, before you change anything, do a
"virsh create deployment.X" where X is the highest number you can find
in that directory. (In your example, it would be 2.) You will need to be root
for this, or you will not be able to use the "system" libvirt space.
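Roughly like this (the directory below is just a guess based on your
/one_images layout; use whichever directory actually holds the deployment.*
files for this VM):

  cd /one_images/383      # hypothetical path: wherever deployment.0, deployment.1, ... live
  ls deployment.*         # pick the highest-numbered file
  virsh --connect qemu:///system create deployment.2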
If that "just works" then you have a real strange problem. If it gives
you errors, try to solve them. :)
(If you do not know "virsh", you should read up a bit on it.)
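A few virsh commands that are handy here (standard usage):

  virsh --connect qemu:///system list --all        # all domains libvirt knows about on this host
  virsh --connect qemu:///system dominfo one-383   # state and details of the failed VM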
Hope this is a bit helpful.
Jhon
On 07/04/2012 12:59 PM, David wrote:
>
> Hi, All
> I am using OpenNebula 3.2.1.
> When I execute a VM migrate operation, the VM occasionally fails to
> migrate, with the following log:
> Thu Jun 28 15:16:24 2012 [LCM][I]: New VM state is RUNNING
> Thu Jun 28 15:17:09 2012 [LCM][I]: New VM state is SAVE_MIGRATE
> Thu Jun 28 15:17:42 2012 [VMM][I]: save: Executed "virsh --connect
> qemu:///system save one-383 /one_images/383/images/checkpoint".
> Thu Jun 28 15:17:42 2012 [VMM][I]: ExitCode: 0
> Thu Jun 28 15:17:42 2012 [VMM][I]: Successfully execute virtualization
> driver operation: save.
> Thu Jun 28 15:17:43 2012 [VMM][I]: ExitCode: 0
> Thu Jun 28 15:17:43 2012 [VMM][I]: Successfully execute network driver
> operation: clean.
> Thu Jun 28 15:17:43 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
> Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Moving /one_images/383/images
> Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh
> compute-56-5.local mkdir -p /one_images/383".
> Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "scp -r
> compute-56-4.local:/one_images/383/images
> compute-56-5.local:/one_images/383/images".
> Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh
> compute-56-4.local rm -rf /one_images/383/images".
> Thu Jun 28 15:56:03 2012 [TM][I]: ExitCode: 0
> Thu Jun 28 15:56:03 2012 [LCM][I]: New VM state is BOOT
> Thu Jun 28 15:56:05 2012 [VMM][I]: ExitCode: 0
> Thu Jun 28 15:56:05 2012 [VMM][I]: Successfully execute network driver
> operation: pre.
> Thu Jun 28 15:56:06 2012 [VMM][I]: Command execution fail:
> /var/tmp/one/vmm/kvm/restore /one_images/383/images/checkpoint
> compute-56-5.local 383 compute-56-5.local
> Thu Jun 28 15:56:06 2012 [VMM][E]: restore: Command "virsh --connect
> qemu:///system restore /one_images/383/images/checkpoint" failed.
> Thu Jun 28 15:56:06 2012 [VMM][E]: restore: error: Failed to restore
> domain from /one_images/383/images/checkpoint
> Thu Jun 28 15:56:06 2012 [VMM][I]: error: internal error process
> exited while connecting to monitor: qemu-kvm: -drive
> file=/one_images/383/images/disk.0,if=none,id=drive-virtio-disk0,format=raw:
> could not open disk image /one_images/383/images/disk.0: Permission denied
> Thu Jun 28 15:56:06 2012 [VMM][E]: Could not restore from
> /one_images/383/images/checkpoint
> Thu Jun 28 15:56:06 2012 [VMM][I]: ExitCode: 1
> Thu Jun 28 15:56:06 2012 [VMM][I]: Failed to execute virtualization
> driver operation: restore.
> Thu Jun 28 15:56:06 2012 [VMM][E]: Error restoring VM: Could not
> restore from /one_images/383/images/checkpoint
> Thu Jun 28 15:56:06 2012 [DiM][I]: New VM state is FAILED
>
> I executed the command: chmod +x *
> but I received the following error message:
> error: Unable to read from monitor: Connection reset by peer
>
> What is causing these problems?
> Thanks!
>
> Regards!
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
--
Jhon Masschelein
Senior Systeemprogrammeur
SARA - HPCV
Science Park 140
1098 XG Amsterdam
T +31 (0)20 592 8099
F +31 (0)20 668 3167
M +31 (0)6 4748 9328
E jhon.masschelein at sara.nl
http://www.sara.nl