[one-users] wrong restart -> delete disk image!

Javier Fontan jfontan at gmail.com
Mon Sep 12 03:29:32 PDT 2011


If the VM still has the same name (one-<vmid>) and is running on the same
host, it should change to running once it is monitored. Check
oned.conf for the number of seconds between monitoring cycles, and
check vm.log in the VM directory for any monitoring attempt.
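
To make those two checks a bit more concrete, here is a rough Python sketch.
It is not part of OpenNebula. It assumes the interval settings in oned.conf
look like NAME_INTERVAL = <seconds> (the exact key names depend on the
release, so only the suffix is matched) and that vm.log lines have the same
shape as the log excerpts quoted further down in this thread. The function
names find_intervals and parse_log_line are made up for illustration.

```python
import re
from datetime import datetime

# Assumption: interval settings look like `SOME_INTERVAL = 600`.
# Key names vary between releases, so only the INTERVAL suffix is matched.
# Lines starting with '#' (comments) do not match.
INTERVAL_RE = re.compile(r"^[ \t]*([A-Z_]*INTERVAL)[ \t]*=[ \t]*(\d+)", re.MULTILINE)

# Assumption: log lines look like the excerpts quoted in this thread, e.g.
#   Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: ...
LOG_RE = re.compile(
    r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) \[(\w+)\]\[(\w)\]: (.*)$"
)


def find_intervals(conf_text):
    """Return {setting_name: seconds} for every *INTERVAL setting found."""
    return {name: int(value) for name, value in INTERVAL_RE.findall(conf_text)}


def parse_log_line(line):
    """Split one log line into (timestamp, module, level, message).

    Returns None for lines that do not match the assumed format.
    Month and day names are parsed with the default (English) locale.
    """
    match = LOG_RE.match(line)
    if match is None:
        return None
    stamp, module, level, message = match.groups()
    # Collapse the padding in single-digit days ("Sep  6") before parsing.
    when = datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y")
    return when, module, level, message


if __name__ == "__main__":
    # Illustrative input only; read the real files from your installation.
    sample_conf = "# polling\nHOST_MONITORING_INTERVAL = 600\nPORT = 2633\n"
    print(find_intervals(sample_conf))
    print(parse_log_line(
        "Tue Sep  6 12:38:50 2011 [TM][D]: Message received: TRANSFER SUCCESS 22 -"
    ))
```

Comparing the timestamps of consecutive monitoring entries against the
configured interval should show whether the VM is being polled at all.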

On Thu, Sep 8, 2011 at 7:33 PM, João Soares <joaosoares at ua.pt> wrote:
> Hi Carlos and Samuel,
>
>
>
> I had already come across this issue. In my case I tested the front-end
> and the physical host on the same machine, and if for some reason I reboot
> the host, when it comes back up the VMs are in state Unknown (and don’t leave
> that state). Then, when I try to restart one with OpenNebula it does this
> magic thing: it deletes the VM :) But if I back up the XML of the VM (in virsh
> with a dumpxml), reboot the host and then launch the VM with virsh,
> the VM starts OK, but OpenNebula still sees it as Unknown.
>
>
>
> Cheers,
>
>
>
> João
>
>
>
> From: users-bounces at lists.opennebula.org
> [mailto:users-bounces at lists.opennebula.org] On Behalf Of Carlos Martín
> Sánchez
> Sent: Thursday, 8 September 2011 15:08
> To: samuel
> Cc: users
> Subject: Re: [one-users] wrong restart -> delete disk image!
>
>
>
> Hi Samuel,
>
> It is safe to change the code, just a couple of comments:
>
> Before stopping OpenNebula, check that there are not any VMs in a transient
> state (migrating, saving, etc.).
> Then stop it, backup /var/lib/one (or $ONE_LOCATION/var), and use the '-k'
> option of install.sh to keep your current etc folder.
>
> Also take into account that we haven't tested the proposed solution.
>
> Regards.
> --
> Carlos Martín, MSc
> Project Major Contributor
> OpenNebula - The Open Source Toolkit for Cloud Computing
> www.OpenNebula.org | cmartin at opennebula.org
>
> On Thu, Sep 8, 2011 at 1:32 PM, samuel <samu60 at gmail.com> wrote:
>
> Thank you very much!
>
> Is it safe to manually change the code and just perform a ./install.sh from
> the sources on a running installation? I'm using the MySQL backend, so I expect
> that modifying the sources will only affect the compilation of
> the modified library and the rest will continue working OK.
>
> Am I right?
>
> I really appreciate the fast response.
>
> Samuel Osorio.
>
>
>
> On 8 September 2011 12:20, Ruben S. Montero <rubensm at dacya.ucm.es> wrote:
>
> Hi,
>
> Yes you are right. There is an issue open [1]. We are planning to
> apply the proposed solution in that issue for 3.0 (i.e. clean-up will
> happen only when you issue a delete operation). I think this will
> address your use-case.
>
> [1] http://dev.opennebula.org/issues/265
>
> Thanks
>
> Ruben
>
> On Tue, Sep 6, 2011 at 5:28 PM, samuel <samu60 at gmail.com> wrote:
>> Hi folks,
>>
>> Recently there was a network problem and one instance became unreachable.
>> We tried to restart it with the stop and resume actions, but something went
>> wrong and the disk was deleted. My main concern is why, after the restart
>> attempt failed, the directory where the disk image resides was deleted.
>> There was no sensitive data on it, but I just don't understand why there was
>> an rm -rf of the directory.
>>
>> Details:
>>
>> The configuration is KVM with shared storage using OpenNebula 2.2.
>>
>> output of virsh version
>>     Compiled against library: libvir 0.8.8
>>     Using library: libvir 0.8.8
>>     Using API: QEMU 0.8.8
>>     Running hypervisor: QEMU 0.14.0
>>
>> related logs:
>>
>> Tue Sep  6 12:37:49 2011 [VMM][D]: Message received: SAVE SUCCESS 22 Domain one-22 saved to /srv/cloud/one/var//22/images/checkpoint
>> Tue Sep  6 12:37:49 2011 [VMM][D]: Message received:
>> Tue Sep  6 12:37:49 2011 [TM][D]: Message received: LOG - 22 tm_mv.sh: Will not move, is not saving image
>> Tue Sep  6 12:37:49 2011 [TM][D]: Message received: TRANSFER SUCCESS 22 -
>>
>> Tue Sep  6 12:38:12 2011 [DiM][D]: Restarting VM 22
>> Tue Sep  6 12:38:12 2011 [DiM][E]: Could not restart VM 22, wrong state.
>> Tue Sep  6 12:38:12 2011 [ReM][E]: Wrong state to perform action
>>
>> Tue Sep  6 12:38:18 2011 [ReM][D]: VirtualMachineAction invoked
>> Tue Sep  6 12:38:18 2011 [DiM][D]: Resuming VM 22
>> Tue Sep  6 12:38:47 2011 [DiM][D]: Deploying VM 22
>>
>> Tue Sep  6 12:38:47 2011 [ReM][D]: VirtualMachineInfo method invoked
>> Tue Sep  6 12:38:47 2011 [TM][D]: Message received: LOG - 22 tm_mv.sh: Will not move, is not saving image
>>
>> Tue Sep  6 12:38:47 2011 [TM][D]: Message received: TRANSFER SUCCESS 22 -
>>
>> Tue Sep  6 12:38:48 2011 [ReM][D]: VirtualMachineInfo method invoked
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: LOG - 22 Command execution fail: 'if [ -x "/var/tmp/one/vmm/kvm/restore" ]; then /var/tmp/one/vmm/kvm/restore /srv/cloud/one/var//22/images/checkpoint; else                              exit 42; fi'
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: LOG - 22 STDERR follows.
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: LOG - 22 error: Failed to restore domain from /srv/cloud/one/var//22/images/checkpoint
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: LOG - 22 error: cannot close file: Bad file descriptor
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: LOG - 22 ExitCode: 1
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: RESTORE FAILURE 22 error: Failed to restore domain from /srv/cloud/one/var//22/images/checkpoint
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: error: cannot close file: Bad file descriptor
>> Tue Sep  6 12:38:49 2011 [VMM][D]: Message received: ExitCode: 1
>>
>> Tue Sep  6 12:38:50 2011 [TM][D]: Message received: LOG - 22 tm_delete.sh: Deleting /srv/cloud/one/var//22/images
>> Tue Sep  6 12:38:50 2011 [TM][D]: Message received: LOG - 22 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var//22/images".
>> Tue Sep  6 12:38:50 2011 [TM][D]: Message received: TRANSFER SUCCESS 22 -
>>
>>
>> Thank you in advance for any hint!
>> Samuel.
>>
>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>>
>
>
>
> --
> Dr. Ruben Santiago Montero
> Associate Professor (Profesor Titular), Complutense University of Madrid
>
> URL: http://dsa-research.org/doku.php?id=people:ruben
> Weblog: http://blog.dsa-research.org/?author=7
>
>
>



-- 
Javier Fontan, Grid & Virtualization Technology Engineer/Researcher
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org


