[one-users] Some issues while using OpenNebula
Ruben S. Montero
rubensm at dacya.ucm.es
Tue Feb 10 07:06:42 PST 2009
Hi Boris,
Thank you very much for your clarification!
> Ok, that sounds good.
> The problem I had was that a machine was shut down via the xm command
> on the host itself. Then it no longer appears in onevm list, and
> I did not know what had happened. Also, if a host crashes, ONE takes the
> VM out of the list, but then Xen itself restarts the machine after the
> host is rebooted and OpenNebula cannot find this machine anymore. So
> it is up and running but not listed in onevm, and of course a new
> submit via ONE on the same image does not work.
Ok, now I understand the problem. Maybe OpenNebula tries to be too clever
here ;). We could add a new state, ERROR, for a VM. Instead of the current
behavior, i.e. guessing, we could leave the VM in the new ERROR state. That
way it stays listed, and you can decide what to do. Additionally, we could
monitor the VM periodically to check whether the hypervisor has recovered
from the error.
>
> Hm.. I did not get this completely. I had a machine that stayed listed
> in the state "boot".
> I could not run any command on it (I tried onevm delete, onevm resume,
> onevm stop), so I had to kill the hung booting VM via Xen on the
> selected host and resubmit the template (and I still have the boot entry
> in the onevm list).
Ok. I misunderstood the problem. Yes, we should be able to send a kill command
to the VM. I'll put this in the tentative 1.4 roadmap.
> Actually I can not really reproduce it myself at the moment, since I
> don't want to shutdown the running machines.
> There were some configuration problems after the installation, so not
> everything was working.
> While trying to fix this, many failed ssh connections that OpenNebula had
> opened to one of the nodes stayed open, and then the node did not
> respond to ssh at all anymore and had to be manually rebooted.
>
OK. We are re-engineering the OpenNebula drivers, so we'll have an opportunity
to look at this one.
>
> Sure, please find the files attached.
> (OpenNebula Server: wn001, used host wn002)
> And yes, egrep is installed.
Thanks!!
>
> So when I resume the machine via onevm, it will use the clean initial
> image instead of the cloned one it was working on before?
> Is this also true if I use stop instead of shutdown?
> For example, if I run an svn server in a virtual machine, I want
> any changes to be saved back to the image.
When you use stop/suspend, it should use the saved image along with the
checkpoint file to restore the VM. However, if you shut down a VM and you want
to keep the changes, you have two options:
* Not cloning. If you are using shared storage, you may work directly on
the image. However, you cannot reuse it for other VMs.
* SAVE=yes for the DISK. Use this if you do not have a shared image repo, or
if you want to clone the images. OpenNebula will copy the disk back to the
$ONE_LOCATION/var/$VMID directory, and it will NOT overwrite the original
disk, so you have to move it yourself.
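For the second option, the manual move could look roughly like the sketch
below. This is only an illustration: the function name promote_saved_disk is
hypothetical, and the name of the saved file under $ONE_LOCATION/var/$VMID is
left as a placeholder, so check what is actually there before running anything.

```shell
# Hypothetical helper (not part of OpenNebula): replace the original image
# with the disk that was copied back after shutdown, keeping a backup.
# Arguments: $1 = saved disk file, $2 = original image to replace.
promote_saved_disk() {
    saved="$1"
    original="$2"
    # Refuse to do anything if the saved disk is not there.
    [ -f "$saved" ] || { echo "no saved disk at $saved" >&2; return 1; }
    # Keep the old image as <original>.bak before replacing it.
    if [ -f "$original" ]; then
        cp -p "$original" "$original.bak" || return 1
    fi
    mv "$saved" "$original"
}

# Example call; both paths are placeholders, not real file names:
# promote_saved_disk "$ONE_LOCATION/var/$VMID/SAVED_DISK_FILE" /path/to/original.img
```

Make sure the VM is shut down and the copy back has finished before moving
the file, otherwise you may end up with an incomplete image.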
Thanks again for your valuable feedback!!
Cheers
Ruben
>
> > Ok. Maybe we should improve the doc here:
>
> Thanks a lot for the clarification, it was really helpful!
>
> Kind regards,
> Boris
--
+---------------------------------------------------------------+
Dr. Ruben Santiago Montero
Associate Professor
Distributed System Architecture Group (http://dsa-research.org)
URL: http://dsa-research.org/doku.php?id=people:ruben
Weblog: http://blog.dsa-research.org/?author=7
GridWay, http://www.gridway.org
OpenNEbula, http://www.opennebula.org
+---------------------------------------------------------------+