[one-users] A resumption failure results in the deletion of images

Shi Jin jinzishuai at gmail.com
Tue Jun 15 15:08:48 PDT 2010


Hi there,

I recently had a very serious problem.
I called "onevm stop" on a VM to hibernate it into a checkpoint file.
Then I called "onevm resume" to bring it back online.
However, the resumption process went wrong.
There are several reasons why it can go wrong.
For example, libvirt fails if another volume is attached to the VM.
But that is not relevant to this thread (I am planning to start a
new one on it soon).
The key point here is that, as soon as the restore fails, the
OpenNebula code triggers the DEPLOY_FAILURE action in the LCM (LifeCycleManager).
This can be found in src/vmm/VirtualMachineManagerDriver.cc:
399     else if ( action == "RESTORE" )
400     {
401         Nebula              &ne  = Nebula::instance();
402         LifeCycleManager    *lcm = ne.get_lcm();
403
404         if (result == "SUCCESS")
405         {
406             lcm->trigger(LifeCycleManager::DEPLOY_SUCCESS, id);
407         }
408         else
409         {
410             string          info;
411
412             getline(is,info);
413
414             os.str("");
415             os << "Error restoring VM, " << info;
416
417             vm->log("VMM",Log::ERROR,os);
418
419             lcm->trigger(LifeCycleManager::DEPLOY_FAILURE, id);
420         }
421     }


The LCM then eventually deletes the images directory, so the user
loses all the precious data he/she has obtained so far, and there
is no way to get it back!

So I desperately need to prevent OpenNebula from deleting the precious images.
A quick hack I did was to comment out line 419 above so that the
LCM is not triggered at all. But I am sure this is not clean, and we
need more than this.
I am thinking that one needs a way to distinguish a freshly booting VM
from a resuming VM. For now, OpenNebula treats them the same, and
both are in the BOOT state.
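A minimal sketch of the idea, using hypothetical stand-ins rather than the real OpenNebula classes: the RESTORE_FAILURE action, the VmRecord struct, the trigger() and on_driver_reply() functions, and the choice to put the VM back into a STOPPED state are all assumptions made for illustration. The only point is that a failed RESTORE takes a different path from a failed DEPLOY and leaves the images alone.

```cpp
#include <string>

// Hypothetical actions; RESTORE_FAILURE does not exist in the code quoted above.
enum class Action { DEPLOY_SUCCESS, DEPLOY_FAILURE, RESTORE_FAILURE };

// Toy stand-in for a VM and its images directory.
struct VmRecord {
    bool        images_present = true;
    std::string state          = "BOOT";
};

// Toy stand-in for the LCM: only DEPLOY_FAILURE cleans up the images.
void trigger(Action action, VmRecord &vm)
{
    switch (action)
    {
    case Action::DEPLOY_SUCCESS:
        vm.state = "RUNNING";
        break;
    case Action::DEPLOY_FAILURE:
        // Models the cleanup described above: the images are removed.
        vm.images_present = false;
        vm.state          = "FAILED";
        break;
    case Action::RESTORE_FAILURE:
        // Assumption: keep images and checkpoint so the resume can be retried.
        vm.state = "STOPPED";
        break;
    }
}

// Driver-side dispatch: a failed RESTORE no longer maps to DEPLOY_FAILURE.
void on_driver_reply(const std::string &action,
                     const std::string &result,
                     VmRecord          &vm)
{
    const bool ok = (result == "SUCCESS");

    if (action == "DEPLOY")
    {
        trigger(ok ? Action::DEPLOY_SUCCESS : Action::DEPLOY_FAILURE, vm);
    }
    else if (action == "RESTORE")
    {
        trigger(ok ? Action::DEPLOY_SUCCESS : Action::RESTORE_FAILURE, vm);
    }
}
```

With something like this, calling on_driver_reply("RESTORE", "FAILURE", vm) would leave vm.images_present untouched, while a failed fresh DEPLOY would still be cleaned up as it is today.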
So please let me know whether what I reported is a bug and whether it
can be fixed in the future.
I could submit this on the dev site as well.
Thank you very much.

Shi

-- 
Shi Jin, Ph.D.


