[one-users] Copy image on Ceph to file-based datastore

Thu Mar 20 18:47:33 PDT 2014

Hi Jaime,
On 18/03/14 19:26, Jaime Melis wrote:
> before moving on to the implementation details, how are you thinking of
> specifying if a VM should run from the datastore or from a local file?
> 
> I'm afraid this is going to be very tricky, because we need to figure
> out how to tell the core to generate a deployment file that references a
> local file and not a ceph disk

I've been doing some more thinking about the problem: mainly using LVM
rather than files to provision the local storage, but also using
FlashCache to put a write-back or write-through cache in front of the
back-end RBD.

With this setup, one might conceive of three possible modes:
- Online write-through caching:
  - Deployment:
    - The image is provisioned as an RBD and mapped by the virtual
      machine host as a block device (/dev/rbdX): the original image
      for persistent images, or a clone/copy of the original for
      non-persistent ones.
    - A region of local storage is provisioned from a pool (LVM)
    - FlashCache is configured in write-through mode to produce a
      cached block device.
    - OpenNebula configures the VM to use that cached block device
  - Clean-up:
    - FlashCache is shut down, LVM unprovisioned.
    - For non-persistent images: the RBD is deleted.
- Online write-back: same as above, but FlashCache is in write-back mode
- Offline write-back:
  - Deployment:
    - A region of cache is provisioned equal to the size of the image.
    - The source RBD is copied to the cache (using rbd export).  For
      non-persistent images, the instance's copy is used if it exists.
    - VM starts using the local storage
  - Clean-up/undeploy/migrate...etc.
    - The LVM cache is imported back into Ceph, but with a -new
      suffix added to the RBD image name.  (Needed, because
      the upload could be interrupted and 'rbd import' won't overwrite
      an existing image.)
    - When the upload is successful, the old RBD image is deleted
      and the new image is renamed to remove the -new suffix.
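
To make the steps above more concrete, here is a rough Python sketch
that only builds the command lists and runs nothing.  The function
names, the "cache-" device name prefix and the stale_new_exists flag
are assumptions made for illustration, not existing OpenNebula driver
code.  Whether 'rbd import' can read directly from an LV may depend on
the Ceph version, so treat that step as unverified.

```python
from typing import List

Command = List[str]


def online_deploy_plan(pool: str, image: str, vg: str, lv: str,
                       size_mb: int, rbd_dev: str,
                       write_back: bool = False) -> List[Command]:
    """Build (not run) the commands for the two online caching modes.

    rbd_dev is the /dev/rbdX node the host sees after 'rbd map'.  It is
    a parameter because the number is only known once the map is done.
    """
    policy = "back" if write_back else "thru"
    return [
        ["rbd", "map", f"{pool}/{image}"],
        ["lvcreate", "-L", f"{size_mb}M", "-n", lv, vg],
        # flashcache_create <cachedev> <fast device> <backing device>
        ["flashcache_create", "-p", policy, f"cache-{lv}",
         f"/dev/{vg}/{lv}", rbd_dev],
    ]


def offline_cleanup_plan(pool: str, image: str, vg: str, lv: str,
                         stale_new_exists: bool = False) -> List[Command]:
    """Build (not run) the upload / delete / rename sequence."""
    old_spec = f"{pool}/{image}"
    new_spec = f"{old_spec}-new"
    plan: List[Command] = []
    if stale_new_exists:
        # Assumption: a leftover from an interrupted upload is discarded,
        # since 'rbd import' won't overwrite an existing image.
        plan.append(["rbd", "rm", new_spec])
    plan += [
        ["rbd", "import", f"/dev/{vg}/{lv}", new_spec],
        ["rbd", "rm", old_spec],
        ["rbd", "rename", new_spec, old_spec],
    ]
    return plan
```

A real driver would run each command in order and stop at the first
failure, so the old RBD image is only deleted after the import has
succeeded.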

The concept could work for iSCSI and other back-ends too.  If a host
doesn't have enough local space to hold the whole image, it might be
acceptable to fall back to "online write-back", which would at least
allow a portion of the image to be cached locally.
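
The fall-back rule could be sketched like this in Python.  The mode
name strings and the function name are hypothetical, and the free
space figure is assumed to come from whatever reports the LVM pool's
capacity.

```python
def choose_mode(requested: str, image_bytes: int, free_bytes: int) -> str:
    """Pick the caching mode to use for one disk.

    Offline write-back needs local space for the entire image; if the
    host can't provide that, fall back to online write-back so part of
    the image can still be cached.  Other modes are returned unchanged.
    """
    if requested == "offline-write-back" and free_bytes < image_bytes:
        return "online-write-back"
    return requested
```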

It'd be nice if the caching mode could be stated per disk in the
machine template, where the individual disks are specified.
Alternatives would be to specify it in the image template or (least
favourable) in the datastore template.
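
If more than one of those places were supported at once, one possible
way to combine them is a simple precedence lookup, sketched below in
Python.  The CACHE_MODE attribute name, the default value and the idea
of layering the templates are all assumptions for illustration; none
of this exists in OpenNebula as far as this proposal goes.

```python
from typing import Mapping

# Hypothetical default when no template says anything.
DEFAULT_MODE = "online-write-through"


def resolve_cache_mode(disk: Mapping[str, str],
                       image: Mapping[str, str],
                       datastore: Mapping[str, str]) -> str:
    """Return the caching mode, preferring the most specific template.

    Order: the disk section of the machine template, then the image
    template, then the datastore template, then DEFAULT_MODE.
    """
    for template in (disk, image, datastore):
        mode = template.get("CACHE_MODE")  # hypothetical attribute name
        if mode:
            return mode
    return DEFAULT_MODE
```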

I'm in the midst of other projects just at this time, but I plan to have
a good look at the LVM and Ceph datastore/transfer drivers to see if a
new datastore driver could be written to achieve this.

Regards,
-- 
Stuart Longland
Systems Engineer
     _ ___
\  /|_) |                           T: +61 7 3535 9619
 \/ | \ |     38b Douglas Street    F: +61 7 3535 9699
   SYSTEMS    Milton QLD 4064       http://www.vrt.com.au