[one-users] datastore confusion

Matthew Patton mpatton at inforelay.com
Thu Sep 6 22:29:59 PDT 2012


The documentation is certainly plentiful, but it badly needs a fact and
consistency check. In particular, the rampant use of
'/var/lib/one/datastores' needs to be excised everywhere and replaced with
$DATASTORE_LOCATION, because there is a huge difference between what the
front-end does with that path and what the Host uses it for, or so the
documentation seems to imply.

If I may summarize my understanding:
It looks like ONE started out with every host (front-end included) having
its own local storage in the guise of 'system, id=0'. Under that model, I/O
for any given VM was limited only by the local host's spindle capacity and
by the other VMs co-located on that host.

When moving to a shared 'system' mountpoint, the underlying storage gets
hammered because every host and every guest that's alive is using it. The
problem could be mitigated somewhat if the source image could be
referenced indirectly via symlinks (does that work on VMware VMFS via
RDM?), or by using clusters and selectively overriding which datastore is
marked as 'system'. The upside is obviously the ability to do warm/hot
migration between hosts.

Under the old model, 'system' presumably used "TM_MAD=ssh" and the
front-end could (must?) be used as the repository of all non-running disk
images. Yet all image operations are supposed to be carried out at the
host, so:

Q0: Why was the front-end involved in storing anything under
'.../datastores/0'? If it was storing "at rest" disk images because there
was no other provider, then those images should have been under a
datastore with id!=0.

Q1: Under the shared model, does the front-end really never need access
to 'system'? The drawing and the text disagree at
"http://opennebula.org/documentation:rel3.6:system_ds"

BUG: Can we please fix 'onedatastore show <#>' so that "BASE PATH" uses
either the literal string '$DATASTORE_LOCATION' or the current value of the
variable as 'oned' understands it (see /var/lib/one/config)? Always
returning '/var/lib/one/datastores/...' is wrong. Better yet, the value
should be auto-generated unless the user has hard-coded it. Currently there
appears to be no way for me to force it to an arbitrary value. This
is particularly pertinent when dealing with VMware, since VMFS volumes are
located at '/vmfs/volumes' and, since there is no persistence, any hackery
like creating '/var/datastores/<#>' isn't going to survive a reboot.
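To make the request concrete, here is a rough sketch (in Python, purely
illustrative) of the behavior I have in mind. The function name
'base_path' and its 'override' parameter are my own invention and not
anything that exists in ONE; the only values taken from the real system
are $DATASTORE_LOCATION and the '/var/lib/one/datastores' default.

```python
import posixpath

# Path that 'onedatastore show' currently always reports under.
DEFAULT_DATASTORE_LOCATION = "/var/lib/one/datastores"


def base_path(ds_id, datastore_location=None, override=None):
    """Return the BASE PATH to report for datastore `ds_id`.

    Hypothetical logic, not ONE's actual code:
      1. a user-supplied override wins outright;
      2. otherwise derive the path from DATASTORE_LOCATION as oned sees it;
      3. fall back to the built-in default only if that is unset.
    """
    if override:
        return override
    root = datastore_location or DEFAULT_DATASTORE_LOCATION
    return posixpath.join(root, str(ds_id))


print(base_path(0))
print(base_path(100, datastore_location="/vmfs/volumes"))
print(base_path(100, override="/mnt/ds"))
```

With something like this, a VMware host whose $DATASTORE_LOCATION is
'/vmfs/volumes' would be reported correctly without any symlink hackery.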

Q2: Why are disk images being "copied" (meaning symlinked, I guess) when
the datastore type is 'shared' or 'vmware', unless the disk type is
'clone'? Just use the source image directly, wherever it is.

Q3: Can we dispense with 'system' being mandatory, let alone being at a
fixed location? There is no reason why the datastore that contains the
"at rest" image can't be used when the VM is running and also hold the
volatile and clone images. Of course that doesn't apply if the source is
only reachable via SSH, can't withstand the IOPS, or is otherwise
unsuitable. I also find the term 'system' misleading; it should be named
something more like 'runtime'. May I suggest a datastore attribute
"ALLOW_RUN=" or "RUNTIME_SAFE=yes|no", with the default when unspecified
being 'no', in which case the copying is done as it is today?
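As a minimal sketch of what I am proposing (Python, illustrative only):
RUNTIME_SAFE is the attribute suggested above and does not exist in ONE
today, and the 'placement' function and the dict layout are hypothetical.

```python
def placement(image_ds, system_ds_id):
    """Decide where a VM disk lives while the VM is running.

    image_ds     -- dict of datastore attributes, e.g. {"ID": 100,
                    "RUNTIME_SAFE": "yes"} (RUNTIME_SAFE is the proposed,
                    not yet existing, attribute)
    system_ds_id -- id of the 'system' (runtime) datastore to fall back to

    Returns (datastore_id, needs_copy).
    """
    # Unspecified behaves like 'no': copy to 'system' as is done today.
    flag = str(image_ds.get("RUNTIME_SAFE", "no")).strip().lower()
    if flag == "yes":
        # Run the VM directly out of the datastore holding the image.
        return image_ds["ID"], False
    return system_ds_id, True


print(placement({"ID": 100, "RUNTIME_SAFE": "yes"}, 0))
print(placement({"ID": 101}, 0))
```

A datastore that is only reachable via SSH would simply leave the
attribute unset and keep the current copy-to-'system' behavior.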

Q4: What happens if there are multiple "SYSTEM=yes" datastores in the
context of a cluster (including the special cluster 'none')? Why shouldn't
the runtime datastore(s) also be a HOST attribute in addition to a cluster
one, a la "SYSTEM_DS = <id> [id ...]"? If not specified, the scheduler would
revert to the more general scope and pick one that has sufficient space.
It is perfectly reasonable to have different 'system' datastore sets
across hosts even in the same cluster; some may have extra disks, broken
disks, whatever. Deployment shouldn't break and I shouldn't have to
side-line a host because it isn't strictly identical to its peers.
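The selection I am describing could look roughly like this (Python,
illustrative only). SYSTEM_DS as a host attribute is my proposal, not an
existing feature, and 'pick_system_ds', the dict layout and the free-space
table are all hypothetical.

```python
def pick_system_ds(host, cluster, free_mb, needed_mb):
    """Pick a runtime ('system') datastore for a VM on `host`.

    host, cluster -- dicts that may carry a "SYSTEM_DS" list of ids
                     (the host-level attribute is the proposed addition)
    free_mb       -- dict mapping datastore id -> free space in MB
    needed_mb     -- space the VM's disks require

    The host-level list is used when specified; otherwise the scheduler
    reverts to the cluster-level list. Returns None if nothing fits.
    """
    candidates = host.get("SYSTEM_DS") or cluster.get("SYSTEM_DS", [])
    for ds_id in candidates:
        if free_mb.get(ds_id, 0) >= needed_mb:
            return ds_id
    return None


free = {0: 500, 10: 20000, 11: 100}
# Host with its own set: 11 is too full, so 10 is chosen.
print(pick_system_ds({"SYSTEM_DS": [11, 10]}, {"SYSTEM_DS": [0]}, free, 1000))
# Host without a set: falls back to the cluster's datastore 0.
print(pick_system_ds({}, {"SYSTEM_DS": [0]}, free, 400))
```

This way a host with an extra or a broken disk just advertises a different
list and the rest of the cluster is unaffected.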

Q5: Are there plans to have a 'system' datastore of type iSCSI or LVM? It
would only make sense if the source was of like type. Then again, a
sparse file on a filesystem would work as a block device too, so this is
really more about supporting BLOCK devices for 'system' use.

Q6: When is it safe to override variables like VM_DIR or DS_DIR? Is there
an accepted methodology?

-- 
Cloud Services Architect, Senior System Administrator
InfoRelay Online Systems (www.inforelay.com)

