[one-users] (thanks) Re: experiences with distributed FS?
Michael Berlin
michael.berlin.xtreemfs at googlemail.com
Fri Feb 10 02:08:38 PST 2012
Hi,
On 02/09/2012 01:50 PM, richard -rw- weinberger wrote:
> On Thu, Feb 9, 2012 at 1:38 PM, João Pagaime <jpsp at fccn.pt> wrote:
>> here's a short summary by FS:
>> • (RW)… you are using FUSE.
>
> No, I'm not using FUSE.
> My OpenNebula cluster is built on top of ocfs2.
Richard meant that the XtreemFS client implementation does use FUSE,
just as many other distributed file systems do.
Regarding the general FUSE performance discussion:
On 02/09/2012 11:49 AM, richard -rw- weinberger wrote:
[...]
> Hmm, you are using FUSE.
> Performance measurements would be really nice to have.
>
Any suggestions on how to conduct OpenNebula-specific measurements are welcome.
On our mailing list I wrote about the write throughput performance of
XtreemFS: http://groups.google.com/group/xtreemfs/msg/f5a70a1780d9f4f9
Write throughput is usually limited by the execution time of the write
operation, since the application on top does not issue the next write()
before the previous one has returned. Therefore we also allow
asynchronous writes, which acknowledge a number of write()s to the
application before they are actually confirmed by the storage server.
To be on the safe side in that case, you have to call fsync() and
evaluate the return value of close(). As written in the post mentioned
above, with asynchronous writes you can almost max out a GbE link (up
to 100 MB/s write speed), but this also incurs a lot of overhead: I saw
up to 70% CPU usage for the XtreemFS client during that test.
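For illustration, here is a minimal sketch of that safe-write pattern in
plain POSIX C (the file path and sizes are placeholders, not
XtreemFS-specific code):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* "data.bin" is a placeholder path on an XtreemFS mount point. */
    int fd = open("data.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }

    static char buf[64 * 1024];              /* zero-filled test data   */
    for (int i = 0; i < 1024; i++) {         /* write 64 MiB in total   */
        /* With asynchronous writes, write() may return before the
           storage server has actually confirmed the data. */
        ssize_t n = write(fd, buf, sizeof(buf));
        if (n != (ssize_t) sizeof(buf)) {
            perror("write");                 /* short-write retry omitted */
            return EXIT_FAILURE;
        }
    }

    /* Flush outstanding asynchronous writes; deferred errors show up here. */
    if (fsync(fd) != 0) { perror("fsync"); return EXIT_FAILURE; }

    /* close() can also report a deferred write error -- check it, too. */
    if (close(fd) != 0) { perror("close"); return EXIT_FAILURE; }
    return EXIT_SUCCESS;
}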
> Writing fancy file systems using FUSE is easy. Making them fast and
> scalable is a damn hard job and often impossible.
I fully agree that kernel-based file systems in Linux will always have
a lower overhead than their FUSE counterparts. But the overhead is
mainly caused by the structure of the Linux kernel: all data read and
written by FUSE file systems has to be copied between kernel space and
user space. If this were optimized, the overhead would be much less
significant.
In general, the overhead of a FUSE implementation is the cost of the
"fanciness". If a required feature is only available in a FUSE-based
file system, you would rather use that than wait for a kernel
implementation that may never appear. Writing a distributed file system
in the kernel is a damn hard job and often impossible. Therefore FUSE
file systems are an alternative.
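To illustrate why userspace is the easier route, here is a minimal
read-only FUSE sketch against the FUSE 2.6 C API (a toy "hello" file
system; all names are made up for the example and none of this is
XtreemFS code):

#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

static const char *hello_path = "/hello";
static const char *hello_str  = "Hello from user space!\n";

/* Report metadata for "/" and "/hello"; everything else doesn't exist. */
static int hello_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* The root directory contains exactly one file. */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void) offset; (void) fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);
    return 0;
}

static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

/* Each read() on the mounted FS round-trips through the kernel into
   this userspace process -- the copy overhead discussed above. */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    (void) fi;
    size_t len = strlen(hello_str);
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((size_t) offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);
    return (int) size;
}

static struct fuse_operations hello_ops = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &hello_ops, NULL);
}

Compiled with something like "gcc hello.c `pkg-config fuse --cflags
--libs` -o hello" and pointed at an empty directory, this mounts a
working file system; an in-kernel module with the same behavior would
already require considerably more infrastructure.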
The scalability of a distributed file system is independent of the
choice between a kernel and a userspace implementation; that's a matter
of the design of the file system.
In the end it's up to the user. If there's a kernel-based file system
available which suits your needs, then you can use that (as in your
case). If not, you may be willing to pay the price of the overhead,
since the FUSE-based file system has a lot more to offer.
Best regards,
Michael