[BBLISA] shared network disks - vs gfs - vs distributed filesystem - vs ...

John Hanks griznog at gmail.com
Wed Jul 1 17:16:59 EDT 2009


What follows are the ramblings of a bored person on a train with an  
iPhone and an hour to kill. Probably you should filter this.

In the past I have built shared filesystems using drbd to duplicate a
block device and then running ocfs2 on top as the filesystem. This
worked well for building a redundant Xen setup with two servers, but
probably wouldn't scale well.
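
Roughly, that layering looks like the sketch below. None of it is a
real config: the resource name, device, and mount point are made up,
and the drbd.conf (which needs allow-two-primaries) isn't shown.

#!/usr/bin/env python
# Sketch only: mirror a block device between two nodes with drbd in
# dual-primary mode, then put ocfs2 on top so both nodes can mount it
# at the same time.  Names below are assumptions, not a real setup.

import subprocess

RESOURCE = "r0"          # assumed drbd resource name
DEVICE = "/dev/drbd0"    # assumed drbd device backing the filesystem
MOUNTPOINT = "/shared"   # assumed mount point

def run(cmd):
    print(cmd)
    subprocess.check_call(cmd, shell=True)

# On both nodes: initialise drbd metadata and bring the resource up.
run("drbdadm create-md " + RESOURCE)
run("drbdadm up " + RESOURCE)

# On both nodes: promote to primary (needs allow-two-primaries in the
# net section of drbd.conf; the very first promotion needs --force on
# one node to kick off the initial sync).
run("drbdadm primary " + RESOURCE)

# On one node only: make the cluster filesystem with two node slots.
run("mkfs.ocfs2 -N 2 " + DEVICE)

# On both nodes: mount it (the o2cb cluster stack has to be configured
# and running first).
run("mount -t ocfs2 %s %s" % (DEVICE, MOUNTPOINT))

The key trick is running drbd dual-primary so ocfs2 can stay mounted
on both nodes at once.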

If you mounted your scratch at /local/nodename-scratch and let every
node export this via NFS, then all nodes could mount each other's local
scratch while preserving their individuality, as well as making the
contents widely available to non-compute hosts if it's useful to do so.
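
To make that concrete, here's a minimal sketch of the cross-mount
scheme; the hostnames, export options, and NFS mount options are all
assumptions, not anything you'd have to use.

#!/usr/bin/env python
# Sketch of the cross-mount scheme: each node exports its own
# /local/<nodename>-scratch and NFS-mounts everyone else's.
# Hostnames and mount/export options below are assumptions.

import socket

NODES = ["node%02d" % i for i in range(1, 17)]   # assumed hostnames
ME = socket.gethostname().split(".")[0]

def exports_line():
    # The /etc/exports entry for this node's own scratch.
    return "/local/%s-scratch  *(rw,async,no_root_squash)" % ME

def fstab_lines():
    # fstab entries mounting every other node's scratch over NFS.
    lines = []
    for node in NODES:
        if node == ME:
            continue
        lines.append("%s:/local/%s-scratch  /local/%s-scratch  nfs  "
                     "rw,soft,intr,bg  0 0" % (node, node, node))
    return lines

if __name__ == "__main__":
    print(exports_line())
    for line in fstab_lines():
        print(line)

Run on any node it spits out that node's export line plus the mounts
for everyone else's scratch, so every host sees the whole collection
under /local/ without anything extra in the data path.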

PVFS is a good option for this if you want the space all combined into
a single volume. You could possibly see some improved performance
depending on the shape of your I/O and compute load. Simply because of
the MTBF of the drives, this solution's robustness is inversely
proportional to node count unless you layer in some approach to disk
redundancy. (drbd between node pairs with some sort of failover would
be a fun way to spend all your free time configuring.)
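
To put a rough number on that scaling (the MTBF figure here is just an
assumption for illustration):

# Rough numbers behind "robustness inversely proportional to node
# count": stripe the volume across N nodes with no redundancy and any
# single drive failure loses the filesystem, so the expected time to
# the first failure is roughly MTBF / N.  The MTBF is an assumed value.

MTBF_HOURS = 500000.0   # assumed per-drive MTBF
HOURS_PER_YEAR = 24 * 365

for n in (2, 16, 64, 256):
    years = MTBF_HOURS / n / HOURS_PER_YEAR
    print("%4d nodes: ~%.1f years to the first expected drive failure"
          % (n, years))

At 16 nodes that's a few years between expected failures; at a couple
hundred nodes it's down to a few months.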

Lustre is an option but is purported to be a management pain to keep
running. Otherwise it's similar to PVFS.

GlusterFS also looks very interesting for this type of work, but I've
not scratched the propaganda deeply enough to understand how it works
under the hood.

I'd love to hear about other options and what you eventually get  
working.

jbh

Sent from my iPhone

On Jul 1, 2009, at 2:15 PM, Edward Ned Harvey <bblisa3 at nedharvey.com>  
wrote:

> I have a bunch of compute servers.  They all have local disks
> mounted as /scratch to use for computation scratch space.  This
> ensures maximum performance on all systems, and no competition for a
> shared resource during crunch time.  At present, all of their
> /scratch directories are local, separate and distinct.  I think it
> would be awesome if /scratch looked the same on all systems.  Does
> anyone know of a way to “unify” this storage, without compromising
> performance?  Of course, if some files reside on server A, and they
> are requested from server B, then the files must go across the
> network, but I don’t want the files to go across the network unless
> they are requested.  And yet, if you do something like “ls /scratch”
> you would ideally get the same results regardless of which machine
> you’re on.
>
> Due to the nature of heavy runtime IO (read, seek, write, repeat…)
> it’s not well suited to NFS or any network filesystem…  Due to the
> nature of many systems all doing the same thing at the same time,
> it’s not well suited to a SAN using shared disks…
>
> I looked at gfs (the cluster filesystem) – but – it seems gfs
> assumes a shared disk (like a san) in which case there is competition
> for a shared resource.
>
> I looked at gfs (the google filesystem) – but – it seems they
> constantly push all the data across the network, which is good for
> redundancy and mostly-just-read operations, and not good for heavy
> computation IO.
>
> Not sure what else I should look at.  Any ideas?
>
> TIA.
>
> _______________________________________________
> bblisa mailing list
> bblisa at bblisa.org
> http://www.bblisa.org/mailman/listinfo/bblisa



