[BBLISA] Fileserver opinion

Ian Levesque ian at crystal.harvard.edu
Wed Aug 11 23:37:04 EDT 2010


Hi Ian,

> 1. Should we consider running a VM on this same server and host e.g. the
> web server on a VM which accesses files through the virtualization
> layer, rather than a physical network interconnect.

If the file server is central storage for other systems on the network, you should configure your NAS to be as simple a file server as possible. That is, if this storage is of any importance, minimize the number of additional tasks it performs. That way, there are fewer chances that it'll crash, run out of resources, or get bogged down when the web server gets bombed. If you want low-overhead storage that is as portable as the VM itself, serve the VM its data via iSCSI.


> 2. What combination of network filesystem and local file system
> combination makes sense? (currently NFS + ext4 is on the cards)

If you're serving data to a Linux cluster, you'll want to use NFS. As for the local filesystem, that largely depends on the workload. I've been using XFS for years now and am quite pleased with its stability and general performance. It's possible to fine-tune it to align stripes with your RAID configuration. That said, XFS was designed with large files in mind. Try a few filesystems out, benchmark each, and stick with the one that works best. I'm still hesitant to recommend ext4 for any critical systems.
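To make the stripe alignment point concrete, here is a rough sketch that works out the `su` (stripe unit) and `sw` (stripe width, in data disks) values that `mkfs.xfs -d` accepts. The RAID level, the 64 KiB chunk size, and the `/dev/sdX` device name are assumptions for illustration only; the 11 drives come from your question. Check the numbers against what your RAID controller actually reports before using them.

```python
def xfs_stripe_options(raid_level, total_disks, chunk_kib):
    """Return a 'su=...,sw=...' string for mkfs.xfs -d.

    su is the RAID chunk size, sw is the number of data-bearing disks.
    """
    parity_disks = {0: 0, 5: 1, 6: 2}  # disks lost to parity per level
    if raid_level == 10:
        data_disks = total_disks // 2  # half the disks hold mirror copies
    elif raid_level in parity_disks:
        data_disks = total_disks - parity_disks[raid_level]
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if data_disks < 1:
        raise ValueError("not enough disks for this RAID level")
    return f"su={chunk_kib}k,sw={data_disks}"


# Assumed layout: 11 drives in RAID 6 with a 64 KiB chunk.
# /dev/sdX is a placeholder, not a real device.
print("mkfs.xfs -d " + xfs_stripe_options(6, 11, 64) + " /dev/sdX")
```

Whether alignment buys you anything measurable depends on the workload, so it is worth benchmarking with and without it.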


> 3. Should we consider alternatives to GigE for interconnect.

Depends on your workload and budget. But it sounds like your budget is dictating GigE.


> 4. How can we estimate our IOPs and throughput requirements?

A tough one, as others have mentioned. Do you imagine those 100 cores [reading|writing] [large|small] files from the server at gigabit speed, simultaneously? And if so, what's an acceptable rate of data transfer for each job? What other infrastructure are you supporting that has network storage requirements, besides the cluster and web server?
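A back-of-the-envelope way to start on the throughput side: divide the server's link among the jobs and see whether the per-job rate is tolerable. The sketch below assumes a single GigE uplink on the server, 100 jobs all active at once, and a 90% usable fraction of the line rate; the 0.9 factor and the 1000 MB file size are made-up illustrative values, not measurements.

```python
def per_job_throughput_mb_s(link_gbit, jobs, efficiency=0.9):
    """Fair-share throughput per job in MB/s on a shared server link.

    efficiency is an assumed usable fraction of the raw line rate.
    """
    link_mb_s = link_gbit * 1000 / 8 * efficiency  # Gbit/s -> MB/s
    return link_mb_s / jobs


def seconds_to_transfer(file_mb, rate_mb_s):
    """Time to move one file at a steady rate."""
    return file_mb / rate_mb_s


rate = per_job_throughput_mb_s(link_gbit=1, jobs=100)
print(f"per-job rate: {rate:.3f} MB/s")
print(f"1000 MB file: {seconds_to_transfer(1000, rate):.0f} s")
```

If the per-job rate that comes out is unacceptable for your jobs, that tells you more about whether GigE is enough than any general rule will.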


> 5. Perspectives on SLC SSDs vs. SAS2 w/ 15k drives, since we could
> probably transfer the 11x300 GB SAS2 drive budget to a collection of
> SSDs and live with the reduced storage if that was expected to have a
> big performance benefit.

SSDs give you absurdly fast random reads/writes, lower access time/latency and sometimes higher throughput. They're cooler to run and consume less power. That said, they're much more expensive per GB than spinning disks and have lower maximum capacity, especially the SLCs. See http://en.wikipedia.org/wiki/Solid-state_drive#Comparison_of_SSD_with_hard_disk_drives
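For a rough idea of what the spinning disks can do on random I/O, you can estimate IOPS from seek time plus rotational latency. This is only a sketch: the 3.5 ms average seek is an assumed figure (use the number from your drive's datasheet), and the array total ignores RAID write penalties, caching, and queueing. Compare the result with the rated random IOPS of whichever SSDs you are considering.

```python
def hdd_random_iops(rpm, avg_seek_ms):
    """Estimate random IOPS for one spinning disk.

    Average rotational latency is the time for half a revolution.
    """
    rotational_latency_ms = 60_000 / rpm / 2
    return 1000 / (avg_seek_ms + rotational_latency_ms)


# Assumed: 15k RPM drives with a 3.5 ms average seek (check the datasheet).
per_disk = hdd_random_iops(rpm=15_000, avg_seek_ms=3.5)
print(f"per disk:  {per_disk:.0f} IOPS")
# Best case for 11 drives; real arrays lose some of this to parity writes.
print(f"11 drives: {11 * per_disk:.0f} IOPS")
```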

It sounds like you could use a local sysadmin's guidance. I seem to remember you mentioning in another email that you have some around your office?

Best,
Ian


