[BBLISA] Fileserver opinion

Wed Aug 11 15:48:36 EDT 2010

On Wed, Aug 11, 2010 at 3:06 PM, Bill Bogstad <bogstad at pobox.com> wrote:

> On Wed, Aug 11, 2010 at 1:55 PM, Ian Stokes-Rees
> <ijstokes at crystal.harvard.edu> wrote:
> >
> > Diligent readers will recall the thread a few weeks ago on slow disk
> > performance with a PATA XRaid system from Apple (HFS, RAID5).  Having
> > evaluated the situation, we're looking to get a new file server that
> > combines some fast disk with some bulk storage.  We have a busy web
> > server that is mostly occupied with serving static content (read only
> > access), some dynamic content (Django portal with mod_python/httpd), and
> > then scientific compute users who do lots of writes (including a 100
> > core cluster).
> >...
> >[LOTS of details about hardware ideas, etc.]
> >...
> >
> > 4. How can we estimate our IOPs and throughput requirements?
>
> I think this is THE most important question.   All the other answers
> are completely dependent on this one.  You need to attach specific
> numbers (with error bars) to current usage as well as estimate future
> changes.  "busy web server", "some dynamic content", "lots of writes"
> are based on experience/context.
>
> I wish I could give you specific suggestions on tools to gather this
> information, but that's going to be very dependent on your situation.
>  One generic thing I would suggest is to analyze the log files
> for your web server.   You want to get an idea of what the "working
> set" size is for the web site.  If the number is small enough you
> should consider memory caching or possibly SSDs in the web server
> itself rather than doing something on the file server.
>
> Good Luck,
> Bill Bogstad
>

I agree with Bill here.  Knowing what your workloads require is the first
question to answer when trying to spec a solution like this.  The only real
way to get those numbers is to look at historical data, if you have it:
either RRD graphs from a tool like Cacti, Munin, or Zenoss, or output from a
data collector like sar.
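
Bill's log-analysis suggestion could be sketched roughly as follows.  This
assumes the web server writes Apache Common or Combined Log Format; the
function name working_set_bytes and the "largest size seen per path" rule
are illustrative choices of mine, not something from the thread.

```python
import re

# Request path, status, and response size as they appear in
# Common/Combined Log Format (assumed; adjust for a custom LogFormat).
LOG_RE = re.compile(r'"GET (\S+)[^"]*" (\d{3}) (\d+|-)')


def working_set_bytes(lines):
    """Return (distinct_paths, total_bytes) for successfully served GETs.

    Each path is counted once, at the largest response size seen for it,
    as a rough stand-in for how much data a cache would need to hold.
    """
    sizes = {}
    for line in lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        path, status, size = match.groups()
        if status != "200" or size == "-":
            continue  # skip errors, redirects, and 304s with no body
        sizes[path] = max(sizes.get(path, 0), int(size))
    return len(sizes), sum(sizes.values())
```

Feeding it one day or one week of logs at a time, rather than the whole
archive, probably gives a number closer to a real working set.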

If you don't have historical data to look at, you can use iostat, from the
sysstat package, to find out how much data each device has read and written
since the last reboot.  You can use those values to at least estimate your
read/write ratio.  You can also run iostat or sar for a while to gather some
shorter-term min, max, and average figures.
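
As a rough sketch of the read/write ratio estimate, assuming a Linux host:
the since-boot totals that iostat reports come from the kernel counters in
/proc/diskstats, which can also be read directly.  The function names
parse_diskstats, read_fraction, and report are made up for illustration.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device -> (reads, bytes_read, writes, bytes_written) since boot."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # not a full per-device statistics line
        name = fields[2]
        reads, sectors_read = int(fields[3]), int(fields[5])
        writes, sectors_written = int(fields[7]), int(fields[9])
        totals[name] = (reads, sectors_read * SECTOR_BYTES,
                        writes, sectors_written * SECTOR_BYTES)
    return totals


def read_fraction(bytes_read, bytes_written):
    """Fraction of all transferred bytes that were reads (0.0 if idle)."""
    total = bytes_read + bytes_written
    return bytes_read / total if total else 0.0


def report(path="/proc/diskstats"):
    """Print the since-boot read fraction for every device (Linux only)."""
    with open(path) as f:
        for dev, (_, br, _, bw) in sorted(parse_diskstats(f.read()).items()):
            print(f"{dev}: {read_fraction(br, bw):.0%} of bytes were reads")
```

Calling report() on the current file server prints one line per device.
Partitions show up alongside whole disks, so pick the devices that matter.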

Figuring out how many IOPS your workloads require is a tougher nut to crack,
especially if your existing environment has severe bottlenecks.  These
bottlenecks could exist in the network, storage, CPU, etc.  Historical data
helps here too, since it at least gives you average, max, and min values
over a given time frame.  Hopefully it also identifies your existing
bottlenecks clearly, so that you know you're attacking the right problem.
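
To make the min/max/average idea concrete, here is a minimal sketch.  It
assumes you have periodic snapshots of a cumulative completed-I/O counter
(for example reads plus writes completed, sampled every few seconds); the
snapshot format and the function names are hypothetical.

```python
from statistics import mean


def iops_from_snapshots(snapshots):
    """Turn time-ordered (seconds, cumulative_ios) pairs into IOPS rates."""
    rates = []
    for (t0, c0), (t1, c1) in zip(snapshots, snapshots[1:]):
        if t1 > t0 and c1 >= c0:  # skip bad clocks and counter resets
            rates.append((c1 - c0) / (t1 - t0))
    return rates


def summarize(rates):
    """Min, max, and average of a non-empty list of IOPS samples."""
    return {"min": min(rates), "max": max(rates), "avg": mean(rates)}
```

Keep in mind that on a bottlenecked system these figures show what the
hardware managed to deliver, which may be less than what the workload
actually wanted.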
--
David