[BBLISA] Fileserver opinion

Toby Burress kurin at delete.org
Fri Aug 13 14:56:49 EDT 2010


On Fri, Aug 13, 2010 at 02:42:12PM -0400, David Miller wrote:
> What does your zpool look like?  Ideally if you're using RAIDz or RAIDz2
> then you should be using multiple RAIDz sets in the pool.  This way IO is
> striped across the RAIDz sets and any degradation, and recovery, should
> only involve the smaller RAIDz set, which should be relatively quick
> depending on the size and type of drives involved.

The server that is taking a billion years to resilver does in fact have
15 disks in one big raidz2 pool.  The other server has a single pool of
three raidz2 arrays of 8 disks each, so hopefully that will yield better
recoveries.  Although if the bottleneck is reads, then wouldn't it be
faster to read from 14 disks than 7?  And if the bottleneck is just
writes, then wow, I need to buy some different disks next time.
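
To put rough numbers on the two layouts, here is a back-of-the-envelope
sketch.  The function name raidz_layout is made up for illustration, and
it ignores ZFS metadata, padding, and TB-vs-TiB rounding.  It assumes that
rebuilding one disk reads at most from the other members of the same raidz
vdev, which is where the "14 disks versus 7" comparison comes from.

```python
def raidz_layout(vdevs, disks_per_vdev, parity, disk_tb):
    """Return (approx usable TB, peer disks read when one disk is rebuilt)."""
    data_disks = disks_per_vdev - parity        # raidz2 -> parity = 2
    usable_tb = vdevs * data_disks * disk_tb    # ignores metadata/padding overhead
    peers = disks_per_vdev - 1                  # only the failed disk's own vdev is involved
    return usable_tb, peers

# One 15-disk raidz2 of 1.5TB drives (the slow OpenSolaris box).
wide = raidz_layout(vdevs=1, disks_per_vdev=15, parity=2, disk_tb=1.5)
# Three 8-disk raidz2 vdevs of 2TB drives (the FreeBSD box).
narrow = raidz_layout(vdevs=3, disks_per_vdev=8, parity=2, disk_tb=2.0)

print(wide)    # (19.5, 14)
print(narrow)  # (36.0, 7)
```

The 19.5 TB figure is about 17.7 TiB, which lines up with the 18T that df
reports for /backups.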

Since the load on the machine is 3, and it's doing nothing but
resilvering, I suspect the bottleneck is actually the CPU.  I don't
know a ton about the implementation of ZFS, but I do know it checksums
every block.  It would be insane for it not to verify those checksums
while resilvering, and perhaps it even recomputes them while it writes
them to the new disk.
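
For a sense of scale, the effective write rate to the new disk can be
estimated from the figures quoted below.  This is only a sketch under
loud assumptions: that the 13T df reports is usable space after parity,
that data is spread evenly over all 15 disks, and that the whole resilver
takes about six days.  The helper name resilver_rate is hypothetical.

```python
TIB = 2**40  # bytes per TiB

def resilver_rate(used_tib, data_disks, total_disks, days):
    """Approximate bytes/second written to the replacement disk."""
    raw_tib = used_tib * total_disks / data_disks    # add parity back in
    per_disk_bytes = raw_tib / total_disks * TIB     # assume an even spread across disks
    return per_disk_bytes / (days * 86400)

# Assumed inputs: 13T used, 15-disk raidz2 (13 data + 2 parity), six days.
rate = resilver_rate(used_tib=13, data_disks=13, total_disks=15, days=6)
print(f"{rate / 1e6:.1f} MB/s")  # 2.1 MB/s
```

If those assumptions are anywhere near right, the new disk is receiving
only a couple of MB/s, far below what a single drive can sustain for
sequential writes.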

> 
> I just had to resilver a mirrored pool of 250GB drives that I have in my
> home file server.  It took about 5 hours for it to resilver.  But ZFS only
> resilvers the used space and not the entire drive like with hardware
> solutions.  So ZFS should rebuild faster than most other solutions.
> --
> David
> 
> On Fri, Aug 13, 2010 at 1:07 PM, Toby Burress <kurin at delete.org> wrote:
> 
> > On Fri, Aug 13, 2010 at 12:11:51PM -0400, Rob Taylor wrote:
> > > How active is that system and how big is the drive that's resilvering?
> >
> > Right now, except for the resilver, it's doing nothing.  For about 8
> > hours a day it's copying data from production servers.
> >
> >  replacing    DEGRADED     0     0 80.1M
> >    c4t11d0p0  UNAVAIL      0     0     0  cannot open
> >    c4t7d0p0   ONLINE       0     0     0  660G resilvered
> >
> >  backups                18T   13T  4.5T  75% /backups
> >
> > It's a Random Box of Parts with 16 1.5TB disks running OpenSolaris
> > (which I've come to discover is weird and strange and I dislike it).
> > Its sister server is FreeBSD with 24 2TB disks, and I haven't yet had
> > to replace a drive, but I'm hoping it won't take six days.
> >


