[BBLISA] ZFS Anyone?

Sat Apr 4 00:18:47 EDT 2009

>     * Does anyone have experience with ZFS?  What has your experience
> been?

I am also new to ZFS, but I agree it's the greatest thing since the
transistor.

>     * Is it stable/reliable/etc.  Honestly, before this experience LVM2
>       has been rock solid for me across many deployments.

Speaking from everyone else's experience and not my own: yes, it's totally
stable and solid.  There are a few differences in using it that may catch
you off guard if it's new to you.  For example, if you are booting from a
mirror, you have to install grub on each of the disks in the mirror,
because only the partition is actually mirrored, not the boot blocks.

>     * What are the worst/most annoying parts of dealing with ZFS?

Solaris.
I mean, seriously.  What kind of system STILL ships without a working
backspace or up arrow key by default?  And all these little annoying things,
like "there is no default bashrc" and "by default, root's home directory is
/" and "We still haven't adopted gnu tar."  But you can deal with all of
those things without too much trouble.

The Solaris folks are quaking in their boots right now, for fear of an IBM
takeover.

Why is ZFS constrained to Solaris?  Because its license terms (CDDL) are
incompatible with the Linux kernel's GPL.  Free and open source, yes.  Not
legal to build into the Linux kernel.

Choose your backup system wisely.  Since it's so non-mainstream, it's
difficult to assess what's a good backup system.

>     * Why should I use it?

Snapshots give users a safety net: 99% of the time they can restore their
own files without asking you to restore from tape.  Really good feature.

Add disks to expand your filesystem on the fly.  No dismount/resize/remount.
No slowdown.  Just do it.

Checksumming.  I love this one.  Whether you notice it or not, all disks
make undetected mistakes, accidentally writing a 1 when they meant to write
a 0 or whatnot.  Typical mean time between failures might be 25,000 hours
for a single disk, but you've got 10 disks working as a team, which means
your combined MTBF is about 2,500 hours, well under a year.  You'll
unknowingly suffer from a disk error this year.  And every year.  Better
hope it doesn't cost you.
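
For anyone who wants to check the arithmetic, here is a rough sketch in
Python.  It assumes the disks fail independently at a constant rate, so
their failure rates simply add.  The 25,000-hour figure is the illustrative
number from above, not a measured one, and the function name is made up for
this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def combined_mtbf_hours(single_disk_mtbf_hours, disk_count):
    """MTBF of a group of disks, assuming independent, constant-rate failures.

    Under that assumption the failure rates add, so the group's MTBF is
    the single-disk MTBF divided by the number of disks.
    """
    return single_disk_mtbf_hours / disk_count


group_mtbf = combined_mtbf_hours(25_000, 10)
print(f"group MTBF: {group_mtbf:.0f} hours, about {group_mtbf / 24:.0f} days")
print(f"expected errors per year: {HOURS_PER_YEAR / group_mtbf:.1f}")
```

With those inputs the group MTBF comes out to 2,500 hours, which is well
under the 8,760 hours in a year.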

Enter checksumming ...  In the demo, they create a bunch of data on ZFS
disks, inject random bits into the disks, and everything is fine.  Behind
the scenes, block-level checksums are stored when data is written, and
checked again during reads.  When one disk in the mirror has a bad checksum
and the other disk has a good checksum, the system silently discards the
corrupt data and rewrites it from the good copy.  It doesn't flag the disk
as bad unless this happens too many times.
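
To make the self-healing idea concrete, here is a toy model in Python.  It
is only a sketch of the behavior described above, not how ZFS is actually
implemented: the ToyMirror class, its dict-based "disks", and the choice of
SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is used here for convenience only (an assumption of this sketch).
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Two-way mirror that keeps one checksum per block, apart from the data."""

    def __init__(self):
        self.disks = [{}, {}]  # one dict per disk: block number -> bytes
        self.checksums = {}    # block number -> checksum recorded at write time
        self.repairs = 0       # how many bad copies have been rewritten

    def write(self, block_no, data):
        for disk in self.disks:
            disk[block_no] = data
        self.checksums[block_no] = checksum(data)

    def read(self, block_no):
        expected = self.checksums[block_no]
        # Find the first copy whose contents still match the stored checksum.
        good = None
        for disk in self.disks:
            if checksum(disk[block_no]) == expected:
                good = disk[block_no]
                break
        if good is None:
            raise IOError(f"block {block_no}: no copy matches its checksum")
        # Silently rewrite any copy that failed verification.
        for disk in self.disks:
            if checksum(disk[block_no]) != expected:
                disk[block_no] = good
                self.repairs += 1
        return good


mirror = ToyMirror()
mirror.write(0, b"important data")
mirror.disks[0][0] = b"important dbta"  # simulate silent corruption on disk 0
assert mirror.read(0) == b"important data"      # the read still succeeds
assert mirror.disks[0][0] == b"important data"  # and the bad copy was fixed
print("repairs:", mirror.repairs)
```

The repairs counter stands in for the "too many times" bookkeeping: a real
system would compare it against some threshold before declaring the disk
bad, and that threshold is not something this sketch tries to guess.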