[BBLISA] ZFS Anyone?

Edward Ned Harvey bblisa3 at nedharvey.com
Sun Apr 5 10:43:21 EDT 2009


> wouldn't a concatenation of mirrors imply that you write sequentially
> to a mirror until full before using the next mirror?  where's the
> benefit in that?  io to my raid10 is striped thank you very much.

 

I'm so glad you asked.  :-)  I've done a bunch of work on this recently,
specifically as it applies to ZFS.  Some of this info I thought was common
knowledge and have been surprised to discover really isn't.  So forgive me
if the following is too basic at first - then you can just skip to the juicy
stuff at the end.

 

Originally, hard drives were addressed by Cylinder, Head, Sector.  CHS.
Before long, they invented the concept of Logical Block Address, LBA, in
which the physical geometry of the disk doesn't matter; each block gets a
sequential logical address.  So the drive didn't need to be thought of as a
drive anymore; it could be thought of as simply a random access data stream.

 

Then they invented all sorts of different types of RAID.  The two I'm
talking about here are Striping and Concatenation.

 

In Striping, the logical blocks are distributed amongst a bunch of equally
sized disks, as follows:

      Disk0 Disk1 Disk2 Disk3

      LB0   LB1   LB2   LB3

      LB4   LB5   LB6   LB7   

      ... And so on.

This way, if you have a large sequential read or write, each of the n disks
can contribute to the data stream, and you get up to n times the sustainable
throughput of a single drive.
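
To make the striping layout concrete, here is a minimal Python sketch of the
mapping in the diagram above.  It assumes a stripe unit of exactly one block,
as in the diagram; real arrays usually stripe in larger chunks.  The function
name is just something I made up for illustration.

```python
def stripe_location(lb, num_disks):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a stripe unit of one block and equally sized disks.
    """
    return lb % num_disks, lb // num_disks


# Rebuild the two rows of the diagram for 4 disks.
for lb in range(8):
    disk, offset = stripe_location(lb, 4)
    print(f"LB{lb} -> Disk{disk}, offset {offset}")
```

Consecutive logical blocks always land on consecutive disks, which is why a
large sequential transfer keeps every disk busy.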

 

In Concatenation, the logical blocks sequentially fill up each disk before
going into the next disk.  Suppose each disk has a size of 100 blocks
(unrealistic, but easy for me to diagram), as follows:

      Disk0 Disk1 Disk2 Disk3

      LB0   LB100 LB200 LB300

      ...   ...   ...   ...

      LB99  LB199 LB299 LB399
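
The same kind of sketch works for concatenation.  This one takes a list of
disk sizes in blocks, so the disks are allowed to differ in size.  Again, the
function name and the list-of-sizes representation are my own assumptions for
illustration, not any particular volume manager's interface.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_location(lb, disk_sizes):
    """Map a logical block number to (disk index, offset) in a concatenation.

    disk_sizes lists each disk's capacity in blocks; sizes may differ.
    """
    ends = list(accumulate(disk_sizes))  # first logical block past each disk
    if not 0 <= lb < ends[-1]:
        raise ValueError("logical block out of range")
    disk = bisect_right(ends, lb)         # first disk whose end is beyond lb
    start = ends[disk] - disk_sizes[disk]  # first logical block on that disk
    return disk, lb - start


# The four 100-block disks from the diagram.
sizes = [100, 100, 100, 100]
for lb in (0, 99, 100, 399):
    print(lb, concat_location(lb, sizes))
```

With the sizes above, LB99 is the last block of Disk0 and LB100 is the first
block of Disk1, matching the diagram.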

 

One more piece of background information:  Gone are the days when
filesystems would attempt to position all the files close together at the
beginning of the physical device.  For the last several years, modern
filesystems have distributed the files throughout the physical device, in an
attempt to maximize the empty space between files and minimize fragmentation
within each file.

 

Now - Some of the fundamental differences between Striping and
Concatenation:

1.  In Striping, the disk sizes must match each other.  

2.  In Striping, you cannot expand by adding disks.

3.  In Striping, you get a performance benefit for large sequential IO by
distributing the physical blocks across multiple devices.

4.  In Striping, you get a performance reduction for files which are larger
than 1 block and smaller than n blocks, because such a file spans several
disks, every one of those disks has to seek, and the time to read 1 block is
much smaller than the time to seek the head to the block.

5.  In Concatenation, you can expand as you wish, using new disks of any
size.

6.  In Concatenation, you get a performance benefit for small files, because
the files are scattered about within the filesystem, so n disks can be
seeking simultaneously, each one serving a different file.
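
Points 4 and 6 are easier to see if you count which disks a small file
touches under each layout.  A rough sketch, assuming a one-block stripe unit
and equally sized disks (the helper names are hypothetical):

```python
def disks_touched_striped(first_lb, length, num_disks):
    """Disks that must seek to read `length` blocks starting at first_lb."""
    return {lb % num_disks for lb in range(first_lb, first_lb + length)}


def disks_touched_concat(first_lb, length, blocks_per_disk):
    """Same question for a concatenation of equally sized disks."""
    return {lb // blocks_per_disk for lb in range(first_lb, first_lb + length)}


# A 3-block file starting at LB40, on 4 disks of 100 blocks each.
print(disks_touched_striped(40, 3, 4))   # {0, 1, 2}
print(disks_touched_concat(40, 3, 100))  # {0}
```

In the striped case three disks all have to seek for one small file.  In the
concatenated case only Disk0 seeks, and the other three are free to go fetch
other files at the same time.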

 

And now the juicy stuff - 

 

If you have a filesystem such as ZFS, which is aware that it is using a
concatenated disk set, the filesystem itself can gain the large sequential
IO benefits of striping, by fragmenting the file and writing sequential
blocks of the file to non-sequential logical blocks in such a way that the
logical blocks are distributed among multiple physical devices.  In this
way, you can use a concatenation set, with an intelligent filesystem, to
gain all the benefits of both striping and concatenation:

1.  The disk sizes don't need to match each other

2.  You can add any size disk to an existing volume

3.  You have optimal performance for small files

4.  You have optimal performance for large files

For this reason, ZFS does not offer striping as an option.  Many people
mistakenly say "striping" when talking about ZFS concatenation.  It's a
common misnomer.
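
Here is a toy sketch of the idea, to show how a filesystem that knows where
the disk boundaries are can spread a large file across a concatenation.  To
be clear, this is NOT ZFS's actual allocator; the simple round-robin policy,
the function name, and the next_free bookkeeping are all assumptions I'm
making for illustration.

```python
from itertools import accumulate


def spread_file(num_blocks, disk_sizes, next_free):
    """Pick logical blocks for a file so consecutive blocks hit different disks.

    disk_sizes: capacity of each disk in blocks (sizes may differ).
    next_free:  next unused offset on each disk; updated in place.
    Returns the logical block chosen for each file block, in file order.
    """
    # First logical block of each disk within the concatenation.
    starts = [end - size
              for end, size in zip(accumulate(disk_sizes), disk_sizes)]
    free_total = sum(size - used for size, used in zip(disk_sizes, next_free))
    if num_blocks > free_total:
        raise RuntimeError("not enough free space")
    placed = []
    disk = 0
    while len(placed) < num_blocks:
        if next_free[disk] < disk_sizes[disk]:  # skip disks that are full
            placed.append(starts[disk] + next_free[disk])
            next_free[disk] += 1
        disk = (disk + 1) % len(disk_sizes)     # hypothetical round-robin
    return placed


# Six file blocks on four empty 100-block disks.
print(spread_file(6, [100, 100, 100, 100], [0, 0, 0, 0]))
# [0, 100, 200, 300, 1, 101]
```

The file's blocks are sequential from the file's point of view, but the
logical blocks chosen are deliberately non-sequential, so each one lands on a
different physical disk.  Mismatched disk sizes are fine: a full disk is
simply skipped.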

 

The only ingredient which is missing is redundancy.  The way to solve this
while maintaining optimal performance is to use a concatenation of mirrors.


 

Comparing a concatenation of mirrors versus raidz or raidz2 (assuming you
have a hotspare, and want n disks' worth of usable capacity):

1.  Concatenation of mirrors achieves optimal performance using 2n+1 disks

2.  Raidz achieves somewhat less performance using n+2 disks

3.  Raidz2 achieves still lower performance using n+3 disks

Using the concatenation of mirrors, you get maximum performance and
flexibility, but you must buy a higher number of disks.  The only reason to
use raidz or raidz2 instead is to limit the number of disks required.
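
The disk counts are simple arithmetic, sketched below.  I'm reading n as the
number of disks' worth of usable capacity, with one hotspare in every case;
the function name is made up.

```python
def disks_needed(n):
    """Total disks for n disks' worth of usable space, plus one hotspare."""
    return {
        "mirrors": 2 * n + 1,  # n mirrored pairs + 1 spare
        "raidz": n + 2,        # n data + 1 parity + 1 spare
        "raidz2": n + 3,       # n data + 2 parity + 1 spare
    }


print(disks_needed(4))
# {'mirrors': 9, 'raidz': 6, 'raidz2': 7}
```

So for 4 disks' worth of space, mirrors cost you 9 disks where raidz costs 6.
The gap grows with n, which is the whole tradeoff.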

 
