RE: SW RAID6 ?

Rosee Vandegrift wrote:
> [Marcel wrote?]
> > How about not adding a RAID6 driver, but extending the RAID5 driver
> > some to make the number of parity stripes a variable?  I don't know
> > how much extension that would require - perhaps it would bog down the
> > RAID5 driver too much in single stripe use, so we still might be
> > better off with a separate RAID6(+) driver.  But it looks like most
> > of the code could simply be copied.
> 
> Hmmm, I think you've convinced me that, should we get RAID6(+) support,
> I'd give it a good shot.  This sounds like something my boss would be
> all about...
> 
There's a bit more to RAID6 than adding a second parity block to a stripe.
Consider the following stripe:

D1 D2 D3 P1 P2

If P1 and P2 represent the same encoding of D1 + D2 + D3 (say, XOR as in
RAID 4/5), you cannot withstand the loss of any two drives.  If two of the
data drives go down, you are still toast.
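To make that concrete, here's a toy sketch in C (made-up byte values,
nothing to do with any real driver).  With a duplicated XOR parity, once
D1 and D2 are gone you're left with one equation in two unknowns:

#include <stdio.h>

int main(void)
{
	unsigned char d1 = 0xa5, d2 = 0x3c, d3 = 0x0f;
	unsigned char p1 = d1 ^ d2 ^ d3;
	unsigned char p2 = d1 ^ d2 ^ d3;   /* same encoding -- p2 == p1 */

	/* Suppose D1 and D2 fail.  From the survivors we can only form: */
	unsigned char from_p1 = p1 ^ d3;   /* == d1 ^ d2 */
	unsigned char from_p2 = p2 ^ d3;   /* == d1 ^ d2, the same equation */

	printf("p1^d3 = 0x%02x, p2^d3 = 0x%02x -- one equation, two unknowns\n",
	       from_p1, from_p2);
	return 0;
}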

You can withstand two data drives going down by not computing both parity
blocks over the same set of data drives.  If P1 = D1+D2 and P2 = D2+D3, you
can now withstand any two data drives going down.  But if P1 and D1 go down
together, you are still toast.
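Same kind of toy sketch for the overlapping sets (again, illustrative
values only):

#include <stdio.h>

int main(void)
{
	unsigned char d1 = 0x11, d2 = 0x22, d3 = 0x33;
	unsigned char p1 = d1 ^ d2;        /* parity set {D1, D2} */
	unsigned char p2 = d2 ^ d3;        /* parity set {D2, D3} */

	/* Lose D1 and D3: each still has a surviving parity equation. */
	unsigned char r1 = p1 ^ d2;        /* rebuilds d1 */
	unsigned char r3 = p2 ^ d2;        /* rebuilds d3 */
	printf("rebuilt d1 = 0x%02x, d3 = 0x%02x\n", r1, r3);

	/* Lose D1 and D2: solve in order, d2 first, then d1. */
	unsigned char r2 = p2 ^ d3;        /* rebuilds d2 */
	printf("rebuilt d2 = 0x%02x, then d1 = 0x%02x\n", r2, p1 ^ r2);

	/* Lose P1 and D1: no surviving block mentions D1.  Toast. */
	return 0;
}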

You can get parity sets of the appropriate size (and the necessary
non-multiple intersection) by calculating parity vertically as well as
horizontally.  EVENODD utilizes this with fixed parity drives, IIRC.  The
illustration of RAID6 at http://www.acnc.com/04_01_06.html implies this,
though the illustration they show will not work (losing the first and last
drive would lose data block A0 as well as parity A and parity 0.  Rule of
thumb -- a chunk that is a member of two parity sets cannot have both
parity blocks on the same drive).  However, there ARE schemes that will
work with simple encoding.  The original RAID 6 implementation did not use
simple encoding; IIRC it used Huffman codes.
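Roughly what I mean by horizontal plus vertical parity, as a toy C sketch.
This is only in the spirit of EVENODD -- the real construction differs in
detail, and those details are what actually guarantee recovery from every
double failure; this just shows how the two parity sets get formed:

#include <stdio.h>

#define DISKS 3   /* data drives */
#define ROWS  3   /* rows per vertical parity group */

int main(void)
{
	unsigned char d[ROWS][DISKS] = {
		{ 0x01, 0x02, 0x03 },
		{ 0x04, 0x05, 0x06 },
		{ 0x07, 0x08, 0x09 },
	};
	unsigned char row_p[ROWS]  = { 0 };  /* horizontal parity drive */
	unsigned char diag_p[ROWS] = { 0 };  /* diagonal parity drive */
	int r, c;

	for (r = 0; r < ROWS; r++)
		for (c = 0; c < DISKS; c++) {
			row_p[r] ^= d[r][c];
			/* chunk (r,c) also joins diagonal set (r + c) mod ROWS */
			diag_p[(r + c) % ROWS] ^= d[r][c];
		}

	for (r = 0; r < ROWS; r++)
		printf("row %d: row_p = 0x%02x  diag_p = 0x%02x\n",
		       r, row_p[r], diag_p[r]);
	return 0;
}

Note each data chunk lands in exactly one row set and one diagonal set,
with the two parity blocks on different (fixed) drives -- which is the
rule of thumb above.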

The reason it hasn't been used (I know of no commercial RAID6
implementations) is performance, but not because of parity calculation
time, which is trivial.  RAID6 is very expensive for the same reason that
RAID 5 is expensive.  It's not the calculation of the parity, it's the
disk I/O for writes.  A small write to a RAID 5 can result in two reads
followed by two writes.  A small write to a RAID 6 can result in three
reads followed by three writes, and with vertical parity striping the
likelihood of a full-stripe write goes way down.
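The small-write path looks something like this (toy C again, assuming both
parity blocks are plain XOR over their own sets as in the sketches above --
a real second parity may use a fancier code):

#include <stdio.h>

/* Recompute one parity chunk for a small write without touching the
 * rest of the stripe: XOR the old data out, XOR the new data in. */
static unsigned char update_parity(unsigned char old_p,
				   unsigned char old_d,
				   unsigned char new_d)
{
	return old_p ^ old_d ^ new_d;
}

int main(void)
{
	unsigned char old_d = 0x5a;   /* read 1: old data */
	unsigned char p     = 0x77;   /* read 2: old parity */
	unsigned char q     = 0x99;   /* read 3: RAID6 only */
	unsigned char new_d = 0xc3;

	p = update_parity(p, old_d, new_d);  /* write 2 (write 1 is the data) */
	q = update_parity(q, old_d, new_d);  /* write 3: RAID6 only */

	printf("new p = 0x%02x, new q = 0x%02x\n", p, q);
	return 0;
}

Read the old data and both old parities (the three reads), XOR the old
data out of each parity and the new data in, then write the new data and
both new parities (the three writes).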

That said, I'd like to get around to making a RAID6 driver sometime.  I
think using small RAID6 chunk sizes and pulling in a full parity page on
each access might get respectable performance.  But it will be quite a bit
more hairy than the RAID 5 driver.

Dale Stephenson
steph@snapserver.com
