Beolach wrote:
On Sat, Nov 7, 2009 at 11:35, Doug Ledford <dledford@xxxxxxxxxx> wrote:
On 11/04/2009 01:40 PM, Leslie Rhorer wrote:
I would recommend a larger chunk size. I'm using 256K, and even
512K or 1024K probably would not be excessive.
OK, I've got some data that I'm not quite ready to send out yet, but it
maps out the relationship between max_sectors_kb (largest request size a
disk can process, which varies based upon the scsi host adapter in question,
but for SATA adapters is capped at and defaults to 512KB max per
request) and chunk size for a raid0 array across 4 disks or 5 disks (I
could run other array sizes too, and that's part of what I'm waiting on
before sending the data out). The point here being that a raid0 array
will show up more of the md/lower layer block device interactions,
whereas raid5/6 would muddy the waters with other stuff. The results of the
tests I ran were pretty conclusive that the sweet spot for chunk size is
when chunk size is == max_sectors_kb, and since SATA is the predominant
thing today and it defaults to 512K, that gives a 512K chunk as the
sweet spot. Given that the chunk size is generally about optimizing
block device operations at the command/queue level, it should transfer
directly to raid5/6 as well.
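
As a quick sanity check on your own hardware, a minimal sketch along these
lines (Python, reading sysfs; the member disk names and the mdadm line are
only examples, not a recommendation for any particular box) will report each
member's max_sectors_kb and the matching chunk size:

    #!/usr/bin/env python
    # Read max_sectors_kb for each member disk and suggest a chunk size
    # equal to the smallest value (the sweet spot described above).
    # The device list below is an example only.

    devices = ["sda", "sdb", "sdc", "sdd"]

    def max_sectors_kb(dev):
        with open("/sys/block/%s/queue/max_sectors_kb" % dev) as f:
            return int(f.read().strip())

    limits = dict((d, max_sectors_kb(d)) for d in devices)
    for dev, kb in sorted(limits.items()):
        print("%s: max_sectors_kb = %d" % (dev, kb))

    chunk = min(limits.values())
    print("suggested chunk size: %dK" % chunk)
    print("e.g. mdadm --create /dev/md0 --level=0 --raid-devices=%d "
          "--chunk=%d %s" % (len(devices), chunk,
                             " ".join("/dev/" + d for d in devices)))
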
This only really applies for large sequential io loads, right? I seem to
recall smaller chunk sizes being more effective for smaller random io loads.
Not true now (if it ever was). The operative limit here is seek time,
not transfer time. Back in the day of old and slow drives, hanging off
old and slow connections, the time to transfer the data was somewhat of
an issue. Current SATA drives and controllers have higher transfer
rates, and until SSDs make seek times smaller, bigger is better, within reason.
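
Rough numbers to illustrate (the throughput and seek figures below are
assumptions, not measurements): at a sustained ~100 MB/s a 512K chunk
transfers in about 5 ms, while a typical seek plus rotational latency on a
7200 rpm drive is on the order of 10 ms, so the seek still dominates:

    # Back-of-the-envelope: transfer time of one chunk vs. a typical seek.
    # Both figures are assumed round numbers, not measurements.
    SUSTAINED_MBPS = 100.0   # assumed sustained transfer rate, MB/s
    SEEK_MS = 10.0           # assumed avg seek + rotational latency, ms

    for chunk_kb in (64, 256, 512, 1024):
        transfer_ms = chunk_kb / 1024.0 / SUSTAINED_MBPS * 1000.0
        print("%5dK chunk: transfer %5.1f ms + seek %4.1f ms = %5.1f ms"
              % (chunk_kb, transfer_ms, SEEK_MS, SEEK_MS + transfer_ms))
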
Related question: that said, why is a six-drive raid6 slower than a
four-drive one? On a small write all the data chunks have to be read, but that
can be done in parallel, so the limit should stay at the seek time of
the slowest drive. In practice it behaves as if the data chunks were
being read one at a time. Is that real, or just fallout from a test not
long enough to smooth out the data?
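
Here's how I'm counting the I/Os, assuming a sub-stripe raid6 update is
done as a reconstruct-write (read the untouched data chunks, recompute P and
Q, write the new data chunk plus P and Q); that's an assumption about the md
code on my part, so corrections welcome:

    # Member-disk I/Os for updating one data chunk in an N-drive raid6,
    # assuming a reconstruct-write rather than a read-modify-write.
    def raid6_small_write_ops(n_drives):
        data_chunks = n_drives - 2   # P and Q occupy two chunks per stripe
        reads = data_chunks - 1      # the untouched data chunks
        writes = 3                   # new data chunk + new P + new Q
        return reads, writes

    for n in (4, 6):
        reads, writes = raid6_small_write_ops(n)
        print("%d drives: %d reads + %d writes per small write" % (n, reads, writes))

If those reads really do go out in parallel, the two extra reads on the
six-drive array should cost little more than one seek, which is what makes
the serial-looking behaviour above so puzzling.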
--
Bill Davidsen <davidsen@xxxxxxx>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein