My apologies for diving in so late.

I routinely run 24-drive RAID-5 sets with SSDs. The chunk size is set at 32K and the application only writes "perfect" 736K stripes (23 data chunks x 32K). The SSDs are Samsung 850 Pros on dedicated LSI 3008 SAS ports and are at "new" preconditioning (i.e., they are at full speed), which is just over 500 MB/sec each. The CPU is a single E5-1650 v3. With the stock RAID-5 code I get about 1.8 GB/sec at q=4.

Now, this application writes from kernel space (generic_make_request, with the queue waiting on the completion callback). There are a lot of RMW operations happening here. I think the raid-5 background thread is waking up asynchronously when only part of the write has been buffered into stripe cache pages. The bio going into the raid layer is a single bio, so nothing is being carved up on the request end. The raid-5 helper thread also saturates a CPU core (and an E5-1650 core is about as fast as you can get).

If I patch raid5.ko with special-case code to avoid the stripe cache and just compute parity and go, the write throughput goes up above 11 GB/sec. This is obviously an impossible IO pattern for most applications, but it does confirm that the upper limit of (n-1)*bw is "possible" (23 x ~500 MB/sec, or roughly 11.5 GB/sec here), just not with the current stripe cache logic in the raid layer.

Doug Dumitru
WildFire Storage
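
P.S. In case it helps, the submission path is basically the pattern below. This is only a simplified sketch of what the application does, not the actual code; it assumes a roughly 4.8-era bio API (single-argument bi_end_io, bio->bi_bdev, bio_set_op_attrs), and names like submit_full_stripe and the page array are placeholders. Error handling in the completion path is omitted.

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/wait.h>
#include <linux/atomic.h>

/* Keep at most 4 full-stripe writes in flight (the q=4 above). */
static atomic_t inflight = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(inflight_wq);

/* Completion callback; error handling omitted for brevity. */
static void stripe_end_io(struct bio *bio)
{
	bio_put(bio);
	atomic_dec(&inflight);
	wake_up(&inflight_wq);
}

/*
 * Submit one "perfect" 736K stripe (23 data chunks x 32K = 184 pages)
 * as a single bio starting at 'sector' on the md device 'bdev'.
 * Placeholder sketch, not the production code.
 */
static int submit_full_stripe(struct block_device *bdev, sector_t sector,
			      struct page **pages, int npages)
{
	struct bio *bio;
	int i;

	/* Throttle to queue depth 4 and wait on completions. */
	wait_event(inflight_wq, atomic_read(&inflight) < 4);

	bio = bio_alloc(GFP_NOIO, npages);
	if (!bio)
		return -ENOMEM;

	bio->bi_bdev = bdev;		/* bio_set_dev() on newer kernels */
	bio->bi_iter.bi_sector = sector;
	bio->bi_end_io = stripe_end_io;
	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);

	for (i = 0; i < npages; i++) {
		if (bio_add_page(bio, pages[i], PAGE_SIZE, 0) != PAGE_SIZE) {
			bio_put(bio);
			return -EIO;
		}
	}

	atomic_inc(&inflight);
	generic_make_request(bio);	/* hands the whole stripe to md/raid456 */
	return 0;
}

The point is that md sees one full-stripe bio at a time with at most 4 outstanding, so any RMW that shows up has to come from the stripe cache handling inside raid5, not from the submitter carving things up.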