RE: best base / worst case RAID 5,6 write speeds

Hey Doug,

I would be interested in seeing the patch you're talking about.  I wonder if that code couldn't be turned on/off with a tuning parameter or module param.

Bob Kierski
Senior Storage Performance Engineer
Cray Inc.
380 Jackson Street
Suite 210
St. Paul, MN 55101
Tele: 651-967-9590
Fax:  651-605-9001
Cell: 651-890-7461


-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Doug Dumitru
Sent: Tuesday, December 22, 2015 12:16 AM
Cc: Linux-RAID
Subject: Re: best base / worst case RAID 5,6 write speeds

My apologies for diving in so late.

I routinely run 24-drive RAID-5 sets with SSDs.  Chunk is set at 32K and the application only writes "perfect" 736K stripes.  The SSDs are Samsung 850 Pros on dedicated LSI 3008 SAS ports and are at "new" preconditioning (i.e., they are at full speed), or just over 500 MB/sec each.  The CPU is a single E5-1650 v3.
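
The 736K figure follows from the geometry: 24 drives minus one parity chunk per stripe leaves 23 data chunks of 32K each. A minimal sketch of that arithmetic, plus an alignment check for "perfect" writes. The function names are made up for illustration and are not taken from the md code.

```python
KIB = 1024

def full_stripe_bytes(n_drives: int, chunk_bytes: int, parity_drives: int = 1) -> int:
    """Data payload of one full stripe: one chunk per data drive."""
    data_drives = n_drives - parity_drives
    return data_drives * chunk_bytes

def is_full_stripe_write(offset: int, length: int, stripe_bytes: int) -> bool:
    """True when a write starts on a stripe boundary and covers whole stripes."""
    return length > 0 and offset % stripe_bytes == 0 and length % stripe_bytes == 0

# 24-drive RAID-5 with a 32K chunk, as described above.
stripe = full_stripe_bytes(n_drives=24, chunk_bytes=32 * KIB)
print(stripe // KIB)                                  # 736
print(is_full_stripe_write(0, stripe, stripe))        # True
print(is_full_stripe_write(32 * KIB, stripe, stripe)) # False: starts mid-stripe
```

Passing parity_drives=2 gives the RAID-6 equivalent (22 data chunks, 704K).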

With stock RAID-5 code, I get about 1.8 GB/sec, q=4.

Now this application is writing from kernel space (generic_make_request w/ q waiting for completion callback).  There are a lot of RMW (read-modify-write) operations happening here.  I think the raid-5 background thread is waking up asynchronously when only part of the write has been buffered into stripe cache pages.  The bio going into the raid layer is a single bio, so nothing is being carved up on the request end.  The raid-5 helper thread also saturates a CPU core (which is about as fast as you can get with an E5-1650).
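
For readers less familiar with the term, a rough sketch of what a read-modify-write parity update costs for a single chunk. This is the textbook RAID-5 identity, not the actual raid5.c logic, and the names are illustrative only.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data chunk:
    P_new = P_old XOR D_old XOR D_new."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Device I/Os for updating one chunk via RMW: the old data and old parity
# must be read before the new data and new parity can be written.
RMW_READS_PER_CHUNK = 2
RMW_WRITES_PER_CHUNK = 2

# Tiny 3-data-chunk stripe to show the identity holds.
d = [bytes([1] * 8), bytes([2] * 8), bytes([4] * 8)]
parity = xor_bytes(xor_bytes(d[0], d[1]), d[2])
new_chunk = bytes([9] * 8)
updated = rmw_parity(parity, d[1], new_chunk)
recomputed = xor_bytes(xor_bytes(d[0], new_chunk), d[2])
print(updated == recomputed)  # True
```

The point is the extra reads: if the helper thread handles a stripe before all of its data has arrived, it has to read from the drives even though the full stripe was on its way.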

If I patch raid5.ko with special-case code to avoid the stripe cache and just compute parity and go, the write throughput goes up above 11 GB/sec.

This is obviously an impossible I/O pattern for most applications, but it does confirm that the upper limit of (n-1)*bw is "possible", just not with the current stripe cache logic in the raid layer.
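
Plugging the numbers from this message into (n-1)*bw, as a quick check. The 500 MB/sec per-drive figure is rounded down from "just over 500", so the real ceiling is slightly higher; the function name is illustrative.

```python
def raid5_write_ceiling_mb_s(n_drives: int, per_drive_mb_s: float) -> float:
    """Upper bound on RAID-5 full-stripe write throughput: (n - 1) * bw.
    One drive's worth of bandwidth per stripe goes to parity."""
    return (n_drives - 1) * per_drive_mb_s

ceiling = raid5_write_ceiling_mb_s(n_drives=24, per_drive_mb_s=500)
stock = 1800  # MB/sec reported with the stock RAID-5 code at q=4

print(ceiling)                        # 11500
print(round(100 * stock / ceiling))   # 16 (percent of the ceiling)
```

So the patched result of just over 11 GB/sec sits essentially at the ceiling, while the stock code reaches roughly a sixth of it.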

Doug Dumitru
WildFire Storage