On Wed, Oct 28, 2020 at 6:39 PM Vitaly Mayatskih <v.mayatskih@xxxxxxxxx> wrote:
>
>On Thu, Oct 22, 2020 at 2:56 AM Finlayson, James M CIV (USA) <james.m.finlayson4.civ@xxxxxxxx> wrote:
>>
>> All,
>> I'm working on creating raid5 or raid6 arrays of 800K IOP nvme drives. Each of
>> the drives performs well with a queue depth of 128 and I set to 1023 if allowed.
>> In order for me to try to max out the queue depth on each RAID member, so I'd like
>> to set the sysfs nr_requests on the md device to something greater than 128, like
>> #raid members * 128. Even though /sys/block/md127/queue/nr_requests is mode 644,
>> when I try to change nr_requests in any way as root, I get write error: invalid
>> argument. When I'm hitting the md device with random reads, my nvme drives are
>> 100% utilized, but only doing 160K IOPS because they have no queue depth.
>> Am I doing something silly?
>
>It only works for blk-mq block devices. MD is not blk-mq.
>
>You can exchange simplicity for performance: instead of creating one
>RAID-5/6 array you can partition drives in N equal sized partitions,
>create N RAID-5/6 arrays using one partition from every disk, then
>stripe them into top-level RAID-0. So that would be RAID-5+0 (or 6+0).
>
>It is awful, but simulates multiqueue and performs better in parallel
>loads. Especially for writes (on RAID-5/6).
>
>
>--
>wbr, Vitaly

Vitaly,
Thank you for the tip. My raid5 performance (after creating 32 partitions
per SSD and running 64 9+1 (2 in reality) stripes) is up to 11.4M 4K random
read IOPS, out of the 17M that the box is capable of. I'm happy with that,
because I can't NUMA the raid stripes as I would the individual SSDs
themselves. However, when I perform the RAID0 striping to make the "RAID50
from hell", my performance drops to 7.1M 4K random read IOPS.

Any suggestions? The last RAID50, again, won't let me generate the queue
depth.

Thanks in advance,
Jim
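
For anyone following along, the layered layout Vitaly describes (N inner RAID-5/6 arrays, one partition from every disk each, striped into a top-level RAID-0) can be sketched roughly as below. This is only a dry-run helper that prints mdadm command lines; it does not run them. The device names (/dev/nvme0n1, /dev/md1, ...), the partition naming (p1, p2, ...), and the function name raid50_commands are assumptions for illustration, not the layout actually used on my box, and the partitioning step itself is left out.

```python
def raid50_commands(drives, parts_per_drive, level=5):
    """Build dry-run mdadm command strings for a RAID-5+0 (or 6+0) layout.

    drives          -- list of whole-disk device paths (hypothetical names)
    parts_per_drive -- N, the number of equal-sized partitions on every drive
    level           -- inner RAID level, 5 or 6
    """
    commands = []
    inner_arrays = []
    for part in range(1, parts_per_drive + 1):
        md_name = f"/dev/md{part}"  # hypothetical inner array name
        # One partition from every disk goes into each inner array.
        members = [f"{drive}p{part}" for drive in drives]
        commands.append(
            f"mdadm --create {md_name} --level={level} "
            f"--raid-devices={len(members)} " + " ".join(members)
        )
        inner_arrays.append(md_name)
    # Stripe the N inner arrays into a single top-level RAID-0.
    commands.append(
        f"mdadm --create /dev/md0 --level=0 "
        f"--raid-devices={len(inner_arrays)} " + " ".join(inner_arrays)
    )
    return commands


if __name__ == "__main__":
    # Example: 10 drives with 4 partitions each (both numbers are made up).
    example_drives = [f"/dev/nvme{i}n1" for i in range(10)]
    for command in raid50_commands(example_drives, 4):
        print(command)
```

The commands are only printed so they can be reviewed before anything touches a disk; chunk size, NUMA placement, and other tuning options are deliberately not covered here.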