[RFC V2 1/2] md/raid5: optimize RAID5 performance.

Yiming Xu <teddyxym@xxxxxxxxxxx> · Wed, 3 Apr 2024 01:05:46 +0800

From: Shushu Yi <firnyee@xxxxxxxxx>

<changelog>

Optimized by using fine-grained locks, customized data structures, and
scattered address space. Achieves significant improvements in both
throughput and latency.

This patch attempts to maximize thread-level parallelism and reduce
CPU suspension time caused by lock contention. On a system with four
PCIe 4.0 SSDs, we achieved increased overall storage throughput by
89.4% and decreases the 99.99th percentile I/O latency by 85.4%.

Seeking feedback on the approach and any addition information regarding
Required performance testing before submitting a formal patch.

Note: this work has been published as a paper, and the URL is
(https://www.hotstorage.org/2022/camera-ready/hotstorage22-5/pdf/
hotstorage22-5.pdf)

Co-developed-by: Yiming Xu <teddyxym@xxxxxxxxxxx>
Signed-off-by: Yiming Xu <teddyxym@xxxxxxxxxxx>
Signed-off-by: Shushu Yi <firnyee@xxxxxxxxx>
Tested-by: Paul Luse <paul.e.luse@xxxxxxxxx>
---
V1 -> V2: Cleaned up coding style and divided into 2 patches (HemiRAID
and ScalaRAID corresponding to the paper mentioned above). This part is
HemiRAID, which increased the number of stripe locks to 128.

 drivers/md/raid5.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 9b5a7dc3f2a0..d26da031d203 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -501,7 +501,7 @@ struct disk_info {
  * and creating that much locking depth can cause
  * problems.
  */
-#define NR_STRIPE_HASH_LOCKS 8
+#define NR_STRIPE_HASH_LOCKS 128
 #define STRIPE_HASH_LOCKS_MASK (NR_STRIPE_HASH_LOCKS - 1)
 
 struct r5worker {
-- 
2.34.1