From: Shushu Yi <firnyee@xxxxxxxxx>

<changelog>

Optimize RAID5 by using fine-grained locks, customized data structures,
and a scattered address space. This patch attempts to maximize
thread-level parallelism and reduce the CPU suspension time caused by
lock contention. On a system with four PCIe 4.0 SSDs, it increases
overall storage throughput by 89.4% and decreases the 99.99th percentile
I/O latency by 85.4%.

We are seeking feedback on the approach, and any additional information
on the performance testing required before submitting a formal patch.

Note: this work has been published as a paper, available at
https://www.hotstorage.org/2022/camera-ready/hotstorage22-5/pdf/hotstorage22-5.pdf

Co-developed-by: Yiming Xu <teddyxym@xxxxxxxxxxx>
Signed-off-by: Yiming Xu <teddyxym@xxxxxxxxxxx>
Signed-off-by: Shushu Yi <firnyee@xxxxxxxxx>
Tested-by: Paul Luse <paul.e.luse@xxxxxxxxx>
---
V1 -> V2: Cleaned up coding style and split the work into 2 patches
(HemiRAID and ScalaRAID, corresponding to the paper mentioned above).
This part is HemiRAID, which increases the number of stripe hash locks
from 8 to 128.

 drivers/md/raid5.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 9b5a7dc3f2a0..d26da031d203 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -501,7 +501,7 @@ struct disk_info {
  * and creating that much locking depth can cause
  * problems.
  */
-#define NR_STRIPE_HASH_LOCKS 8
+#define NR_STRIPE_HASH_LOCKS 128
 #define STRIPE_HASH_LOCKS_MASK (NR_STRIPE_HASH_LOCKS - 1)
 
 struct r5worker {
-- 
2.34.1
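
Illustration for reviewers (not part of the patch): a minimal user-space sketch of how a stripe's starting sector might be mapped to a hash lock by masking with `NR_STRIPE_HASH_LOCKS - 1`, and how many distinct locks a run of consecutive stripes touches. The names `sketch_hash_lock_index`, `sketch_distinct_locks`, and `SKETCH_STRIPE_SHIFT` are hypothetical, and the shift of 3 assumes 4 KiB stripes made of 512-byte sectors; the in-kernel hash may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one stripe unit is 4 KiB, i.e. 8 sectors of 512 bytes. */
#define SKETCH_STRIPE_SHIFT 3

/*
 * Hypothetical helper: map a sector to a hash-lock index.
 * Masking with (nr_locks - 1) only works when nr_locks is a power of
 * two, which holds for both 8 and 128.
 */
static unsigned int sketch_hash_lock_index(uint64_t sector,
                                           unsigned int nr_locks)
{
    assert(nr_locks != 0 && (nr_locks & (nr_locks - 1)) == 0);
    return (unsigned int)((sector >> SKETCH_STRIPE_SHIFT) & (nr_locks - 1));
}

/*
 * Hypothetical helper: count how many distinct locks are used by
 * nr_stripes consecutive stripes starting at sector 0.
 */
static unsigned int sketch_distinct_locks(unsigned int nr_stripes,
                                          unsigned int nr_locks)
{
    unsigned char seen[256] = {0};
    unsigned int distinct = 0;

    assert(nr_locks <= 256);
    for (unsigned int i = 0; i < nr_stripes; i++) {
        uint64_t sector = (uint64_t)i << SKETCH_STRIPE_SHIFT;
        unsigned int idx = sketch_hash_lock_index(sector, nr_locks);

        if (!seen[idx]) {
            seen[idx] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Under these assumptions, `sketch_distinct_locks(128, 8)` reports that 128 consecutive stripes share only 8 locks, while `sketch_distinct_locks(128, 128)` gives each stripe its own lock. This is the contention reduction the change aims for, at the cost of the deeper locking that the existing comment in raid5.h warns about.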