The wait_for_overlap wait queue is currently used in two cases, which
are not really related:
 - waiting for actual overlapping bios, which uses the R5_Overlap bit,
 - waiting for events related to reshape.

Handling every write request in raid5_make_request() involves adding to
and removing from this wait queue, which uses a spinlock. With fast
storage and multiple submitting threads, the contention on this lock is
noticeable.

This patch series aims to resolve this by separating the two cases
mentioned above and using this wait queue only when reshape is in
progress.

The results when testing 4k random writes on raid5 with null_blk
(8 jobs, qd=64, group_thread_cnt=8):

before: 463k IOPS
after:  523k IOPS

The improvement is not huge with this series alone, but it is just one
of the bottlenecks. When applied on top of some other changes I'm
working on, it allowed going from 845k IOPS to 975k IOPS on the same
test.

Artur Paszkiewicz (3):
  md/raid5: use wait_on_bit() for R5_Overlap
  md/raid5: only add to wait queue if reshape is in progress
  md/raid5: rename wait_for_overlap to wait_for_reshape

 drivers/md/raid5-cache.c |  6 +--
 drivers/md/raid5.c       | 95 +++++++++++++++++++++-------------------
 drivers/md/raid5.h       |  2 +-
 3 files changed, 52 insertions(+), 51 deletions(-)

-- 
2.43.0