From: Yu Kuai <yukuai3@xxxxxxxxxx>

Changes in v4:
 - fix a problem in v2: the replacement rdev->raid_disk was set to
   raid_disk + conf->mirros, which caused test 01replace to run forever.
   The mdadm tests look good now (no new regressions);

Changes in v3:
 - add patch 2, and fix that setup_conf() is missing in patch 3;
 - add some review tags from Xiao Ni (other than patch 2,3);

Changes in v2:
 - add a new counter in conf for patch 2;
 - fix the case of choosing the next idle disk while there is no other
   idle disk in patch 3;
 - add some review tags from Xiao Ni for patch 1, 4-8

The original idea is that Paul wants to optimize raid1 read
performance ([1]). However, we think the existing read_balance() is
quite complex, and we don't want to add more complexity to it. Hence we
decided to refactor read_balance() first, to make the code cleaner and
easier to build on in follow-up work.

Before this patchset, read_balance() has many local variables and many
branches, and it tries to handle all the scenarios in one iteration.
The idea of this patchset is to divide them into 4 different steps:

1) If resync is in progress, find the first usable disk, patch 7;
Otherwise:
2) Loop through all disks, skipping slow disks and disks with bad
   blocks, and choose the best disk, patch 11. If no disk is found:
3) Look for disks with bad blocks and choose the one with the largest
   number of sectors, patch 9. If no disk is found:
4) Choose the first slow disk found with no bad blocks, or the slow
   disk with the largest number of sectors, patch 8.

Note that step 3) and step 4) are very cold code paths, so their
performance does not need to be considered.

After this patchset, we'll continue to optimize read_balance() for
step 2), specifically how to choose the best rdev to read.

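The four steps above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not the code in the patches:
struct disk, its fields and pick_read_disk() are hypothetical names,
"fewest pending requests" merely stands in for whatever "best" means in
step 2), "number of sectors" is taken to mean the sectors readable from
the start of the request, and step 3) is assumed to look at non-slow
disks only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of one mirror (not struct md_rdev). */
struct disk {
	bool usable;       /* present and allowed to serve reads */
	bool slow;         /* should be avoided unless nothing else works */
	int  good_sectors; /* readable sectors from the start of the request,
			    * equal to the request length if no bad blocks */
	int  pending;      /* in-flight requests; stand-in for "best" */
};

/*
 * Return the index of the disk to read from, or -1 if none can serve the
 * request. *max_sectors is set to the number of sectors that can be read.
 */
int pick_read_disk(const struct disk *d, int n, int sectors,
		   bool resync, int *max_sectors)
{
	int best = -1, i;

	assert(n >= 0 && sectors > 0);
	*max_sectors = sectors;

	/* Step 1: during resync, take the first usable disk. */
	if (resync) {
		for (i = 0; i < n; i++) {
			if (!d[i].usable || d[i].good_sectors <= 0)
				continue;
			if (d[i].good_sectors < sectors)
				*max_sectors = d[i].good_sectors;
			return i;
		}
		return -1;
	}

	/* Step 2: best disk among those neither slow nor with bad blocks. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || d[i].slow || d[i].good_sectors < sectors)
			continue;
		if (best < 0 || d[i].pending < d[best].pending)
			best = i;
	}
	if (best >= 0)
		return best;

	/* Step 3: disk with bad blocks that can read the most sectors. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || d[i].slow || d[i].good_sectors <= 0)
			continue;
		if (best < 0 || d[i].good_sectors > d[best].good_sectors)
			best = i;
	}
	if (best >= 0) {
		*max_sectors = d[best].good_sectors;
		return best;
	}

	/* Step 4: first slow disk without bad blocks, else the slow disk
	 * that can read the most sectors. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || !d[i].slow || d[i].good_sectors <= 0)
			continue;
		if (d[i].good_sectors >= sectors)
			return i;
		if (best < 0 || d[i].good_sectors > d[best].good_sectors)
			best = i;
	}
	if (best >= 0)
		*max_sectors = d[best].good_sectors;
	return best;
}
```

Keeping each step as its own loop means the fallback steps 3) and 4)
never add branches to the common path in step 2).
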
[1] https://lore.kernel.org/all/20240102125115.129261-1-paul.e.luse@xxxxxxxxxxxxxxx/

Yu Kuai (11):
  md: add a new helper rdev_has_badblock()
  md/raid1: factor out helpers to add rdev to conf
  md/raid1: record nonrot rdevs while adding/removing rdevs to conf
  md/raid1: fix choose next idle in read_balance()
  md/raid1-10: add a helper raid1_check_read_range()
  md/raid1-10: factor out a new helper raid1_should_read_first()
  md/raid1: factor out read_first_rdev() from read_balance()
  md/raid1: factor out choose_slow_rdev() from read_balance()
  md/raid1: factor out choose_bb_rdev() from read_balance()
  md/raid1: factor out the code to manage sequential IO
  md/raid1: factor out helpers to choose the best rdev from
    read_balance()

 drivers/md/md.h       |  11 +
 drivers/md/raid1-10.c |  69 ++++++
 drivers/md/raid1.c    | 550 +++++++++++++++++++++++++-----------------
 drivers/md/raid1.h    |   1 +
 drivers/md/raid10.c   |  58 ++---
 drivers/md/raid5.c    |  35 +--
 6 files changed, 444 insertions(+), 280 deletions(-)

-- 
2.39.2