On Thu, Feb 29, 2024 at 2:03 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Changes in v4:
>  - fix a problem introduced in v2: the replacement rdev->raid_disk was
>    set to raid_disk + conf->mirros, which caused test 01replace to run
>    forever; mdadm tests look good now (no new regressions);
> Changes in v3:
>  - add patch 2, and fix setup_conf() being missed in patch 3;
>  - add some review tags from Xiao Ni (other than patches 2 and 3);
> Changes in v2:
>  - add a new counter in conf for patch 2;
>  - fix the choose-next-idle case when there is no other idle disk, in
>    patch 3;
>  - add some review tags from Xiao Ni for patches 1 and 4-8
>
> The original idea is that Paul wants to optimize raid1 read
> performance ([1]). However, we think that the original code for
> read_balance() is quite complex, and we don't want to add more
> complexity. Hence we decided to refactor read_balance() first, to make
> the code cleaner and easier to follow up on.
>
> Before this patchset, read_balance() has many local variables and many
> branches, and it tries to consider all the scenarios in one iteration.
> The idea of this patchset is to divide them into 4 different steps:
>
> 1) If resync is in progress, find the first usable disk, patch 7;
> Otherwise:
> 2) Loop through all disks, skipping slow disks and disks with bad
> blocks, and choose the best disk, patch 11. If no disk is found:
> 3) Look for disks with bad blocks and choose the one with the largest
> number of readable sectors, patch 9. If no disk is found:
> 4) Choose the first slow disk found with no bad blocks, or else the slow
> disk with the largest number of readable sectors, patch 8.
>
> Note that step 3) and step 4) are super cold code paths, so performance
> need not be considered there.
>
> And after this patchset, we'll continue to optimize read_balance() for
> step 2), specifically how to choose the best rdev to read.
>
> [1] https://lore.kernel.org/all/20240102125115.129261-1-paul.e.luse@xxxxxxxxxxxxxxx/
>
> Yu Kuai (11):
>   md: add a new helper rdev_has_badblock()
>   md/raid1: factor out helpers to add rdev to conf
>   md/raid1: record nonrot rdevs while adding/removing rdevs to conf
>   md/raid1: fix choose next idle in read_balance()
>   md/raid1-10: add a helper raid1_check_read_range()
>   md/raid1-10: factor out a new helper raid1_should_read_first()
>   md/raid1: factor out read_first_rdev() from read_balance()
>   md/raid1: factor out choose_slow_rdev() from read_balance()
>   md/raid1: factor out choose_bb_rdev() from read_balance()
>   md/raid1: factor out the code to manage sequential IO
>   md/raid1: factor out helpers to choose the best rdev from
>     read_balance()

Applied v4 of the set to md-6.9 branch.

Thanks,
Song
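
For readers following the thread, the four-step selection order described
in the cover letter can be sketched in userspace C as below. This is only
an illustration, not the kernel code: struct disk, its fields, and
pick_read_disk() are hypothetical names, "fewest pending requests" is an
assumed stand-in for whatever "best" means in step 2, and good_sectors is
assumed to be pre-computed and capped at the request length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-disk state; this is NOT the kernel's struct md_rdev. */
struct disk {
	bool usable;       /* in sync and not faulty */
	bool slow;         /* flagged as a slow disk */
	int  good_sectors; /* readable sectors from the request start,
	                    * capped at the request length; a value below
	                    * the request length means bad blocks ahead */
	int  pending;      /* in-flight requests (assumed "best" metric) */
};

/*
 * Return the index of the disk to read from, or -1 if none is usable.
 * *len receives how many of the 'want' sectors that disk can serve.
 */
static int pick_read_disk(const struct disk *d, int n, bool resync,
			  int want, int *len)
{
	int best = -1;
	int i;

	assert(d != NULL && len != NULL && n > 0 && want > 0);
	*len = 0;

	/* Step 1: during resync, take the first usable disk. */
	if (resync) {
		for (i = 0; i < n; i++) {
			if (d[i].usable && d[i].good_sectors > 0) {
				*len = d[i].good_sectors;
				return i;
			}
		}
		return -1;
	}

	/* Step 2: skip slow disks and disks with bad blocks, pick the best. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || d[i].slow || d[i].good_sectors < want)
			continue;
		if (best < 0 || d[i].pending < d[best].pending)
			best = i;
	}
	if (best >= 0) {
		*len = want;
		return best;
	}

	/* Step 3: non-slow disks with bad blocks, most readable sectors. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || d[i].slow || d[i].good_sectors <= 0)
			continue;
		if (best < 0 || d[i].good_sectors > d[best].good_sectors)
			best = i;
	}
	if (best >= 0) {
		*len = d[best].good_sectors;
		return best;
	}

	/* Step 4: first slow disk without bad blocks, else the slow disk
	 * with the most readable sectors. */
	for (i = 0; i < n; i++) {
		if (!d[i].usable || !d[i].slow || d[i].good_sectors <= 0)
			continue;
		if (d[i].good_sectors >= want) {
			*len = want;
			return i;
		}
		if (best < 0 || d[i].good_sectors > d[best].good_sectors)
			best = i;
	}
	if (best >= 0)
		*len = d[best].good_sectors;
	return best;
}
```

Keeping each step in its own loop mirrors the point of the refactor: steps
3 and 4 only run when step 2 finds nothing, so they can stay simple without
affecting the common read path.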