On Wed, Dec 6, 2023 at 6:08 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> New mddev_resume() calls are added to synchronize IO with array
> reconfiguration, however, this introduces a performance regression while
> adding it in md_start_sync():
>
> 1) someone sets MD_RECOVERY_NEEDED first;
> 2) daemon thread grabs reconfig_mutex, then clears MD_RECOVERY_NEEDED and
>    queues a new sync work;
> 3) daemon thread releases reconfig_mutex;
> 4) in md_start_sync
>    a) check that there are spares that can be added/removed, then suspend
>       the array;
>    b) remove_and_add_spares may not be called, or called without actually
>       adding/removing spares;
>    c) resume the array, then set MD_RECOVERY_NEEDED again!
>
> Loop between 2 - 4, then mddev_suspend() will be called quite often; as a
> consequence, normal IO will be quite slow.
>
> Fix this problem by not setting MD_RECOVERY_NEEDED again in md_start_sync(),
> hence the loop will be broken.
>
> Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
> Suggested-by: Song Liu <song@xxxxxxxxxx>
> Reported-by: Janpieter Sollie <janpieter.sollie@xxxxxxxxx>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218200
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>

Thanks for the fix! I added a comment and applied it to md-fixes.

Song
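
For readers following the thread, the loop in steps 2 - 4 can be sketched as a small userspace toy model. This is not the md code: struct toy_mddev, toy_daemon(), toy_start_sync() and toy_run() are hypothetical names made up for illustration, the flag and counter only stand in for MD_RECOVERY_NEEDED and the mddev_suspend() calls, and the model assumes that no spare is ever actually added or removed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the real md data structures. */
struct toy_mddev {
    bool recovery_needed;   /* stands in for MD_RECOVERY_NEEDED */
    bool sync_work_queued;  /* stands in for the queued sync work */
    int  suspend_count;     /* number of times the array was suspended */
};

/* Steps 2-3: the daemon clears the flag and queues a new sync work. */
static void toy_daemon(struct toy_mddev *m)
{
    if (m->recovery_needed) {
        m->recovery_needed = false;
        m->sync_work_queued = true;
    }
}

/* Step 4: suspend the array, change nothing, resume. */
static void toy_start_sync(struct toy_mddev *m, bool set_needed_again)
{
    if (!m->sync_work_queued)
        return;
    m->sync_work_queued = false;
    m->suspend_count++;             /* 4a: suspend the array */
    /* 4b: assumption of this model: no spare is added or removed */
    if (set_needed_again)
        m->recovery_needed = true;  /* 4c: old behaviour re-arms the loop */
}

/* Run a bounded number of daemon wakeups and return the suspend count. */
static int toy_run(bool set_needed_again, int wakeups)
{
    struct toy_mddev m = { .recovery_needed = true };

    assert(wakeups >= 0);
    for (int i = 0; i < wakeups; i++) {
        toy_daemon(&m);
        toy_start_sync(&m, set_needed_again);
    }
    return m.suspend_count;
}
```

In this model, toy_run(true, n) suspends once per wakeup, while toy_run(false, n) suspends only once for any n >= 1, which is the effect the commit message describes for not setting the flag again.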