On Wed, Feb 1, 2023 at 4:51 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/2/2023 12:35 AM, Song Liu wrote:
> > On Mon, Jan 30, 2023 at 10:39 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>
> >> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> >> md may skip the resync of the first 3 sectors if the resync procedure is
> >> interrupted before the first call to ->sync_request(), as shown below:
> >>
> >> md_do_sync thread          control thread
> >>   // setup resync
> >>   mddev->recovery_cp = 0
> >>   j = 0
> >>   mddev->curr_resync = MD_RESYNC_ACTIVE
> >>
> >>                              // e.g., set array as idle
> >>                              set_bit(MD_RECOVERY_INTR, &mddev->recovery)
> >>   // resync loop
> >>   // check INTR before calling sync_request
> >>   !test_bit(MD_RECOVERY_INTR, &mddev->recovery)
> >>
> >>   // resync interrupted
> >>   // update recovery_cp from 0 to 3
> >>   // the resync of the first 3 sectors will be skipped
> >>   mddev->recovery_cp = 3
> >>
> >> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> >
> > By the way, how did you find this issue? Is it from users/production?
> > Or just from reading the code?
> Found the issue when reading the code and reproduced the problem to confirm it.

Thanks for the information!
Song
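
For readers following the thread, the interleaving in the commit message can be replayed in a small user-space model. This is only a sketch, not the kernel code: struct model_mddev, model_resync() and its parameters are hypothetical names made up for illustration, the interrupt is simulated with a counter instead of a second thread, and the "strict" comparison is an assumption about what "don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE" amounts to. The enum values are the ones from the commit named in the Fixes: tag, where MD_RESYNC_ACTIVE is 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined for mddev->curr_resync; MD_RESYNC_ACTIVE is 3. */
enum {
	MD_RESYNC_NONE = 0,
	MD_RESYNC_YIELDED = 1,
	MD_RESYNC_DELAYED = 2,
	MD_RESYNC_ACTIVE = 3,
};

/* Hypothetical, heavily reduced stand-in for struct mddev. */
struct model_mddev {
	uint64_t recovery_cp;   /* checkpoint: sectors below this are in sync */
	uint64_t curr_resync;   /* state marker first, sector position later */
	bool interrupted;       /* stands in for MD_RECOVERY_INTR */
};

/*
 * Model one resync pass over `total` sectors, `step` sectors per request.
 * The "control thread" raises the interrupt once `intr_after` requests
 * have completed (0 means before the first request).
 * strict == false: checkpoint when curr_resync >= MD_RESYNC_ACTIVE
 * strict == true:  checkpoint only when curr_resync > MD_RESYNC_ACTIVE
 * Returns the resulting recovery_cp.
 */
uint64_t model_resync(uint64_t total, uint64_t step,
		      unsigned int intr_after, bool strict)
{
	struct model_mddev m = {
		.recovery_cp = 0,
		.curr_resync = MD_RESYNC_ACTIVE,
		.interrupted = false,
	};
	uint64_t j = 0;
	unsigned int done = 0;

	/* Model assumption: one request moves past the marker value. */
	assert(step > MD_RESYNC_ACTIVE);

	while (j < total) {
		if (done == intr_after)
			m.interrupted = true;   /* control thread sets INTR */
		if (m.interrupted)
			break;                  /* checked before each request */
		j += step;                      /* one sync request completes */
		m.curr_resync = j;
		done++;
	}

	if (!m.interrupted)
		return total;                   /* whole range resynced */

	/* Interrupted: decide whether curr_resync is a real position. */
	if (strict ? m.curr_resync > MD_RESYNC_ACTIVE
		   : m.curr_resync >= MD_RESYNC_ACTIVE)
		m.recovery_cp = m.curr_resync;

	return m.recovery_cp;
}
```

With intr_after = 0, the non-strict comparison mistakes the MD_RESYNC_ACTIVE marker for a sector position and moves the checkpoint to 3, while the strict comparison leaves it at 0. Once at least one request has completed, both variants produce the same checkpoint.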