Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE

Song Liu <song@xxxxxxxxxx> · Wed, 1 Feb 2023 22:53:36 -0800

On Wed, Feb 1, 2023 at 4:51 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/2/2023 12:35 AM, Song Liu wrote:
> > On Mon, Jan 30, 2023 at 10:39 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>
> >> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> >> md may skip the resync of the first 3 sectors if the resync procedure is
> >> interrupted before the first call to ->sync_request(), as shown below:
> >>
> >> md_do_sync thread          control thread
> >>   // setup resync
> >>   mddev->recovery_cp = 0
> >>   j = 0
> >>   mddev->curr_resync = MD_RESYNC_ACTIVE
> >>
> >>                              // e.g., set array as idle
> >>                              set_bit(MD_RECOVERY_INTR, &mddev->recovery)
> >>   // resync loop
> >>   // check INTR before calling sync_request
> >>   !test_bit(MD_RECOVERY_INTR, &mddev->recovery)
> >>
> >>   // resync interrupted
> >>   // update recovery_cp from 0 to 3
> >>   // the resync of the first 3 sectors will be skipped
> >>   mddev->recovery_cp = 3
> >>
> >> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> > By the way, how did you find this issue? Is it from users/production?
> > Or just from reading the code?
> Found the issue while reading the code, then reproduced the problem to confirm it.

Thanks for the information!
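
For anyone following the thread, the interleaving in the quoted trace can be
modelled in a few lines of userspace C. This is only a toy sketch, not the
kernel code and not the patch itself: struct toy_mddev, the helper
interrupted_checkpoint() and its skip_when_active flag are hypothetical names,
the enum values are assumed (MD_RESYNC_ACTIVE == 3 matches the "from 0 to 3"
in the trace), and the ">" versus ">=" comparison is just one way such a guard
could be written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names mirror the trace, values are assumptions. */
enum {
    MD_RESYNC_NONE    = 0,
    MD_RESYNC_YIELDED = 1,
    MD_RESYNC_DELAYED = 2,
    MD_RESYNC_ACTIVE  = 3,  /* marker value, not a real sector offset */
};

/* Hypothetical stand-in for the few mddev fields the trace touches. */
struct toy_mddev {
    uint64_t recovery_cp;   /* checkpoint: sectors below this count as synced */
    uint64_t curr_resync;   /* marker value or current resync position */
    bool intr;              /* stands in for the MD_RECOVERY_INTR bit */
};

/*
 * Replays the trace: resync is set up, the control thread interrupts it
 * before any sync_request() call, and the checkpoint is then (maybe) updated.
 * Returns the resulting recovery_cp.
 */
static uint64_t interrupted_checkpoint(bool skip_when_active)
{
    struct toy_mddev m = { .recovery_cp = 0,
                           .curr_resync = MD_RESYNC_NONE,
                           .intr = false };

    /* md_do_sync thread: setup resync */
    m.curr_resync = MD_RESYNC_ACTIVE;

    /* control thread: e.g., set array as idle */
    m.intr = true;

    /* resync loop: INTR is checked first, so no sectors are synced */
    if (!m.intr)
        m.curr_resync += 8; /* would model real progress; never reached here */

    /* resync interrupted: decide whether curr_resync is a real position */
    bool is_position = skip_when_active
        ? m.curr_resync > MD_RESYNC_ACTIVE    /* ignore the bare marker */
        : m.curr_resync >= MD_RESYNC_ACTIVE;  /* treats the marker as progress */

    if (is_position && m.recovery_cp < m.curr_resync)
        m.recovery_cp = m.curr_resync;

    return m.recovery_cp;
}
```

In this model, interrupted_checkpoint(false) returns 3, so the first 3 sectors
would be treated as already in sync, while interrupted_checkpoint(true)
leaves the checkpoint at 0.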

Song