On Tue, Feb 12, 2019 at 7:07 AM Jinpu Wang <jinpu.wang@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 11, 2019 at 9:36 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
> >
> > On Mon, Feb 11, 2019 at 1:11 AM Jinpu Wang <jinpu.wang@xxxxxxxxxxxxxxx> wrote:
> > >
> > > >
> > > > sync_request_write no longer submits writes to a Faulty device. This has
> > > > the unfortunate side effect that bitmap bits can be incorrectly cleared
> > > > if a recovery is interrupted (previously, end_sync_write would have
> > > > prevented this). This means the next recovery may not copy everything
> > > > it should, potentially corrupting data.
> > > >
> > > > Add a function for doing the proper md_bitmap_end_sync, called from
> > > > end_sync_write and the Faulty case in sync_request_write.
> > > >
> > > > Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error
> > > > handling.")
> > > > Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx>
> >
> > Thanks Jinpu! I will take care of this patch and cc stable.
> >
> > Would you like to add your Reviewed-by/Tested-by/Acked-by?
> >
> > Song
>
> Hi Song,
>
> Sure, I can reproduce the problem with the hint from Nate; with slow
> storage it's even faster to reproduce. I tested on kernel 4.14.93 with
> SRP as transport and blkio throttling applied on top of the raid1
> member devices.
>
> Without the patch below, I can reproduce it in less than 3 iterations.
> With the patch below, I can't reproduce it in more than one hour. I also
> ran a basic regression test and didn't find any problem.
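>
> In rough outline the setup looks like the sketch below. The device
> names, the throttle value, and the verification step are illustrative
> placeholders rather than the exact commands; the interrupt timing
> follows the hint from Nate:
>
>   # raid1 with an internal write-intent bitmap on two slow members
>   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
>         --bitmap=internal /dev/sdb /dev/sdc
>
>   # throttle writes to both members (cgroup v1 blkio, "major:minor bytes/sec")
>   echo "8:16 1048576" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
>   echo "8:32 1048576" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
>
>   # fail and re-add one member so a slow recovery starts, then
>   # interrupt the recovery part way through
>   mdadm /dev/md0 --fail /dev/sdc --remove /dev/sdc
>   mdadm /dev/md0 --add /dev/sdc
>   echo idle > /sys/block/md0/md/sync_action
>
>   # let the next recovery complete, then compare the members; without
>   # the fix, regions whose bitmap bits were wrongly cleared can differ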
>
> backport note to 4.14:
> s/md_bitmap_end_sync/bitmap_end_sync
>
> Reviewed-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx>
> Tested-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx>

Thanks Jack!
Song

> Regards,
> Jack
>
> > > > ---
> > > >  drivers/md/raid1.c | 28 ++++++++++++++++++----------
> > > >  1 file changed, 18 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > > index 1d54109071cc..fa47249fa3e4 100644
> > > > --- a/drivers/md/raid1.c
> > > > +++ b/drivers/md/raid1.c
> > > > @@ -1863,6 +1863,20 @@ static void end_sync_read(struct bio *bio)
> > > >                 reschedule_retry(r1_bio);
> > > >  }
> > > >
> > > > +static void abort_sync_write(struct mddev *mddev, struct r1bio *r1_bio)
> > > > +{
> > > > +        sector_t sync_blocks = 0;
> > > > +        sector_t s = r1_bio->sector;
> > > > +        long sectors_to_go = r1_bio->sectors;
> > > > +
> > > > +        /* make sure these bits don't get cleared. */
> > > > +        do {
> > > > +                md_bitmap_end_sync(mddev->bitmap, s, &sync_blocks, 1);
> > > > +                s += sync_blocks;
> > > > +                sectors_to_go -= sync_blocks;
> > > > +        } while (sectors_to_go > 0);
> > > > +}
> > > > +
> > > >  static void end_sync_write(struct bio *bio)
> > > >  {
> > > >          int uptodate = !bio->bi_status;
> > > > @@ -1874,15 +1888,7 @@ static void end_sync_write(struct bio *bio)
> > > >          struct md_rdev *rdev = conf->mirrors[find_bio_disk(r1_bio, bio)].rdev;
> > > >
> > > >          if (!uptodate) {
> > > > -                sector_t sync_blocks = 0;
> > > > -                sector_t s = r1_bio->sector;
> > > > -                long sectors_to_go = r1_bio->sectors;
> > > > -                /* make sure these bits doesn't get cleared. */
> > > > -                do {
> > > > -                        md_bitmap_end_sync(mddev->bitmap, s, &sync_blocks, 1);
> > > > -                        s += sync_blocks;
> > > > -                        sectors_to_go -= sync_blocks;
> > > > -                } while (sectors_to_go > 0);
> > > > +                abort_sync_write(mddev, r1_bio);
> > > >                  set_bit(WriteErrorSeen, &rdev->flags);
> > > >                  if (!test_and_set_bit(WantReplacement, &rdev->flags))
> > > >                          set_bit(MD_RECOVERY_NEEDED, &
> > > > @@ -2172,8 +2178,10 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
> > > >                       (i == r1_bio->read_disk ||
> > > >                        !test_bit(MD_RECOVERY_SYNC, &mddev->recovery))))
> > > >                          continue;
> > > > -                if (test_bit(Faulty, &conf->mirrors[i].rdev->flags))
> > > > +                if (test_bit(Faulty, &conf->mirrors[i].rdev->flags)) {
> > > > +                        abort_sync_write(mddev, r1_bio);
> > > >                          continue;
> > > > +                }
> > > >
> > > >                  bio_set_op_attrs(wbio, REQ_OP_WRITE, 0);
> > > >                  if (test_bit(FailFast, &conf->mirrors[i].rdev->flags))
> > > > --
> > > > 2.19.1
> > >
> > > Thanks Nate, the patch looks good to me!
> > >
> > > Do you have a reproducer for the bug here? It would be good to have a
> > > test case in blktests or the mdadm test suite.
> > >
> > > As this could lead to data corruption, it should go to stable 4.14+; I
> > > guess Liu Song will mark it for stable :)
> > >
> > > Thanks.
> > >
> > > --
> > > Jack Wang
> > > Linux Kernel Developer
> > >
> > > 1&1 IONOS Cloud GmbH | Greifswalder Str. 207 | 10405 Berlin | Germany
> > > Phone: +49 30 57700-8042 | Fax: +49 30 57700-8598
> > > E-mail: jinpu.wang@xxxxxxxxxxxxxxx | Web: www.ionos.de
> > >
> > > Head Office: Berlin, Germany
> > > District Court Berlin Charlottenburg, Registration number: HRB 125506 B
> > > Executive Management: Christoph Steffens, Matthias Steinberg, Achim Weiss
> > >
> > > Member of United Internet
> > >
> > > This e-mail may contain confidential and/or privileged information. If
> > > you are not the intended recipient of this e-mail, you are hereby
> > > notified that saving, distribution or use of the content of this
> > > e-mail in any way is prohibited. If you have received this e-mail in
> > > error, please notify the sender and delete the e-mail.
>
> --
> Jack Wang
> Linux Kernel Developer
>
> 1&1 IONOS Cloud GmbH | Greifswalder Str. 207 | 10405 Berlin | Germany
> Phone: +49 30 57700-8042 | Fax: +49 30 57700-8598
> E-mail: jinpu.wang@xxxxxxxxxxxxxxx | Web: www.ionos.de