On Tue, Feb 12, 2019 at 7:07 AM Jinpu Wang <jinpu.wang@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 11, 2019 at 9:36 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
> >
> > On Mon, Feb 11, 2019 at 1:11 AM Jinpu Wang <jinpu.wang@xxxxxxxxxxxxxxx> wrote:
> > >
> > > >
> > > > sync_request_write no longer submits writes to a Faulty device. This has
> > > > the unfortunate side effect that bitmap bits can be incorrectly cleared
> > > > if a recovery is interrupted (previously, end_sync_write would have
> > > > prevented this). This means the next recovery may not copy everything
> > > > it should, potentially corrupting data.
> > > >
> > > > Add a function for doing the proper md_bitmap_end_sync, called from
> > > > end_sync_write and the Faulty case in sync_request_write.
> > > >
> > > > Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error
> > > > handling.")
> > > > Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx>
> >
> > Thanks Jinpu! I will take care of this patch and cc stable.
> >
> > Would you like to add your Reviewed-by/Tested-by/Acked-by?
> >
> > Song
>
> Hi Song,
>
> Sure, I can reproduce the problem with the hint from Nate; with slow
> storage it's even faster to reproduce. I tested on kernel 4.14.93 with
> SRP as transport and blkio throttling applied on top of the raid1
> member devices.
>
> Without the patch below, I can reproduce it in less than 3 iterations.
> With the patch below, I can't reproduce it in more than one hour. I also
> ran a basic regression test and didn't find any problem.
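>
> In rough outline the setup looks like the sketch below. The device
> names, the throttle value, and the verification step are illustrative
> placeholders rather than the exact commands; the interrupt timing
> follows the hint from Nate:
>
>   # raid1 with an internal write-intent bitmap on two slow members
>   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
>         --bitmap=internal /dev/sdb /dev/sdc
>
>   # throttle writes to both members (cgroup v1 blkio, "major:minor bytes/sec")
>   echo "8:16 1048576" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
>   echo "8:32 1048576" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
>
>   # fail and re-add one member so a slow recovery starts, then
>   # interrupt the recovery part way through
>   mdadm /dev/md0 --fail /dev/sdc --remove /dev/sdc
>   mdadm /dev/md0 --add /dev/sdc
>   echo idle > /sys/block/md0/md/sync_action
>
>   # let the next recovery complete, then compare the members; without
>   # the fix, regions whose bitmap bits were wrongly cleared can differ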
>
> backport note to 4.14:
> s/md_bitmap_end_sync/bitmap_end_sync
>
> Reviewed-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx>
> Tested-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxx>

Thanks Jack!
Song

> Regards,
> Jack
>
> > > > ---
> > > >  drivers/md/raid1.c | 28 ++++++++++++++++++----------
> > > >  1 file changed, 18 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > > index 1d54109071cc..fa47249fa3e4 100644
> > > > --- a/drivers/md/raid1.c
> > > > +++ b/drivers/md/raid1.c
> > > > @@ -1863,6 +1863,20 @@ static void end_sync_read(struct bio *bio)
> > > >                 reschedule_retry(r1_bio);
> > > >  }
> > > >
> > > > +static void abort_sync_write(struct mddev *mddev, struct r1bio *r1_bio)
> > > > +{
> > > > +        sector_t sync_blocks = 0;
> > > > +        sector_t s = r1_bio->sector;
> > > > +        long sectors_to_go = r1_bio->sectors;
> > > > +
> > > > +        /* make sure these bits don't get cleared. */
> > > > +        do {
> > > > +                md_bitmap_end_sync(mddev->bitmap, s, &sync_blocks, 1);
> > > > +                s += sync_blocks;
> > > > +                sectors_to_go -= sync_blocks;
> > > > +        } while (sectors_to_go > 0);
> > > > +}
> > > > +
> > > >  static void end_sync_write(struct bio *bio)
> > > >  {
> > > >          int uptodate = !bio->bi_status;
> > > > @@ -1874,15 +1888,7 @@ static void end_sync_write(struct bio *bio)
> > > >          struct md_rdev *rdev = conf->mirrors[find_bio_disk(r1_bio, bio)].rdev;
> > > >
> > > >          if (!uptodate) {
> > > > -                sector_t sync_blocks = 0;
> > > > -                sector_t s = r1_bio->sector;
> > > > -                long sectors_to_go = r1_bio->sectors;
> > > > -                /* make sure these bits doesn't get cleared. */
> > > > -                do {
> > > > -                        md_bitmap_end_sync(mddev->bitmap, s, &sync_blocks, 1);
> > > > -                        s += sync_blocks;
> > > > -                        sectors_to_go -= sync_blocks;
> > > > -                } while (sectors_to_go > 0);
> > > > +                abort_sync_write(mddev, r1_bio);
> > > >                  set_bit(WriteErrorSeen, &rdev->flags);
> > > >                  if (!test_and_set_bit(WantReplacement, &rdev->flags))
> > > >                          set_bit(MD_RECOVERY_NEEDED, &
> > > > @@ -2172,8 +2178,10 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
> > > >                       (i == r1_bio->read_disk ||
> > > >                        !test_bit(MD_RECOVERY_SYNC, &mddev->recovery))))
> > > >                          continue;
> > > > -                if (test_bit(Faulty, &conf->mirrors[i].rdev->flags))
> > > > +                if (test_bit(Faulty, &conf->mirrors[i].rdev->flags)) {
> > > > +                        abort_sync_write(mddev, r1_bio);
> > > >                          continue;
> > > > +                }
> > > >
> > > >                  bio_set_op_attrs(wbio, REQ_OP_WRITE, 0);
> > > >                  if (test_bit(FailFast, &conf->mirrors[i].rdev->flags))
> > > > --
> > > > 2.19.1
> > >
> > > Thanks Nate, the patch looks good to me!
> > >
> > > Do you have a reproducer for the bug here? It would be good to have a
> > > test case in blktests or the mdadm test suite.
> > >
> > > As this could lead to data corruption, it should go to stable 4.14+; I
> > > guess Liu Song will mark it for stable :)
> > >
> > > Thanks.
> > >
> > > --
> > > Jack Wang
> > > Linux Kernel Developer
> > >
> > > 1&1 IONOS Cloud GmbH | Greifswalder Str. 207 | 10405 Berlin | Germany
> > > Phone: +49 30 57700-8042 | Fax: +49 30 57700-8598
> > > E-mail: jinpu.wang@xxxxxxxxxxxxxxx | Web: www.ionos.de
> > >
> > > Head Office: Berlin, Germany
> > > District Court Berlin Charlottenburg, Registration number: HRB 125506 B
> > > Executive Management: Christoph Steffens, Matthias Steinberg, Achim Weiss
> > >
> > > Member of United Internet
> > >
> > > This e-mail may contain confidential and/or privileged information. If
> > > you are not the intended recipient of this e-mail, you are hereby
> > > notified that saving, distribution or use of the content of this
> > > e-mail in any way is prohibited. If you have received this e-mail in
> > > error, please notify the sender and delete the e-mail.
>
> --
> Jack Wang
> Linux Kernel Developer
>
> 1&1 IONOS Cloud GmbH | Greifswalder Str. 207 | 10405 Berlin | Germany
> Phone: +49 30 57700-8042 | Fax: +49 30 57700-8598
> E-mail: jinpu.wang@xxxxxxxxxxxxxxx | Web: www.ionos.de