Hi Neil,

The patch works. I tested it on CentOS 7.0 for fifty rounds; no
consistency issue was found.

Best regards,
Jiao Hui

On Tue, Jul 29, 2014 at 10:44 AM, NeilBrown <neilb@xxxxxxx> wrote:
> On Mon, 28 Jul 2014 16:09:33 +0800 jiao hui <jiaohui@xxxxxxxxxxxxx> wrote:
>
>> >From 1fdbfb8552c00af55d11d7a63cdafbdf1749ff63 Mon Sep 17 00:00:00 2001
>> From: Jiao Hui <simonjiaoh@xxxxxxxxx>
>> Date: Mon, 28 Jul 2014 11:57:20 +0800
>> Subject: [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption
>>
>> In the recovery of a raid1 array with a bitmap, actual resync I/O only
>> happens for bitmap bits that have the NEEDED or RESYNC flag set. The
>> sync thread checks each rdev: if any rdev is missing or has the Faulty
>> flag, the array is still degraded and the bitmap bit's NEEDED flag is
>> not cleared. Otherwise the NEEDED flag is cleared and the RESYNC flag
>> is set. The RESYNC flag is later cleared in bitmap_cond_end_sync or
>> bitmap_close_sync.
>>
>> If the only disk being recovered fails again while raid1 recovery is in
>> progress, the sync thread can't find a non-In_sync disk to write to, so
>> the remaining recovery is skipped. The raid1 error handler only sets the
>> MD_RECOVERY_INTR flag when an In_sync disk fails, but the disk being
>> recovered is not In_sync, so md_do_sync never gets the INTR signal to
>> break out and mddev->curr_resync is advanced all the way to max_sectors
>> (mddev->dev_sectors). When the raid1 personality then finishes the
>> resync, no bitmap bit with the RESYNC flag is set back to NEEDED, and
>> bitmap_close_sync clears the RESYNC flag. When the disk is added back,
>> the area from the offset of the last recovery to the end of that bitmap
>> chunk is skipped by the sync thread forever.
>>
>> Signed-off-by: Jiao Hui <jiaohui@xxxxxxxxxxxxx>
>>
>> ---
>>  drivers/md/raid1.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index aacf6bf..51d06eb 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -1391,16 +1391,16 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
>>  		return;
>>  	}
>>  	set_bit(Blocked, &rdev->flags);
>> +	/*
>> +	 * if recovery is running, make sure it aborts.
>> +	 */
>> +	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>>  	if (test_and_clear_bit(In_sync, &rdev->flags)) {
>>  		unsigned long flags;
>>  		spin_lock_irqsave(&conf->device_lock, flags);
>>  		mddev->degraded++;
>>  		set_bit(Faulty, &rdev->flags);
>>  		spin_unlock_irqrestore(&conf->device_lock, flags);
>> -		/*
>> -		 * if recovery is running, make sure it aborts.
>> -		 */
>> -		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>>  	} else
>>  		set_bit(Faulty, &rdev->flags);
>>  	set_bit(MD_CHANGE_DEVS, &mddev->flags);
>
>
> Hi,
>  thanks for the report and the patch.
>
> If the recovery process gets a write error it will abort the current bitmap
> region by calling bitmap_end_sync() in end_sync_write().
> However you are talking about a different situation where a normal IO write
> gets an error and fails a drive. Then the recovery aborts without aborting
> the current bitmap region.
>
> I think I would rather fix the bug by calling bitmap_end_sync() at the place
> where the recovery decides to abort, as in the following patch.
> Would you be able to test it please and confirm that it works?
>
> A similar fix will probably be needed for raid10.
>
> Thanks,
> NeilBrown
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 56e24c072b62..4f007a410f4b 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2668,9 +2668,11 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
>
>  	if (write_targets == 0 || read_targets == 0) {
>  		/* There is nowhere to write, so all non-sync
> -		 * drives must be failed - so we are finished
> +		 * drives must be failed - so we are finished.
> +		 * But abort the current bitmap region though.
>  		 */
>  		sector_t rv;
> +		bitmap_end_sync(mddev->bitmap, sector_nr, &sync_blocks, 1);
>  		if (min_bad > 0)
>  			max_sector = sector_nr + min_bad;
>  		rv = max_sector - sector_nr;
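For readers following the reasoning in the commit message, here is a small stand-alone sketch of the bitmap-bit lifecycle it describes. This is not kernel code: the chunk array, the helper names and the single-failure scenario are simplified assumptions for illustration only. It shows why an abort that is never reported to the bitmap loses the NEEDED bit, which is the effect both proposed fixes prevent (Jiao Hui's indirectly, via MD_RECOVERY_INTR and the aborted-resync path; Neil's directly, by calling bitmap_end_sync() at the abort site).

/*
 * Toy model (NOT kernel code) of the bitmap-bit lifecycle described in the
 * commit message above. Names and the chunk layout are simplified assumptions.
 */
#include <stdbool.h>
#include <stdio.h>

#define NCHUNKS 4

struct chunk {
	bool needed;	/* chunk still has to be resynced/recovered   */
	bool resync;	/* a resync pass currently claims this chunk  */
};

static struct chunk bitmap[NCHUNKS];

/* Resync reaches a chunk: trade NEEDED for RESYNC (roughly bitmap_start_sync). */
static void start_sync(int i)
{
	if (bitmap[i].needed) {
		bitmap[i].needed = false;
		bitmap[i].resync = true;
	}
}

/* Chunk finished cleanly: drop the RESYNC claim (bitmap_end_sync, aborted=0). */
static void end_sync_ok(int i)
{
	bitmap[i].resync = false;
}

/* Abort reported for an unfinished chunk: restore NEEDED
 * (bitmap_end_sync with aborted=1). */
static void end_sync_aborted(int i)
{
	if (bitmap[i].resync) {
		bitmap[i].resync = false;
		bitmap[i].needed = true;
	}
}

/* End of the whole resync run: drop any remaining RESYNC claims
 * (roughly bitmap_close_sync). */
static void close_sync(void)
{
	for (int i = 0; i < NCHUNKS; i++)
		bitmap[i].resync = false;
}

int main(void)
{
	bool abort_reported = true;	/* set to false to reproduce the bug */

	/* Fresh recovery target: every chunk still needs to be copied. */
	for (int i = 0; i < NCHUNKS; i++)
		bitmap[i] = (struct chunk){ .needed = true };

	start_sync(0);			/* chunk 0 is copied completely      */
	end_sync_ok(0);
	start_sync(1);			/* target disk fails inside chunk 1  */

	if (abort_reported)		/* what both proposed fixes ensure   */
		end_sync_aborted(1);
	close_sync();

	/* Without the abort being reported, chunk 1 ends up with neither
	 * NEEDED nor RESYNC set, so its half-copied tail is never retried
	 * when the disk is re-added and the mirrors silently differ. */
	printf("chunk 1 will be retried: %s\n",
	       bitmap[1].needed ? "yes" : "NO - mirrors now differ");
	return 0;
}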