Re: [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error()

Nate Dailey <nate.dailey@xxxxxxxxxxx> writes:

> The problem is that we aren't getting true write (medium) errors.
>
> In this case we're testing device removals. The write errors happen because the
> disk goes away. narrow_write_error() returns 1, the bitmap bit is cleared, and
> then when the device is re-added the resync might not include the sectors in
> that chunk (there's some luck involved; if other writes to that chunk happen
> while the disk is removed, we're okay--the bug is easier to hit with smaller
> bitmap chunks because of this).
>
>
OK, that makes sense.

The device removal will be noticed when the bad-block log is written
out.
When a bad block is recorded we make sure to write the log out promptly,
before bio_endio() gets called, but not before close_write() has called
bitmap_endwrite().  So the bitmap bit can be cleared before the failure
is known.

So I guess we need to delay the close_write() call until the
bad-block-log has been written.
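
To illustrate why the ordering matters, here is a toy user-space model.
It is not kernel code: every toy_* name is made up for illustration, and
it assumes only that close_write() leaves the bitmap bit set when the
array is already known to be degraded, and that a failed bad-block-log
write is what exposes the removal.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */

enum toy_order {
	TOY_CLOSE_BEFORE_LOG,	/* close_write() runs before the bad-block-log write */
	TOY_CLOSE_AFTER_LOG,	/* close_write() is delayed until after it */
};

struct toy_array {
	bool device_present;	/* false once the disk has been pulled */
	bool degraded;		/* set when the array notices the device is gone */
	bool bitmap_bit;	/* true while the chunk is still marked dirty */
};

/* Writing the bad-block log to a missing device fails, exposing the removal. */
static void toy_write_bad_block_log(struct toy_array *a)
{
	if (!a->device_present)
		a->degraded = true;
}

/* Stand-in for close_write(): the bit is cleared only if no failure is known. */
static void toy_close_write(struct toy_array *a)
{
	if (!a->degraded)
		a->bitmap_bit = false;
}

/* Returns true if the chunk is still marked dirty after the write completes. */
bool toy_bit_survives(enum toy_order order, bool device_present)
{
	struct toy_array a = {
		.device_present = device_present,
		.degraded = false,
		.bitmap_bit = true,
	};

	if (order == TOY_CLOSE_BEFORE_LOG) {
		toy_close_write(&a);
		toy_write_bad_block_log(&a);
	} else {
		toy_write_bad_block_log(&a);
		toy_close_write(&a);
	}
	return a.bitmap_bit;
}
```

With the disk removed, toy_bit_survives(TOY_CLOSE_BEFORE_LOG, false)
returns false (the bit is lost), while
toy_bit_survives(TOY_CLOSE_AFTER_LOG, false) returns true, so a later
re-add would still resync that chunk.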

I think this patch should do it.  Can you test?

Thanks,
NeilBrown

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index c1ad0b075807..1a1c5160c930 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2269,8 +2269,6 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 			rdev_dec_pending(conf->mirrors[m].rdev,
 					 conf->mddev);
 		}
-	if (test_bit(R1BIO_WriteError, &r1_bio->state))
-		close_write(r1_bio);
 	if (fail) {
 		spin_lock_irq(&conf->device_lock);
 		list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
@@ -2396,6 +2394,9 @@ static void raid1d(struct md_thread *thread)
 			r1_bio = list_first_entry(&tmp, struct r1bio,
 						  retry_list);
 			list_del(&r1_bio->retry_list);
+			if (mddev->degraded)
+				set_bit(R1BIO_Degraded, &r1_bio->state);
+			close_write(r1_bio);
 			raid_end_bio_io(r1_bio);
 		}
 	}
