Re: dm raid1: "mirror" target doesn't use all available legs on multiple failures

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/10/2016 10:41 PM, Mike Snitzer wrote:
On Mon, Oct 10 2016 at 12:48pm -0400,
Heinz Mauelshagen <heinzm@xxxxxxxxxx> wrote:

In case legs of a "mirror" target fail, any read will cause a new,
operational default leg to be selected and the read be resubmitted
to it. If that new default leg fails the read too, no other still
accessible legs are used to resubmit the read again thus failing
the io.

Fix by allowing the read to get resubmitted until there's no
operational legs any more.

Resolves: rhbz1383444

Signed-off-by: Heinz Mauelshagen <heinzm@xxxxxxxxxx>
Nothing seems to be checking bio_record->details.bi_bdev anymore.
(The one you've removed really seems like a complete hack to begin with)

The very point of removing "if (!bio_record->details.bi_bdev) { ..." is
to  allow for resubmission of read bios to any other still operational leg
after an initial resubmission failed.
do_reads() will conditionally error the bio in case there are none or
no in sync ones available.

The "mirror" processing for read bios is:
1) bio gets remapped to the default mirror in mirror_map() if region is
    in sync our queued to the worker for submission
2) mirror_end_io() spots read error and requeues to the worker
3) worker
    - submits bio to new, selected default if respective region is in sync
    or
    - fails bio if it aims at a non synchronized region
4) if not already failed in do_reads(),
    mirror_end_io() processes like in step 1 _but_
errors the read because of the "if (!bio_record->details.bi_bdev) { ..."


My patch changes step 4 by removing the conditional to resubmit it
again in case any other operational leg is available. If none's available,
mirror_end_io() errors the read bios or the do_reads error logic applies
in case of a resubmission.

Why do you think this to be a hack?


Shouldn't this patch go further by removing the other 2 places that set
bio_record->details.bi_bdev = NULL; ?

Yes, that's a good cleanup.

Let's first settle any still open issues like other unspotted side effects
and I'll do a patch v2 with it.


---
  drivers/md/dm-raid1.c | 11 -----------
  1 file changed, 11 deletions(-)

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 7a6254d..dd31019 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1266,16 +1266,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
  		goto out;
if (unlikely(error)) {
-		if (!bio_record->details.bi_bdev) {
-			/*
-			 * There wasn't enough memory to record necessary
-			 * information for a retry or there was no other
-			 * mirror in-sync.
-			 */
-			DMERR_LIMIT("Mirror read failed.");
-			return -EIO;
-		}
-
  		m = bio_record->m;
DMERR("Mirror read failed from %s. Trying alternative device.",
@@ -1291,7 +1281,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
  			bd = &bio_record->details;
dm_bio_restore(bd, bio);
-			bio_record->details.bi_bdev = NULL;
  			bio->bi_error = 0;
queue_bio(ms, bio, rw);
--
2.7.4

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux