On Mon, 17 Sep 2012 14:15:16 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
wrote:

> Hi Neil,
> below is a bit less-ugly version of the patch.
> Thanks,
> Alex.
>
> >From 05cf800d623bf558c99d542cf8bf083c85b7e5d5 Mon Sep 17 00:00:00 2001
> From: Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
> Date: Thu, 13 Sep 2012 18:55:00 +0300
> Subject: [PATCH] When RAID5 is dirty, force reconstruct-write instead of
> read-modify-write.
>
> Signed-off-by: Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Yair Hershko <yair@xxxxxxxxxxxxxxxxx>
>
> diff --git a/ubuntu_kmodules/Ubuntu-3.2.0-25.40/drivers/md/raid5.c
> b/ubuntu_kmodules/Ubuntu-3.2.0-25.40/drivers/md/raid5.c
> index 5332202..0702785 100644
> --- a/ubuntu_kmodules/Ubuntu-3.2.0-25.40/drivers/md/raid5.c
> +++ b/ubuntu_kmodules/Ubuntu-3.2.0-25.40/drivers/md/raid5.c
> @@ -2555,12 +2555,36 @@ static void handle_stripe_dirtying(struct r5conf *conf,
>  				  int disks)
>  {
>  	int rmw = 0, rcw = 0, i;
> -	if (conf->max_degraded == 2) {
> -		/* RAID6 requires 'rcw' in current implementation
> -		 * Calculate the real rcw later - for now fake it
> +	sector_t recovery_cp = conf->mddev->recovery_cp;
> +	unsigned long recovery = conf->mddev->recovery;
> +	int needed = test_bit(MD_RECOVERY_NEEDED, &recovery);
> +	int resyncing = test_bit(MD_RECOVERY_SYNC, &recovery) &&
> +		!test_bit(MD_RECOVERY_REQUESTED, &recovery) &&
> +		!test_bit(MD_RECOVERY_CHECK, &recovery);
> +	int transitional = test_bit(MD_RECOVERY_RUNNING, &recovery) &&
> +		!test_bit(MD_RECOVERY_SYNC, &recovery) &&
> +		!test_bit(MD_RECOVERY_RECOVER, &recovery) &&
> +		!test_bit(MD_RECOVERY_DONE, &recovery) &&
> +		!test_bit(MD_RECOVERY_RESHAPE, &recovery);

Thanks Alex,
however I don't understand why you want to test all of these bits.
Isn't it enough just to check ->recovery_cp ??

> +
> +	/* RAID6 requires 'rcw' in current implementation.
> +	 * Otherwise, attempt to check whether resync is now happening
> +	 * or should start.
> +	 * If yes, then the array is dirty (after unclean shutdown or
> +	 * initial creation), so parity in some stripes might be inconsistent.
> +	 * In this case, we need to always do reconstruct-write, to ensure
> +	 * that in case of drive failure or read-error correction, we
> +	 * generate correct data from the parity.
> +	 */
> +	if (conf->max_degraded == 2 ||
> +	    (recovery_cp < MaxSector && sh->sector >= recovery_cp &&
> +	     (needed || resyncing || transitional))) {
> +		/* Calculate the real rcw later - for now fake it
>  		 * look like rcw is cheaper

Also, we should probably fix this comment.  s/fake/make/

Thanks,
NeilBrown

>  		 */
>  		rcw = 1; rmw = 2;
> +		pr_debug("force RCW max_degraded=%u, recovery_cp=%lu
> sh->sector=%lu recovery=0x%lx\n",
> +			conf->max_degraded, recovery_cp, sh->sector, recovery);
>  	} else for (i = disks; i--; ) {
>  		/* would I have to read this buffer for read_modify_write */
>  		struct r5dev *dev = &sh->dev[i];
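
For illustration, the simpler test suggested in the reply (looking only at ->recovery_cp rather than at the individual MD_RECOVERY_* bits) might look roughly like the standalone sketch below. It is userspace C, not kernel code: stripe_needs_rcw(), MAX_SECTOR and the sector_t typedef are hypothetical stand-ins introduced here, and the sketch assumes that MaxSector is the all-ones sector_t value and that stripes at or beyond recovery_cp are the ones whose parity may still be inconsistent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's sector_t and MaxSector (assumptions). */
typedef uint64_t sector_t;
#define MAX_SECTOR (~(sector_t)0)

/*
 * Hypothetical helper, not part of the patch: decide whether a stripe
 * write should be forced to reconstruct-write (rcw).
 *
 * max_degraded  - 2 for RAID6, which the quoted code always treats as rcw
 * recovery_cp   - resync checkpoint; MAX_SECTOR means the array is clean
 * stripe_sector - start sector of the stripe being written
 */
bool stripe_needs_rcw(int max_degraded, sector_t recovery_cp,
		      sector_t stripe_sector)
{
	if (max_degraded == 2)
		return true;

	/* Parity at or beyond the checkpoint has not been resynced yet. */
	return recovery_cp < MAX_SECTOR && stripe_sector >= recovery_cp;
}
```

With this form, a stripe below the checkpoint can still use read-modify-write, because its parity has already been brought into sync, while a clean array (recovery_cp equal to MAX_SECTOR) never forces rcw on RAID5.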