From: Heinz Mauelshagen <heinzm@xxxxxxxxxx>
Hi Neil,
The reconstruct-write optimization in raid5's fetch_block() causes
livelocks in LVM raid4/5 tests.
Test scenario:
the tests wait for the initial array resynchronization to complete, then
make a filesystem on the raid4/5 logical volume, mount it, write to the
filesystem, and fail one physical volume holding a raiddev.
In short, we're seeing livelocks on fully synchronized raid4/5 arrays
with a failed device.
This patch fixes the issue, but likely in a suboptimal way.
Do you think there is a better solution to avoid livelocks on
reconstruct writes?
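
For readers without the raid5.c source at hand, the sub-condition touched by
the patch can be modelled as a standalone predicate. This is only a sketch:
struct rcw_state and both function names are hypothetical, the kernel flag
tests are flattened into plain fields, and non_overwrite is assumed to count
pending writes that do not overwrite a whole block.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the state the changed clause consults.
 * Not a kernel structure; each field stands in for the expression noted. */
struct rcw_state {
    int  level;            /* sh->raid_conf->level */
    int  failed;           /* s->failed */
    bool fdev0_towrite;    /* fdev[0]->towrite is non-NULL */
    bool dev_insync;       /* test_bit(R5_Insync, &dev->flags) */
    bool preread_active;   /* test_bit(STRIPE_PREREAD_ACTIVE, &sh->state) */
    int  non_overwrite;    /* s->non_overwrite (assumed to be a counter) */
    bool fdev0_overwrite;  /* test_bit(R5_OVERWRITE, &fdev[0]->flags) */
};

/* The clause as it reads before the patch. */
bool want_block_old(const struct rcw_state *st)
{
    return st->level <= 5 && st->failed && st->fdev0_towrite &&
           (!st->dev_insync || st->preread_active) &&
           !st->fdev0_overwrite;
}

/* The clause with the patch applied: s->non_overwrite is OR-ed in. */
bool want_block_new(const struct rcw_state *st)
{
    return st->level <= 5 && st->failed && st->fdev0_towrite &&
           (!st->dev_insync || st->preread_active || st->non_overwrite) &&
           !st->fdev0_overwrite;
}
```

With level 5, one failed device that has a partial write pending, an in-sync
dev, STRIPE_PREREAD_ACTIVE clear and non_overwrite set, want_block_old()
returns false while want_block_new() returns true, so only the patched clause
asks for the block to be fetched in that state.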
Regards,
Heinz
Signed-off-by: Heinz Mauelshagen <heinzm@xxxxxxxxxx>
Tested-by: Jon Brassow <jbrassow@xxxxxxxxxx>
Tested-by: Heinz Mauelshagen <heinzm@xxxxxxxxxx>
---
drivers/md/raid5.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c1b0d52..0fc8737 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2915,7 +2915,7 @@ static int fetch_block(struct stripe_head *sh, struct stripe_head_state *s,
(s->failed >= 1 && fdev[0]->toread) ||
(s->failed >= 2 && fdev[1]->toread) ||
(sh->raid_conf->level <= 5 && s->failed && fdev[0]->towrite &&
- (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) &&
+ (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state) || s->non_overwrite) &&
!test_bit(R5_OVERWRITE, &fdev[0]->flags)) ||
((sh->raid_conf->level == 6 ||
sh->sector >= sh->raid_conf->mddev->recovery_cp)
--
2.1.0