On Mon, 18 Jun 2012 11:13:48 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote:

> If rdev became blocked because the unack badblocks, it did not exec
> md_wait_for_blocked_rdev in handle_stripe().So the rdev->nr_pending did
> not decrease.So rdev did not remove because the wrong nr_pending.
> Signed-off-by: majianpeng <majianpeng@xxxxxxxxx>
> ---
>  drivers/md/raid5.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index d267672..ed63261 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3582,7 +3582,8 @@ static void handle_stripe(struct stripe_head *sh)
>
>  finish:
>  	/* wait for this device to become unblocked */
> -	if (conf->mddev->external && unlikely(s.blocked_rdev))
> +	if (unlikely(s.blocked_rdev) && (conf->mddev->external ||
> +		test_bit(BlockedBadBlocks, &(s.blocked_rdev->flags))))
>  		md_wait_for_blocked_rdev(s.blocked_rdev, conf->mddev);
>
>  	if (s.handle_bad_blocks)

Thanks for finding this.
However, I don't think your patch is quite correct. It would re-introduce a
hang fixed by commit 43220aa0f22cd3ce5b3.

I've applied the following instead.

Thanks,
NeilBrown

From 0cee6aeb02b1ef947be62bb455f64720ecba2b4c Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Wed, 27 Jun 2012 13:43:54 +1000
Subject: [PATCH] md/raid5: fix refcount problem when blocked_rdev is set.

commit 43220aa0f22cd3ce5b30246d50ccd696d119edea
    md/raid5: fix a hang on device failure.

fixed a hang, but introduced a refcounting imbalance so that if the
presence of bad-blocks ever caused an rdev to be 'blocked' we would
increment the refcount on the rdev and never decrement it.

So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
is not called.

Reported-by: majianpeng <majianpeng@xxxxxxxxx>
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index befadb4..e23cd59 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3588,8 +3588,17 @@ static void handle_stripe(struct stripe_head *sh)
 
 finish:
 	/* wait for this device to become unblocked */
-	if (conf->mddev->external && unlikely(s.blocked_rdev))
-		md_wait_for_blocked_rdev(s.blocked_rdev, conf->mddev);
+	if (unlikely(s.blocked_rdev)) {
+		if (conf->mddev->external)
+			md_wait_for_blocked_rdev(s.blocked_rdev,
+						 conf->mddev);
+		else
+			/* Internal metadata will immediately
+			 * be written by raid5d, so we don't
+			 * need to wait here.
+			 */
+			rdev_dec_pending(rdev, mddev);
+	}
 
 	if (s.handle_bad_blocks)
 		for (i = disks; i--; ) {
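
To make the imbalance described in the commit message concrete, here is a
small userspace model of it. This is only a sketch, not kernel code: every
toy_* name is hypothetical, nr_pending is a plain int rather than an atomic
counter, and it assumes (as the commit message implies) that a reference is
taken when a blocked device is recorded and that md_wait_for_blocked_rdev()
drops that reference once it returns. The decrement is applied to the
blocked device itself, which is how I read the intent of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rdev: only the pending-reference count. */
struct toy_rdev {
	int nr_pending;
};

/* Models recording a blocked device: a reference is taken on it. */
struct toy_rdev *toy_note_blocked(struct toy_rdev *rdev)
{
	rdev->nr_pending++;
	return rdev;
}

/* Models rdev_dec_pending(): drop one reference without waiting. */
void toy_dec_pending(struct toy_rdev *rdev)
{
	assert(rdev->nr_pending > 0);	/* catch an underflow in the model */
	rdev->nr_pending--;
}

/*
 * Models md_wait_for_blocked_rdev(): the real function would sleep until
 * the device is unblocked; the model only drops the reference afterwards.
 */
void toy_wait_for_blocked(struct toy_rdev *rdev)
{
	toy_dec_pending(rdev);
}

/* Tail of handle_stripe() before the fix: nothing happens if !external. */
void toy_finish_old(struct toy_rdev *blocked, bool external)
{
	if (external && blocked)
		toy_wait_for_blocked(blocked);
}

/* Tail of handle_stripe() after the fix: both branches drop the reference. */
void toy_finish_new(struct toy_rdev *blocked, bool external)
{
	if (blocked) {
		if (external)
			toy_wait_for_blocked(blocked);
		else
			toy_dec_pending(blocked);
	}
}

/* Run one pass over a blocked device and return the leftover refcount. */
int toy_leftover(bool use_fix, bool external)
{
	struct toy_rdev rdev = { .nr_pending = 0 };
	struct toy_rdev *blocked = toy_note_blocked(&rdev);

	if (use_fix)
		toy_finish_new(blocked, external);
	else
		toy_finish_old(blocked, external);
	return rdev.nr_pending;
}
```

In this model toy_leftover(false, false) returns 1, which corresponds to
the reference that kept the rdev from being removed with internal metadata,
while toy_leftover(true, false) returns 0. With external metadata both
variants return 0, so the fix changes only the internal-metadata path and
does not add a wait there.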