Re: [PATCH 3/3 v2] md: unblock array if bad blocks have been acknowledged

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 19, 2016 at 10:28:18PM -0700, Shaohua Li wrote:
> On Tue, Oct 18, 2016 at 04:10:24PM +0200, Tomasz Majchrzak wrote:
> > Once external metadata handler acknowledges all bad blocks (by writing
> > to rdev 'bad_blocks' sysfs file), it requests to unblock the array.
> > Check if all bad blocks are actually acknowledged as there might be a
> > race if new bad blocks are notified at the same time. If all bad blocks
> > are acknowledged, just unblock the array and continue. If not, ignore
> > the request to unblock (do not fail an array). External metadata handler
> > is expected to either process remaining bad blocks and try to unblock
> > again or remove bad block support for a disk (which will cause disk to
> > fail as in no-support case).
> > 
> > Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@xxxxxxxxx>
> > Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@xxxxxxxxx>
> > ---
> >  drivers/md/md.c | 24 +++++++++++++++++-------
> >  1 file changed, 17 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index cc05236..ce585b7 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -2612,19 +2612,29 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
> >  		set_bit(Blocked, &rdev->flags);
> >  		err = 0;
> >  	} else if (cmd_match(buf, "-blocked")) {
> > -		if (!test_bit(Faulty, &rdev->flags) &&
> > +		int unblock = 1;
> > +		int acked = !rdev->badblocks.unacked_exist;
> > +
> > +		if ((test_bit(ExternalBbl, &rdev->flags) &&
> > +		     rdev->badblocks.changed))
> > +			acked = check_if_badblocks_acked(&rdev->badblocks);
> > +
> > +		if (test_bit(ExternalBbl, &rdev->flags) && !acked) {
> > +			unblock = 0;
> > +		} else if (!test_bit(Faulty, &rdev->flags) &&
> 
> I missed one thing in last review. writing to bad_blocks sysfs file already
> clears the BlockedBadBlocks bit and wakeup the thread sleeping at blocked_wait,
> so the array can continue. Why do we need to fix state_store here?

We cannot unblock the rdev until all bad blocks are acknowledged. The problem is
mdadm cannot be sure it has stored all bad blocks in the first pass (read of
unacknowledged_bad_blocks sysfs file). When bad block is encountered, rdev
enters Blocked, Faulty state (unacked_exists is non-zero in state_show). Then
mdadm reads the bad block, stores it in metadata and acknowledges it to the
kernel. Initially I have tried to call ack_all_badblocks in bb_store or in
state_store("-blocked") but there was a race. If other requests (the ones that
had started before array got into blocked state) notified bad blocks after sysfs
file was read by mdadm but before ack_all_badblocks call, ack_all_badblocks call
was also acknowledging bad blocks not stored (and never to be as a result) in
metadata. That's why I have introduced a new function
check_if_all_badblocks_acked to close this race.

I'm not sure if native bad block support is not prone to the similar problem -
bad block structure modified between metadata sync and ack_all_badblocks call.

As for BlockedBadBlocks flag cleared in bb_store, commit de393cdea66c ("md: make
it easier to wait for bad blocks to be acknowledged") explains this flag is only
an advisory. All awaiting requests are woken up and check if bad block that
interests them is already acknowledged. If so, then can continue, and if not,
they set the flag again to check in a while. It is just a useful optimization.

Please note that rdev with unacknowledged bad block is reported as Blocked via
sysfs state (non-zero unacked_exists), even though the corresponding rdev kernel
flag is not set. It is the reason why mdadm calls state_store("-blocked").

I hope it clarifies all your doubts.

Tomek
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux