Re: [PATCH] badblocks: fix overlapping check for clearing

On Wed, Oct 12 2016, Tomasz Majchrzak wrote:

> On Mon, Oct 10, 2016 at 03:32:58PM -0700, Dan Williams wrote:
>> > On Tue, Sep 06 2016, Tomasz Majchrzak wrote:
>> >> ---
>> >>  block/badblocks.c | 6 ++++--
>> >>  1 file changed, 4 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/block/badblocks.c b/block/badblocks.c
>> >> index 7be53cb..b2ffcc7 100644
>> >> --- a/block/badblocks.c
>> >> +++ b/block/badblocks.c
>> >> @@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>> >>                * current range.  Earlier ranges could also overlap,
>> >>                * but only this one can overlap the end of the range.
>> >>                */
>> >> -             if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
>> >> +             if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
>> >> +                 (BB_OFFSET(p[lo]) <= target)) {
>> >
>> > hmmm..
>> > 'target' is the sector just beyond the set of sectors to remove from the
>> > list.
>> > BB_OFFSET(p[lo]) is the first sector in a range that was found in the
>> > list.
>> > If these are equal, then we aren't clearing anything in this range.
>> > So I would have '<', not '<='.
>> >
>> > I don't think this makes the code wrong as we end up assigning to p[lo]
>> > the value that is already there.  But it might be confusing.
>> >
>> >
>> >>                       /* Partial overlap, leave the tail of this range */
>> >>                       int ack = BB_ACK(p[lo]);
>> >>                       sector_t a = BB_OFFSET(p[lo]);
>> >> @@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>> >>                       lo--;
>> >>               }
>> >>               while (lo >= 0 &&
>> >> -                    BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
>> >> +                    (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
>> >> +                    (BB_OFFSET(p[lo]) <= target)) {
>> >
>> > Ditto.
>> >
>> > But the code is, I think, correct. Just not how I would have written it.
>> > So
>> >
>> >  Acked-by: NeilBrown <neilb@xxxxxxxx>
>> 
>> I agree with the comments to change "<=" to "<".  Tomasz, care to
>> re-send with those changes?
>
> I have just resent the patch with your suggestions included.
>
>> > In the original md context, it would only ever be called on a block that
>> > was already in the list.
>
> Actually MD RAID10 calls it this way. See handle_write_completed: it iterates
> over all copies and clears the bad block if no error has been returned. I have
> a test case which fails for that reason: an existing bad block is modified by
> the clear operation. It is very unlikely to happen in real life as it depends on
> a specific layout of bad blocks and their discovery order; however, it's a gap
> that needs to be closed.

Ahh, I didn't realize that.  I see that you are correct though.

>
> I have put some effort into checking whether clearing a non-existent bad block
> in RAID10 can lead to incorrect behaviour, but I haven't found any. It seems
> that my patch is sufficient to fix the problem.

Yes.  Thanks a lot for sorting this out :-)

NeilBrown
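
For readers following the '<' versus '<=' discussion above, here is a
minimal standalone sketch of the overlap test being debated. It is not the
kernel code: sector_t is a userspace stand-in typedef, and the helper names
(overlaps_clear, old_loop_check, boundary_examples) are hypothetical, chosen
only for illustration. It assumes, as described in the thread, that the
cleared region is the half-open interval [s, target) and that a stored bad
range covers [start, start + len).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's sector_t (assumption). */
typedef uint64_t sector_t;

/*
 * Hypothetical helper: true if the bad range [start, start + len)
 * intersects the cleared region [s, target).  'target' is the sector
 * just beyond the cleared region, so the comparison on the start of
 * the range is strict ('<'), as suggested in the review.
 */
static bool overlaps_clear(sector_t start, sector_t len,
                           sector_t s, sector_t target)
{
        return start + len > s && start < target;
}

/* Shape of the pre-patch loop condition: only the end of the range is tested. */
static bool old_loop_check(sector_t start, sector_t len, sector_t s)
{
        return start + len > s;
}

/* Boundary cases for clearing sectors [90, 100), i.e. s = 90, target = 100. */
static void boundary_examples(void)
{
        /* Range [95, 105) straddles the end of the cleared region. */
        assert(overlaps_clear(95, 10, 90, 100));

        /* Range [100, 110) starts exactly at 'target': nothing to clear. */
        assert(!overlaps_clear(100, 10, 90, 100));

        /* Range [200, 208) lies entirely beyond the cleared region, yet
         * the end-only test still reports an overlap. */
        assert(old_loop_check(200, 8, 90));
        assert(!overlaps_clear(200, 8, 90, 100));

        /* Range [80, 90) ends exactly at 's': also no overlap. */
        assert(!overlaps_clear(80, 10, 90, 100));
}
```

Calling boundary_examples() from a small main() exercises the cases. The
range starting exactly at 'target' is the one where '<=' and '<' differ:
with '<=' it would be treated as overlapping even though no sector in it
is being cleared.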


