Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65

On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@xxxxxxxxxx>
wrote:

> NeilBrown <neilb@xxxxxxx> writes:
> > On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@xxxxxxxxxx>
> > wrote:
> >> > Does this fix it?
> >> >
> >> > NeilBrown
> >> 
> >> Unfortunately no, I still see these crashes with this one applied :(
> >> 
> >
> > Thanks - the symptom looked similar, but now that I look more closely I can
> > see it is quite different.
> >
> > How about this then?  I can't really see what is happening, but based on the
> > patch that you identified it must be related to these flags.
> > It seems that handle_stripe_clean_event() is being called too early, and it
> > doesn't clear out the ->written bios because they are still locked or
> > something.  But it does clear R5_Discard on the parity block, so
> > handle_stripe_clean_event() doesn't get called again.
> >
> > This makes the handling of the various flags somewhat more uniform, which is
> > probably a good thing.
> 
> Hi Neil,
> 
> With this one applied I end up with an OOPS instead. Note I had to
> modify the last test/clear bit sequence to use &sh->dev[i].flags instead
> of &dev->flags to avoid a compiler warning.

Oops.

> 
> I am attaching the test script I am running too. It was written by Eryu
> Guan.

Thanks for that.  I've tried using it but haven't managed to trigger a BUG
yet.  What size are the loop files?  I mostly use fairly small ones, but
maybe they need to be bigger to trigger the problem.

My current guess is that recovery and discard run on the same stripe at the
same time, race, and leave the stripe in a slightly confused state.  But I
haven't found the exact state yet.
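
To make the failure mode I described in my previous mail concrete:
handle_stripe() only calls handle_stripe_clean_event() while the parity
device still looks finished.  Trimmed and paraphrased for the RAID5 case
(not a verbatim quote of raid5.c; the R5_Insync and qdev tests are elided):

	struct r5dev *pdev = &sh->dev[sh->pd_idx];

	/*
	 * Written bios are only returned, and the stripe flags cleared,
	 * while the parity device still looks finished.  If an early
	 * call cleared R5_Discard on pdev but left the data devices'
	 * ->written bios in place (still locked), this test can never
	 * succeed again and the stripe hangs.
	 */
	if (s.written &&
	    !test_bit(R5_LOCKED, &pdev->flags) &&
	    (test_bit(R5_UPTODATE, &pdev->flags) ||
	     test_bit(R5_Discard, &pdev->flags)))
		handle_stripe_clean_event(conf, sh, disks, &s.return_bi);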

The discard code always attaches a 'discard' request to every device in a
stripe_head all at once, under stripe_lock.  However, when ops_run_biodrain()
picks those discard requests off ->towrite and puts them on ->written, it
takes stripe_lock once per device.
Maybe we just need to change the coverage of stripe_lock there so it is held
for the entire loop.  We would still want to drop it before calling
async_copy_data() and reclaim it afterwards, but that wouldn't affect the
'discard' case.
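
Something along these lines, i.e. take the lock once around the whole loop
(completely untested, and the REQ_FUA/REQ_SYNC flag handling that the real
loop does is omitted for brevity):

	/*
	 * Untested sketch: hold stripe_lock across the whole drain loop
	 * so a DISCARD, which was attached to ->towrite on every device
	 * atomically, is also moved to ->written atomically.  Only drop
	 * the lock around async_copy_data(), which can block and which
	 * the discard case never reaches.
	 */
	spin_lock_irq(&sh->stripe_lock);
	for (i = disks; i--; ) {
		struct r5dev *dev = &sh->dev[i];
		struct bio *wbi;

		if (!test_and_clear_bit(R5_Wantdrain, &dev->flags))
			continue;

		wbi = dev->towrite;
		dev->towrite = NULL;
		BUG_ON(dev->written);
		dev->written = wbi;

		while (wbi && wbi->bi_sector <
		       dev->sector + STRIPE_SECTORS) {
			if (wbi->bi_rw & REQ_DISCARD) {
				set_bit(R5_Discard, &dev->flags);
			} else {
				spin_unlock_irq(&sh->stripe_lock);
				tx = async_copy_data(1, wbi, dev->page,
						     dev->sector, tx);
				spin_lock_irq(&sh->stripe_lock);
			}
			wbi = r5_next_bio(wbi, dev->sector);
		}
	}
	spin_unlock_irq(&sh->stripe_lock);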

> 
> 
> [ 2623.554780] kernel BUG at drivers/md/raid5.c:2954!

Could you confirm exactly which line this was - there are a few BUG_ON()s
around there.  They are all related to R5_UPTODATE not being set I think,
but it might help to know exactly when it isn't set.


Thanks,
NeilBrown
