Re: generic/418 regression seen on 5.12-rc3

Eric Whitney <enwlinux@xxxxxxxxx> · Thu, 18 Mar 2021 17:38:08 -0400

* Matthew Wilcox <willy@xxxxxxxxxxxxx>:
> On Thu, Mar 18, 2021 at 02:16:13PM -0400, Eric Whitney wrote:
> > As mentioned in today's ext4 concall, I've seen generic/418 fail from time to
> > time when run on 5.12-rc3 and 5.12-rc1 kernels.  This first occurred when
> > running the 1k test case using kvm-xfstests.  I was then able to bisect the
> > failure to a patch landed in the -rc1 merge window:
> > 
> > (bd8a1f3655a7) mm/filemap: support readpage splitting a page
> 
> Thanks for letting me know.  This failure is new to me.

Sure - it's useful to know that it's new to you.  Ted said he's also going
to test XFS with a large number of generic/418 trials, which would be a
useful comparison.  However, he's had no luck as yet reproducing what I've
seen when running ext4 on his Google Compute Engine test setup.

> 
> I don't understand it; this patch changes the behaviour of buffered reads
> from waiting on a page with a refcount held to waiting on a page without
> the refcount held, then starting the lookup from scratch once the page
> is unlocked.  I find it hard to believe this introduces a /new/ failure.
> Either it makes an existing failure easier to hit, or there's a subtle
> bug in the retry logic that I'm not seeing.
> 

To keep Murphy at bay, I'm rerunning the bisection from scratch just
to make sure I come out at the same patch.  The initial bisection looked
clean, but when dealing with a failure that occurs probabilistically it's
easy enough to get it wrong.  Is this patch revertable in -rc1 or -rc3?
Ordinarily I like to do that for confirmation.
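
As a rough illustration of why a probabilistic failure makes bisection easy
to get wrong, here is a minimal sketch that estimates how often a bad kernel
would pass every trial and be marked good by mistake.  It assumes each run
fails independently at a fixed rate, which may not hold for a race like this
one.  The function names are hypothetical; the 2% and 10% rates are simply
the low end of the 1k results and the 4k result reported in this thread.

```python
import math


def false_good_probability(fail_rate: float, runs: int) -> float:
    """Chance that a bad kernel passes all `runs` trials (assumes independent runs)."""
    return (1.0 - fail_rate) ** runs


def runs_needed(fail_rate: float, confidence: float) -> int:
    """Smallest number of runs for which a bad kernel fails at least once
    with probability >= confidence (same independence assumption)."""
    # Solve (1 - fail_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


if __name__ == "__main__":
    for rate in (0.02, 0.10):
        miss = false_good_probability(rate, 100)
        need = runs_needed(rate, 0.99)
        print(f"fail rate {rate:.0%}: P(100 clean runs on a bad kernel) = {miss:.3f}, "
              f"runs for 99% confidence = {need}")
```

Under these assumptions, 100 clean runs at a 2% failure rate would still
leave a noticeable chance of marking a bad commit as good, so a bisection
step at that rate wants more trials than one at 10%.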

And there's always the chance that a latent ext4 bug is being hit.

> > Typical test output resulting from a failure looks like:
> > 
> >      QA output created by 418
> >     +cmpbuf: offset 0: Expected: 0x1, got 0x0
> >     +[6:0] FAIL - comparison failed, offset 3072
> >     +diotest -w -b 512 -n 8 -i 4 failed at loop 0
> >      Silence is golden
> >     ...
> > 
> > I've also been able to reproduce the failure on -rc3 in the 4k test case as
> > well.  The failure frequency there was 10 out of 100 runs.  It was anywhere
> > from 2 to 8 failures out of 100 runs in the 1k case.
> > 
> > So, the failure isn't dependent upon block size less than page size.
> 
> That's a good data point.  I'll take a look at g/418 and see if i can
> figure out what race we're hitting.  Nice that it happens so often.
> I suppose I could get you to put some debugging in -- maybe dumping the
> page if we hit a contended case, then again if we're retrying?
> 
> I presume it doesn't always happen at the same offset or anything
> convenient like that.

I'd be very happy to run whatever debugging patches you'd like, though
you might want to wait until I've reproduced the bisection result.  The
offsets vary, unfortunately - I've seen 1024, 2048, and 3072 reported when
running a file system with 4k blocks.

Thanks,
Eric