Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Tue, 17 Sep 2024 14:25:17 +0100

On Tue, Sep 17, 2024 at 01:13:05PM +0200, Chris Mason wrote:
> On 9/17/24 5:32 AM, Matthew Wilcox wrote:
> > On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
> >> I've got a bunch of assertions around incorrect folio->mapping and I'm
> >> trying to bash on the ENOMEM for readahead case.  There's a GFP_NOWARN
> >> on those, and our systems do run pretty short on RAM, so it feels right
> >> at least.  We'll see.
> > 
> > I've been running with some variant of this patch the whole way across
> > the Atlantic, and not hit any problems.  But maybe with the right
> > workload ...?
> > 
> > There are two things being tested here.  One is whether we have a
> > cross-linked node (ie a node that's in two trees at the same time).
> > The other is whether the slab allocator is giving us a node that already
> > contains non-NULL entries.
> > 
> > If you could throw this on top of your kernel, we might stand a chance
> > of catching the problem sooner.  If it is one of these problems and not
> > something weirder.
> > 
> 
> This fires in roughly 10 seconds for me on top of v6.11.  Since array seems
> to always be 1, I'm not sure if the assertion is right, but hopefully you
> can trigger it yourself.

Whoops.

$ git grep XA_RCU_FREE
lib/xarray.c:#define XA_RCU_FREE        ((struct xarray *)1)
lib/xarray.c:   node->array = XA_RCU_FREE;

so you walked into a node which is currently being freed by RCU.  Which
isn't a problem, of course.  I don't know why I do that; it doesn't seem
like anyone tests it.  The jetlag is seriously kicking in right now,
so I'm going to refrain from saying anything more because it probably
won't be coherent.
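
For anyone following along, here is a rough userspace sketch of why a
value of 1 in node->array can trip a cross-link assertion.  The debugging
patch is not shown in this message, so the sketch assumes the check
compares node->array with the xarray being walked.  Only XA_RCU_FREE
comes from the grep output above; struct sketch_node, the stand-in
struct xarray and all sketch_* functions are hypothetical names made up
for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in type: only pointer identity matters in this sketch. */
struct xarray { int unused; };

/* Same definition as in the grep output above. */
#define XA_RCU_FREE ((struct xarray *)1)

/* Hypothetical cut-down node; the real struct xa_node has more fields. */
struct sketch_node {
	struct xarray *array;	/* owning array, or XA_RCU_FREE sentinel */
};

/* True if the node carries the sentinel seen as "array is 1". */
bool sketch_node_being_freed(const struct sketch_node *node)
{
	return node->array == XA_RCU_FREE;
}

/*
 * Assumed shape of a cross-link check: skip nodes carrying the
 * sentinel, and only report nodes owned by a different array.
 */
bool sketch_node_cross_linked(const struct sketch_node *node,
			      const struct xarray *expected)
{
	if (sketch_node_being_freed(node))
		return false;
	return node->array != expected;
}

/* Walk through the three cases; returns 0 if every assert holds. */
int sketch_selfcheck(void)
{
	struct xarray a, b;
	struct sketch_node node = { .array = &a };

	assert(!sketch_node_cross_linked(&node, &a));	/* owned by a */
	assert(sketch_node_cross_linked(&node, &b));	/* wrong owner */

	node.array = XA_RCU_FREE;			/* sentinel set */
	assert(!sketch_node_cross_linked(&node, &a));	/* not reported */
	return 0;
}
```

Without the sketch_node_being_freed() test, the last case would be
reported as a cross-linked node, which matches the "array seems to
always be 1" observation quoted above.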