Re: [PATCH v2] xfs: fix internal error from AGFL exhaustion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 31, 2023 at 02:24:00PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 30, 2023 at 02:00:02PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@xxxxxx>
> > 
> > We've been seeing XFS errors like the following:
> > 
> > XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c.  Caller xfs_btree_insert+0x1ec/0x280
> > ...
> > Call Trace:
> >  xfs_corruption_error+0x94/0xa0
> >  xfs_btree_insert+0x221/0x280
> >  xfs_alloc_fixup_trees+0x104/0x3e0
> >  xfs_alloc_ag_vextent_size+0x667/0x820
> >  xfs_alloc_fix_freelist+0x5d9/0x750
> >  xfs_free_extent_fix_freelist+0x65/0xa0
> >  __xfs_free_extent+0x57/0x180
> > ...
> > 
> > This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when
> > xfs_btree_insrec() fails.
> > 
> > After converting this into a panic and dissecting the core dump, I found
> > that xfs_btree_insrec() is failing because it's trying to split a leaf
> > node in the cntbt when the AG free list is empty. In particular, it's
> > failing to get a block from the AGFL _while trying to refill the AGFL_.
> > 
> > If a single operation splits every level of the bnobt and the cntbt (and
> > the rmapbt if it is enabled) at once, the free list will be empty. Then,
> > when the next operation tries to refill the free list, it allocates
> > space. If the allocation does not use a full extent, it will need to
> > insert records for the remaining space in the bnobt and cntbt. And if
> > those new records go in full leaves, the leaves (and potentially more
> > nodes up to the old root) need to be split.
> > 
> > Fix it by accounting for the additional splits that may be required to
> > refill the free list in the calculation for the minimum free list size.
> > 
> > P.S. As far as I can tell, this bug has existed for a long time -- maybe
> > back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS macros
> > ...") in April 1994! It requires a very unlucky sequence of events, and
> > in fact we didn't hit it until a particular sparse mmap workload updated
> > from 5.12 to 5.19. But this bug existed in 5.12, so it must've been
> > exposed by some other change in allocation or writeback patterns. It's
> > also much less likely to be hit with the rmapbt enabled, since that
> > increases the minimum free list size and is unlikely to split at the
> > same time as the bnobt and cntbt.
> > 
> > Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> > ---
> > Changes since v1 [1]:
> > 
> > - Updated calculation to account for internal nodes that may also need
> >   to be split.
> > - Updated comments and commit message accordingly.
> > 
> > 1: https://lore.kernel.org/linux-xfs/ZTrSiwF7OAq0hJHn@xxxxxxxxxxxxxxxxxxx/T/
> > 
> >  fs/xfs/libxfs/xfs_alloc.c | 25 ++++++++++++++++++++++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index 3069194527dd..3949c6ad0cce 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -2275,16 +2275,35 @@ xfs_alloc_min_freelist(
> >  
> >  	ASSERT(mp->m_alloc_maxlevels > 0);
> >  
> > +	/*
> > +	 * For a btree shorter than the maximum height, the worst case is that
> > +	 * every level gets split and a new level is added, then while inserting
> > +	 * another entry to refill the AGFL, every level under the old root gets
> > +	 * split again. This is:
> > +	 *
> > +	 *   (current height + 1) + (current height - 1)
> 
> Could you make this comment define this relationship more explicitly?
> 
> 	 *   (full height split reservation) + (AGFL refill split height)
> 	 *   (current height + 1) + (current height - 1)
> 
> With that added,
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>

Yup, with that comment change I'm happy with it, too.

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux