Re: [PATCH] xfs: fix internal error from AGFL exhaustion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 25, 2023 at 08:21:02PM -0700, Darrick J. Wong wrote:
> On Thu, Oct 26, 2023 at 10:37:39AM +1100, Dave Chinner wrote:
> > On Wed, Oct 25, 2023 at 03:39:09PM -0700, Darrick J. Wong wrote:
> > > On Wed, Oct 25, 2023 at 01:03:00PM -0700, Omar Sandoval wrote:
> > > > On Wed, Oct 25, 2023 at 08:50:46AM -0700, Darrick J. Wong wrote:
> > > > > The absolute maximum height of a free space btree is 7 blocks, according
> > > > > to xfs_db:
> > > > > 
> > > > > # xfs_db /dev/sda -c 'btheight -w absmax all'
> > > > > bnobt: 7
> > > > > cntbt: 7
> > > > > inobt: 6
> > > > > finobt: 6
> > > > > bmapbt: 14
> > > > > refcountbt: 6
> > > > > rmapbt: 10
> > > > > 
> > > > > The smallest AGFL is 62 records long (V4 fs, 512b blocks) so I don't
> > > > > think it's /that/ drastic.  For a filesystem with rmapbt (V5, 1k blocks)
> > > > > that minimum jumps to 121 blocks.
> > 
> > Yup. keep in mind this comment in xfs_alloc_fix_freelist():
> > 
> > 	 * XXX (dgc): When we have lots of free space, does this buy us
> >          * anything other than extra overhead when we need to put more blocks
> >          * back on the free list? Maybe we should only do this when space is
> >          * getting low or the AGFL is more than half full?
> > 
> > IOWs, I've previously mused about keeping the AGFL much fuller than
> > the absolute minimum. There doesn't seem to be any real gain to
> > keeping it at just the minimum size when we are nowhere near ENOSPC,
> > it just means we are doing lots of management operations like
> > reducing it immediately after a transaction that has released blocks
> > to the freelist, then only to have to extend it again when we do an
> > operation that consumes blocks from the free list.
> > 
> > i.e. should we add also hysteresis to limit the number of AGFL
> > size manipulations we need to do in mixed workloads? i.e. we don't free
> > blocks from the AGFL until it might get overfilled by an operation
> > (i.e. max size - max merge free blocks) or we are nearing ENOSPC,
> > and we don't fill till we are at the minimum. In either case, the
> > target for empty or fill operations is "half full"...
> 
> Hmm.   What if we kept the AGFL completely full?

It can't be kept that way - we have to have space to store blocks
that are released by tree merges. i.e. like a tree split requires a
minimum number blocks in the free list for it to succeed, a tree
merge needs a minimum number of free slots in the AGFL for it to
succeed.

> On that 512b-fsblock V4 filesystem, that'd use up 128 blocks instead of 7
> blocks.  128 blocks == 64K per AG.
> 
> On a 64K-fsblock V5 filesystem, that'd use up 16370 64K blocks per AG, or
> ~1GB per AG.  Ok, that might be too much.
> 
> On a 4K-fsblock V5 fs, that's 1010 4K blocks per AG, or ~4MB per AG.
> 
> A gigabyte is a bit ridiculous, but what if we always did the maximum
> bnobt/cntbt/rmapbt height instead of current_height + 1?

Yes, largely full AGFLs all the time doesn't make a whole lot of
sense, but just a max height reservation still doesn't cover the
case exposed by this fix. It would need to be 2 * max height
to cover the double split case we are talking about here.

Also, if we change from the current height to always use the
maximum, it re-introduces a bunch of ENOSPC issues resulting from
overcommitting user data space. That is, we must ensure that global
free space does not exhaust the free space that we need to refill
the AGFLs to enable an extent free at ENOSPC to proceed. i.e.
the mp->m_alloc_set_aside value (i.e. XFS_ALLOCBT_AGFL_RESERVE)
needs to reflect the fact this would change the minimum AGFL fill at
ENOSPC.

Further, changing to max reservation for the minimum fill fails to
recognoise that at ENOSPC, the free space trees are empty and so
will never split, and so a large minimum AGFL fill is completely
unnecessary - at most all we need is to have enough blocks in the
AGFL for the single level root split to occur (i.e. 2 blocks per
free space btree). That's waht the current XFS_ALLOCBT_AGFL_RESERVE
value is set to.

So, really, I don't see moving away from the variable minimum
reservation we currently have is a good idea. Adding a bit of
hysteresis to the AGFL fill/drain operations doesn't require
changing the min fill requirements, just picking a sensible target
above the current minimum fill...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux