[RFC PATCH 0/4] bringing back the AGFL reserve

Hi,
One of the teams that I work with hits WARNs in
xfs_bmap_extents_to_btree() on a database workload of theirs.  The last
time the subject came up on linux-xfs, it was suggested[1] to try
building an AG reserve pool for the AGFL.

I managed to work out a reproducer for the problem.  Debugging that, the
steps Gao outlined turned out to be essentially what was necessary to
get the problem to happen repeatably.

1. Allocate almost all of the space in an AG
2. Free and reallocate that space to fragment it so the freespace
b-trees are just about to split.
3. Allocate blocks in a file such that the next extent allocated for
that file will cause its bmbt to get converted from an inline extent to
a b-tree.
4. Free space such that the free-space btrees have a contiguous extent
with a busy portion on either end
5. Allocate the portion in the middle, splitting the extent and
triggering a b-tree split.

On older kernels this is all it takes.  After the AG-aware allocator
changes I also need to start the allocation in the highest numbered AG
available while inducing lock contention in the lower numbered AGs.

In order to ensure that AGs have enough space to complete transactions
with multiple allocations, I've taken a stab at implementing an AGFL
reserve pool.

This patchset passes fstests without any regressions and also does not
trigger the reproducers I wrote for the case above.  I've also run those
with tracing enabled to validate that it's got the accounting correct,
is rejecting allocations when there's no space in the reserve, and is
tapping the reserve when appropriate.

The first patch is the plumbing that re-establishes the reserve for the
AGFL.  I'm happy to break this into something smaller, if it's too
large.  The remaining patches add additional pieces needed to check how
much space the AGFL might need on a refill, and then to actually use the
reserve to permit or deny allocation requests, as the case may be.
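
To make the permit-or-deny idea concrete, here is a toy userspace model.
It is only a sketch and not the code in these patches: struct toy_ag,
toy_alloc_ok() and toy_alloc() are made-up names, and the policy shown
(ordinary allocations may not dip into blocks held back for the AGFL,
while an AGFL refill may) is an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one AG. Hypothetical; not the kernel's struct xfs_perag. */
struct toy_ag {
	unsigned int free_blocks;	/* blocks currently free in the AG */
	unsigned int agfl_reserve;	/* blocks held back for AGFL refills */
};

/*
 * Decide whether an allocation of 'len' blocks may proceed.
 * Assumption: only AGFL refills are allowed to consume the reserve.
 */
static bool toy_alloc_ok(const struct toy_ag *ag, unsigned int len,
			 bool is_agfl_refill)
{
	unsigned int avail = ag->free_blocks;

	if (!is_agfl_refill) {
		/* Ordinary allocations only see space above the reserve. */
		if (avail <= ag->agfl_reserve)
			return false;
		avail -= ag->agfl_reserve;
	}
	return len <= avail;
}

/* Perform the allocation and update the accounting if it is permitted. */
static bool toy_alloc(struct toy_ag *ag, unsigned int len,
		      bool is_agfl_refill)
{
	assert(ag != NULL);

	if (!toy_alloc_ok(ag, len, is_agfl_refill))
		return false;

	ag->free_blocks -= len;
	if (is_agfl_refill) {
		/* A refill draws the reserve down, never below zero. */
		unsigned int used = len < ag->agfl_reserve ?
				    len : ag->agfl_reserve;
		ag->agfl_reserve -= used;
	}
	return true;
}
```

With 10 free blocks and a reserve of 4, an ordinary request for 6 blocks
succeeds in this model, the next ordinary request is refused, and a
refill of 3 blocks is still allowed to go through.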

I'm sending this as an RFC, since I still have a few outstanding
questions and would appreciate feedback.

Some of those questions are:

Patch 1 includes all freespace that is not allocated to the rmapbt in
its used / reserved accounting.  It also borrows the heuristics from
rmapbt in terms of picking the initial size of the reservation.  The
numbers I'm getting seem a bit large.  Any suggestions about how to
improve this further?

Patches 3 and 4 use the allocation args structure to attempt to decide
whether an allocation is the first in a transaction, or if it's a
subsequent allocation.  Are there any recommendations about a better way
to do this?

Thanks,

-K


[1] https://lore.kernel.org/linux-xfs/20221116025106.GB3600936@xxxxxxxxxxxxxxxxxxx/


Krister Johansen (4):
  xfs: resurrect the AGFL reservation
  xfs: modify xfs_alloc_min_freelist to take an increment
  xfs: let allocations tap the AGFL reserve
  xfs: refuse allocations without agfl refill space

 fs/xfs/libxfs/xfs_ag.h          |  2 +
 fs/xfs/libxfs/xfs_ag_resv.c     | 54 ++++++++++++++-----
 fs/xfs/libxfs/xfs_ag_resv.h     |  4 ++
 fs/xfs/libxfs/xfs_alloc.c       | 94 +++++++++++++++++++++++++++++----
 fs/xfs/libxfs/xfs_alloc.h       |  5 +-
 fs/xfs/libxfs/xfs_alloc_btree.c | 59 +++++++++++++++++++++
 fs/xfs/libxfs/xfs_alloc_btree.h |  5 ++
 fs/xfs/libxfs/xfs_bmap.c        |  2 +-
 fs/xfs/libxfs/xfs_ialloc.c      |  2 +-
 fs/xfs/libxfs/xfs_rmap_btree.c  |  5 ++
 fs/xfs/scrub/fscounters.c       |  1 +
 11 files changed, 207 insertions(+), 26 deletions(-)


base-commit: 58f880711f2ba53fd5e959875aff5b3bf6d5c32e
-- 
2.25.1




