On Fri, Apr 22, 2022 at 03:18:20PM -0700, Darrick J. Wong wrote:
> On Sat, Apr 23, 2022 at 08:01:20AM +1000, Dave Chinner wrote:
> > On Thu, Apr 14, 2022 at 03:54:31PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > >
> > > In commit e1a4e37cc7b6, we clamped the length of bunmapi calls on the
> > > data forks of shared files to avoid two failure scenarios: one where the
> > > extent being unmapped is so sparsely shared that we exceed the
> > > transaction reservation with the sheer number of refcount btree updates
> > > and EFI intent items; and the other where we attach so many deferred
> > > updates to the transaction that we pin the log tail and later the log
> > > head meets the tail, causing the log to livelock.
> > >
> > > We avoid triggering the first problem by tracking the number of ops in
> > > the refcount btree cursor and forcing a requeue of the refcount intent
> > > item any time we think that we might be close to overflowing. This has
> > > been baked into XFS since before the original e1a4 patch.
> > >
> > > A recent patchset fixed the second problem by changing the deferred ops
> > > code to finish all the work items created by each round of trying to
> > > complete a refcount intent item, which eliminates the long chains of
> > > deferred items (27dad); and causing long-running transactions to relog
> > > their intent log items when space in the log gets low (74f4d).
> > >
> > > Because this clamp affects /any/ unmapping request regardless of the
> > > sharing factors of the component blocks, it degrades the performance of
> > > all large unmapping requests -- whereas with an unshared file we can
> > > unmap millions of blocks in one go, shared files are limited to
> > > unmapping a few thousand blocks at a time, which causes the upper level
> > > code to spin in a bunmapi loop even if it wasn't needed.
> > >
> > > This also eliminates one more place where log recovery behavior can
> > > differ from online behavior, because bunmapi operations no longer need
> > > to requeue.
> > >
> > > Partial-revert-of: e1a4e37cc7b6 ("xfs: try to avoid blowing out the transaction reservation when bunmaping a shared extent")
> > > Depends: 27dada070d59 ("xfs: change the order in which child and parent defer ops ar finished")
> > > Depends: 74f4d6a1e065 ("xfs: only relog deferred intent items if free space in the log gets low")
> > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > ---
> > >  fs/xfs/libxfs/xfs_bmap.c     |   22 +---------------------
> > >  fs/xfs/libxfs/xfs_refcount.c |    5 ++---
> > >  fs/xfs/libxfs/xfs_refcount.h |    8 ++------
> > >  3 files changed, 5 insertions(+), 30 deletions(-)
> >
> > This looks reasonable, but I'm wondering how the original problem
> > was discovered and whether this has been tested against that
> > original problem situation to ensure we aren't introducing a
> > regression here....
>
> generic/447, and yes, I have forced it to run a deletion of 1 million
> extents without incident. :)

Ok, that's all I wanted to know :)

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

-- 
Dave Chinner
david@xxxxxxxxxxxxx