Hi Chandan,

Can you please pull the changes to fstrim behaviour from the signed
tag below? This has been rebased on 6.6-rc4 so should merge cleanly
into a current tree.

Thanks,

Dave.

----------------------------------------------------------------

The following changes since commit 8a749fd1a8720d4619c91c8b6e7528c0a355c0aa:

  Linux 6.6-rc4 (2023-10-01 14:15:13 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs tags/xfs-fstrim-busy-tag

for you to fetch changes up to e78a40b851712b422d7d4ae345f25511d47a9a38:

  xfs: abort fstrim if kernel is suspending (2023-10-04 09:25:04 +1100)

----------------------------------------------------------------
xfs: reduce AGF hold times during fstrim operations

A recent log space overflow and recovery failure was root caused to
a long running truncate blocking on the AGF and ending up pinning
the tail of the log. The filesystem then hung, the machine was
rebooted, and log recovery then refused to run because there wasn't
enough space in the log for the EFI transaction reservation.

The reason the long running truncate got blocked on the AGF for so
long was that an fstrim was being run. The underlying block device
was large and very slow (10TB ceph rbd volume) and so discarding all
the free space in the AG took a really long time.

The current fstrim implementation holds the AGF across the entire
operation - both the free space scan and the issuing of all the
discards. The discards are synchronous and single depth, so if there
are millions of free spaces, we hold the AGF lock across millions of
discard operations.

It doesn't really need to be said that this is a Bad Thing.

This series reworks the fstrim discard path to use the same
mechanisms as online discard. This allows discards to be issued
asynchronously without holding the AGF locked, enabling higher
discard queue depths (much faster on fast devices) and only
requiring the AGF lock to be held whilst we are scanning free space.
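
As a rough illustration of the difference in lock hold times, here is
a toy userspace model in C. It is not the kernel code: every name in
it (toy_ag, toy_extent, toy_trim_locked, toy_trim_batched and so on)
is made up for this sketch, the "lock" is just a flag, and discards
are only counted rather than issued. The busy flag stands in for the
busy extent tracking that the series relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_extent {
	unsigned long start;	/* first block of the free extent */
	unsigned long len;	/* length in blocks */
	int busy;		/* 1 while "busy under discard" */
};

struct toy_ag {
	struct toy_extent *free;		/* free extents in this AG */
	size_t nr_free;
	int agf_locked;				/* 1 while the AGF "lock" is held */
	unsigned long discards_under_lock;	/* discards issued with AGF held */
	unsigned long discards_total;
};

/* The allocator must skip any extent that is busy under discard. */
static int toy_can_allocate(const struct toy_extent *ext)
{
	return !ext->busy;
}

/* Stand-in for issuing one discard; it only does the accounting. */
static void toy_discard(struct toy_ag *ag, struct toy_extent *ext)
{
	(void)ext;
	ag->discards_total++;
	if (ag->agf_locked)
		ag->discards_under_lock++;
}

/* Old scheme: scan and discard everything with the AGF held. */
static void toy_trim_locked(struct toy_ag *ag)
{
	ag->agf_locked = 1;
	for (size_t i = 0; i < ag->nr_free; i++)
		toy_discard(ag, &ag->free[i]);
	ag->agf_locked = 0;
}

/*
 * New scheme: for each batch, mark the extents busy under the lock,
 * drop the lock, then discard and clear the busy state.
 */
static void toy_trim_batched(struct toy_ag *ag, size_t batch)
{
	size_t next = 0;

	assert(batch > 0);
	while (next < ag->nr_free) {
		size_t end = next + batch;

		if (end > ag->nr_free)
			end = ag->nr_free;

		/* Short critical section: gather and mark only. */
		ag->agf_locked = 1;
		for (size_t i = next; i < end; i++)
			ag->free[i].busy = 1;
		ag->agf_locked = 0;

		/* Discards run unlocked; busy extents cannot be allocated. */
		for (size_t i = next; i < end; i++) {
			assert(!toy_can_allocate(&ag->free[i]));
			toy_discard(ag, &ag->free[i]);
			ag->free[i].busy = 0;	/* discard "completion" */
		}
		next = end;
	}
}
```

In this model toy_trim_locked() issues every discard with the lock
flag set, while toy_trim_batched() issues none under the lock and
only holds it for the marking loop of each batch.
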

To do this, we make use of busy extents - we lock the AGF, mark all
the extents we want to discard as "busy under discard" so that
nothing will be allowed to allocate them, and then drop the AGF
lock. We then issue discards on the gathered busy extents and on
discard completion remove them from the busy list.

This results in AGF lock hold times for fstrim dropping to a few
milliseconds for each batch of free extents we scan, and so the hours
long hold times that can currently occur on large, slow, badly
fragmented devices no longer occur.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

----------------------------------------------------------------
Dave Chinner (3):
      xfs: move log discard work to xfs_discard.c
      xfs: reduce AGF hold times during fstrim operations
      xfs: abort fstrim if kernel is suspending

 fs/xfs/xfs_discard.c     | 266 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 fs/xfs/xfs_discard.h     |   6 +-
 fs/xfs/xfs_extent_busy.c |  34 ++++++++--
 fs/xfs/xfs_extent_busy.h |  24 ++++++-
 fs/xfs/xfs_log_cil.c     |  93 ++++-----------------------
 fs/xfs/xfs_log_priv.h    |   5 +-
 6 files changed, 311 insertions(+), 117 deletions(-)

-- 
Dave Chinner
david@xxxxxxxxxxxxx