On Wed, Jan 10, 2024 at 09:45:01AM -0800, Darrick J. Wong wrote: > On Wed, Jan 10, 2024 at 03:13:47PM +0800, Jian Wen wrote: > > From: Jian Wen <wenjianhn@xxxxxxxxx> > > > > Deleting a file with lots of extents may cause a soft lockup if the > > preemption model is none(CONFIG_PREEMPT_NONE=y or preempt=none is set > > in the kernel cmdline). Alibaba cloud kernel and Oracle UEK container > > kernel are affected by the issue, since they select CONFIG_PREEMPT_NONE=y. > > > > Explicitly call cond_resched in xfs_itruncate_extents_flags avoid > > the below softlockup warning. > > watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:13:139] > > Wowwee, how many extents does this file have, anyway? IIRC, extent removal runs at about 50,000 extents/s on a typical modern CPU on a production kernel build when everything is cached. 23s then equates to a bit above a million extents.... > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > > index c0f1c89786c2..194381e10472 100644 > > --- a/fs/xfs/xfs_inode.c > > +++ b/fs/xfs/xfs_inode.c > > @@ -4,6 +4,7 @@ > > * All Rights Reserved. > > */ > > #include <linux/iversion.h> > > +#include <linux/sched.h> > > > > #include "xfs.h" > > #include "xfs_fs.h" > > @@ -1383,6 +1384,8 @@ xfs_itruncate_extents_flags( > > error = xfs_defer_finish(&tp); > > if (error) > > goto out; > > + > > + cond_resched(); > > Is there a particular reason why truncation uses a giant chained > transaction for all extents (with xfs_defer_finish) instead of, say, one > chain per extent? IIUC the question correctly, extent removal for a given range ibeing truncated has to been seen externally as an atomic change so the ILOCK has to held across the entire removal operation. This means it has to be executed as a single rolling transaction as transaction reservation cannot be done with the ILOCK held and the atomicity of extent removal prevents the ILOCK from being dropped. This atomicity only really matters for active operations like truncate, hole punching, etc when other operations may also be trying to act on the extent list. However, in inodegc, we are essentially guaranteed that nobody else will need to access the extent list whilst we are truncating them away, so it could be done as a bunch of fine grained transactions. However, this doesn't solve the problem the CPU never being yeilded if everything is cached and no locks or transaction reservations are ever contended. It will still hold the CPU for the same length of time. Hence I'm not sure changing the way the transactions are run will really change anything material w.r.t. the issue at hand. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx