[cc Thomas, lkml]

On Thu, Jan 11, 2024 at 08:52:22PM +0800, Jian Wen wrote:
> On Thu, Jan 11, 2024 at 5:38 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Jan 10, 2024 at 03:13:47PM +0800, Jian Wen wrote:
> > > From: Jian Wen <wenjianhn@xxxxxxxxx>
> > >
> > > Deleting a file with lots of extents may cause a soft lockup if the
> > > preemption model is none(CONFIG_PREEMPT_NONE=y or preempt=none is set
> > > in the kernel cmdline). Alibaba cloud kernel and Oracle UEK container
> > > kernel are affected by the issue, since they select CONFIG_PREEMPT_NONE=y.
> >
> > Time for them to move to CONFIG_PREEMPT_DYNAMIC?
> I had asked one of them to support CONFIG_PREEMPT_DYNAMIC before
> sending the patch.

OK.

> > Also there has been recent action towards removing
> > CONFIG_PREEMPT_NONE/VOLUNTARY and cond_resched() altogether because
> > the lazy preemption model coming present in the RTPREEMPT patchset
> > solves the performance issues with full preemption that PREEMPT_NONE
> > works around...
> >
> > https://lwn.net/Articles/944686/
> > https://lwn.net/Articles/945422/
> >
> > Further, Thomas Gleixner has stated in those discussions that:
> >
> >       "Though definitely I'm putting a permanent NAK in place for
> >        any attempts to duct tape the preempt=NONE model any
> >        further by sprinkling more cond*() and whatever warts
> >        around."
> >
> > https://lwn.net/ml/linux-kernel/87jzshhexi.ffs@tglx/
> >
> > > Explicitly call cond_resched in xfs_itruncate_extents_flags avoid
> > > the below softlockup warning.
> >
> > IOWs, this is no longer considered an acceptible solution by core
> > kernel maintainers.
> Understood. I will only build a hotfix for our production kernel then.

Yeah, that may be your best short term fix. We'll need to clarify
what the current policy is on adding cond_resched points before we
go any further in this direction.

Thomas, any update on what is happening with cond_resched() - is
there an ETA on it going away/being unnecessary?

> > Regardless of these policy issues, the code change:
> >
> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > index c0f1c89786c2..194381e10472 100644
> > > --- a/fs/xfs/xfs_inode.c
> > > +++ b/fs/xfs/xfs_inode.c
> > > @@ -4,6 +4,7 @@
> > >   * All Rights Reserved.
> > >   */
> > >  #include <linux/iversion.h>
> > > +#include <linux/sched.h>
> >
> > Global includes like this go in fs/xfs/xfs_linux.h, but I don't
> > think that's even necessary because we have cond_resched() calls
> > elsewhere in XFS with the same include list as xfs_inode.c...
> >
> > >  #include "xfs.h"
> > >  #include "xfs_fs.h"
> > > @@ -1383,6 +1384,8 @@ xfs_itruncate_extents_flags(
> > >              error = xfs_defer_finish(&tp);
> > >              if (error)
> > >                      goto out;
> > > +
> > > +            cond_resched();
> > >      }
> >
> > Shouldn't this go in xfs_defer_finish() so that we capture all the
> > cases where we loop indefinitely over a range continually rolling a
> > permanent transaction via xfs_defer_finish()?
> It seems xfs_collapse_file_space and xfs_insert_file_space also need
> to yield CPU.
> I don't have use cases for them yet.

Yup, they do, but they also call xfs_defer_finish(), so having the
cond_resched() in that function will capture them as well.

Also, the current upstream tree has moved this code from
xfs_itruncate_extents_flags() to xfs_bunmapi_range(), so the
cond_resched() has to be moved, anyway. We may as well put it in
xfs_defer_finish() if we end up doing this.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
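
To illustrate the placement argument above, here is a rough userspace
sketch, not kernel code and not a proposed patch. It assumes
sched_yield() as a stand-in for cond_resched(), and every model_* name
and the yield_points counter are hypothetical. The point it tries to
show is only that one yield point inside the shared "finish" helper
covers every caller that loops over it, so the individual loops need no
yield of their own.

```c
#include <assert.h>
#include <sched.h>   /* sched_yield(): userspace stand-in for cond_resched() */
#include <stddef.h>

/* Hypothetical counter: how many times the helper offered to give up the CPU. */
static size_t yield_points;

/* Stand-in for xfs_defer_finish(): finish one batch, then offer to yield. */
static int model_defer_finish(void)
{
	/* ... deferred work for this batch would be completed here ... */
	sched_yield();
	yield_points++;
	return 0;
}

/* Stand-in for a truncate-style loop that removes extents in batches. */
static int model_truncate(size_t nr_extents, size_t batch)
{
	assert(batch > 0);

	while (nr_extents > 0) {
		size_t done = nr_extents < batch ? nr_extents : batch;
		int error;

		nr_extents -= done;

		/* No explicit yield here: the helper already provides one. */
		error = model_defer_finish();
		if (error)
			return error;
	}
	return 0;
}

/* Stand-in for a collapse/insert-range style loop, one extent per pass. */
static int model_shift(size_t nr_extents)
{
	size_t i;

	for (i = 0; i < nr_extents; i++) {
		int error = model_defer_finish();

		if (error)
			return error;
	}
	return 0;
}
```

In this model, calling model_truncate(10, 2) passes through the yield
point five times and model_shift(3) three more, although neither loop
contains a yield itself. Whether the real cond_resched() should be added
at all is the policy question raised earlier in the thread.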