On Wed, Mar 10, 2021 at 07:06:41PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> Create a polled version of xfs_inactive_force so that we can force
> inactivation while holding a lock (usually the umount lock) without
> tripping over the softlockup timer.  This is for callers that hold vfs
> locks while calling inactivation, which is currently unmount, iunlink
> processing during mount, and rw->ro remount.
> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_icache.c |   38 +++++++++++++++++++++++++++++++++++++-
>  fs/xfs/xfs_icache.h |    1 +
>  fs/xfs/xfs_mount.c  |    2 +-
>  fs/xfs/xfs_mount.h  |    5 +++++
>  fs/xfs/xfs_super.c  |    3 ++-
>  5 files changed, 46 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index d5f580b92e48..9db2beb4e732 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -25,6 +25,7 @@
>  #include "xfs_ialloc.h"
>  
>  #include <linux/iversion.h>
> +#include <linux/nmi.h>

This stuff goes in fs/xfs/xfs_linux.h, not here.

> 
>  /*
>   * Allocate and initialise an xfs_inode.
> @@ -2067,8 +2068,12 @@ xfs_inodegc_free_space(
>  	struct xfs_mount	*mp,
>  	struct xfs_eofblocks	*eofb)
>  {
> -	return xfs_inode_walk(mp, XFS_INODE_WALK_INACTIVE,
> +	int			error;
> +
> +	error = xfs_inode_walk(mp, XFS_INODE_WALK_INACTIVE,
>  			xfs_inactive_inode, eofb, XFS_ICI_INACTIVE_TAG);
> +	wake_up(&mp->m_inactive_wait);
> +	return error;
>  }
>  
>  /* Try to get inode inactivation moving. */
> @@ -2138,6 +2143,37 @@ xfs_inodegc_force(
>  	flush_workqueue(mp->m_gc_workqueue);
>  }
>  
> +/*
> + * Force all inode inactivation work to run immediately, and poll until the
> + * work is complete.  Callers should only use this function if they must
> + * inactivate inodes while holding VFS locks, and must be prepared to prevent
> + * or to wait for inodes that are queued for inactivation while this runs.
> + */
> +void
> +xfs_inodegc_force_poll(
> +	struct xfs_mount	*mp)
> +{
> +	struct xfs_perag	*pag;
> +	xfs_agnumber_t		agno;
> +	bool			queued = false;
> +
> +	for_each_perag_tag(mp, agno, pag, XFS_ICI_INACTIVE_TAG)
> +		queued |= xfs_inodegc_force_pag(pag);
> +	if (!queued)
> +		return;
> +
> +	/*
> +	 * Touch the softlockup watchdog every 1/10th of a second while there
> +	 * are still inactivation-tagged inodes in the filesystem.
> +	 */
> +	while (!wait_event_timeout(mp->m_inactive_wait,
> +			!radix_tree_tagged(&mp->m_perag_tree,
> +					   XFS_ICI_INACTIVE_TAG),
> +			HZ / 10)) {
> +		touch_softlockup_watchdog();
> +	}
> +}

This looks like a deadlock waiting to be tripped over. As long as
there is something still able to queue inodes for inactivation, that
radix tree tag check will always trigger and put us back to sleep.

Also, in terms of workqueues, this is a "sync flush" because we are
waiting for it. e.g. the difference between cancel_work() and
cancel_work_sync() is that the latter waits for all the work in
progress to complete before returning and the former doesn't wait...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
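
As a purely illustrative aside, not part of the original mail: the
livelock Dave describes can be reproduced with a small userspace C
sketch. Every name below is a hypothetical stand-in. The atomic counter
"queued" plays the role of the XFS_ICI_INACTIVE_TAG radix tree tag,
worker() plays the inactivation workqueue, and requeuer() plays any
task still able to queue inodes for inactivation. Because new work
arrives faster than the worker drains it, the poller's wakeup condition
never becomes true and the wait loop never terminates; the iteration
cap exists only so the demo exits. Build with: cc -pthread demo.c

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int queued = 1;	/* ~ count of inactivation-tagged AGs */
static atomic_bool quiesce;

/* ~ the inactivation workqueue: drains one item per millisecond. */
static void *worker(void *arg)
{
	while (!atomic_load(&quiesce)) {
		if (atomic_load(&queued) > 0)
			atomic_fetch_sub(&queued, 1);
		usleep(1000);
	}
	return NULL;
}

/* ~ a task still queueing inodes: adds two items per millisecond. */
static void *requeuer(void *arg)
{
	while (!atomic_load(&quiesce)) {
		atomic_fetch_add(&queued, 2);
		usleep(1000);
	}
	return NULL;
}

int main(void)
{
	pthread_t w, r;
	int polls = 0;

	pthread_create(&w, NULL, worker, NULL);
	pthread_create(&r, NULL, requeuer, NULL);

	/*
	 * ~ the wait_event_timeout() loop in xfs_inodegc_force_poll():
	 * sleep 100ms, recheck whether any work is still tagged, repeat.
	 * Work arrives faster than it drains, so the condition never
	 * holds; cap the iterations only so this demo terminates.
	 */
	while (atomic_load(&queued) > 0 && ++polls < 50) {
		/* touch_softlockup_watchdog() would go here in the kernel */
		usleep(100 * 1000);
	}

	printf("gave up after %d polls with %d items still queued\n",
			polls, atomic_load(&queued));
	atomic_store(&quiesce, true);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return 0;
}

Note the contrast with xfs_inodegc_force() in the quoted patch, which
uses flush_workqueue(), a sync flush in Dave's terms: it waits for the
work already queued at the time of the call and then returns, so it
cannot be held off indefinitely by work that arrives afterwards.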