Re: [PATCH 18/30] xfs: remove IO submission from xfs_reclaim_inode()

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 5 Jun 2020 08:53:42 +1000

On Thu, Jun 04, 2020 at 02:08:14PM -0400, Brian Foster wrote:
> On Thu, Jun 04, 2020 at 05:45:54PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > We no longer need to issue IO from shrinker based inode reclaim to
> > prevent spurious OOM killer invocation. This leaves only the global
> > filesystem management operations such as unmount needing to
> > writeback dirty inodes and reclaim them.
> > 
> > Instead of using the reclaim pass to write dirty inodes before
> > reclaiming them, use the AIL to push all the dirty inodes before we
> > try to reclaim them. This allows us to remove all the conditional
> > SYNC_WAIT locking and the writeback code from xfs_reclaim_inode()
> > and greatly simplify the checks we need to do to reclaim an inode.
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_icache.c | 117 ++++++++++++--------------------------------
> >  1 file changed, 31 insertions(+), 86 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index a6780942034fc..74032316ce5cc 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> ...
> > @@ -1341,9 +1288,8 @@ xfs_reclaim_inodes_ag(
> >  			for (i = 0; i < nr_found; i++) {
> >  				if (!batch[i])
> >  					continue;
> > -				error = xfs_reclaim_inode(batch[i], pag, flags);
> > -				if (error && last_error != -EFSCORRUPTED)
> > -					last_error = error;
> > +				if (!xfs_reclaim_inode(batch[i], pag, flags))
> > +					skipped++;
> 
> Just a note that I find it a little bit of a landmine that skipped is
> bumped on trylock failure of the perag reclaim lock when the
> xfs_reclaim_inodes() caller can now spin on that.

Intentional, because without bumping skipped on perag reclaim lock
failure we can silently skip entire AGs when doing blocking reclaim
and xfs_reclaim_inodes() fails to reclaim all inodes in the cache.

It's only necessary to work around fatal bugs this patch exposes
for the brief period that this infrastructure is being torn down by
this patchset....

> It doesn't appear to
> be an issue with current users, though (xfs_reclaim_workers() passes
> SYNC_TRYLOCK but not SYNC_WAIT).

xfs_reclaim_workers() is optimisitic, background reclaim, so we
just don't care if it skips over things. We just don't want it to
block.

> >  			}
> >  
> >  			*nr_to_scan -= XFS_LOOKUP_BATCH;
> ...
> > @@ -1380,8 +1314,18 @@ xfs_reclaim_inodes(
> >  	int		mode)
> >  {
> >  	int		nr_to_scan = INT_MAX;
> > +	int		skipped;
> >  
> > -	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
> > +	xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
> > +	if (!(mode & SYNC_WAIT))
> > +		return 0;
> > +
> 
> Any reason we fall into the loop below if SYNC_WAIT was passed but the
> above xfs_reclaim_inodes_ag() call would have returned 0?

Same thing again. It's temporary to maintain correctness while one
thing at time is removed from the reclaim code. This code goes away
in the same patch that makes SYNC_WAIT go away.

> Looks reasonable other than that inefficiency:
> 
> Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>

Thanks!

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx