Re: [PATCH 18/30] xfs: remove IO submission from xfs_reclaim_inode()

Brian Foster <bfoster@xxxxxxxxxx> · Fri, 5 Jun 2020 12:25:44 -0400

On Fri, Jun 05, 2020 at 08:53:42AM +1000, Dave Chinner wrote:
> On Thu, Jun 04, 2020 at 02:08:14PM -0400, Brian Foster wrote:
> > On Thu, Jun 04, 2020 at 05:45:54PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > 
> > > We no longer need to issue IO from shrinker based inode reclaim to
> > > prevent spurious OOM killer invocation. This leaves only the global
> > > filesystem management operations such as unmount needing to
> > > writeback dirty inodes and reclaim them.
> > > 
> > > Instead of using the reclaim pass to write dirty inodes before
> > > reclaiming them, use the AIL to push all the dirty inodes before we
> > > try to reclaim them. This allows us to remove all the conditional
> > > SYNC_WAIT locking and the writeback code from xfs_reclaim_inode()
> > > and greatly simplify the checks we need to do to reclaim an inode.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > > Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > ---
> > >  fs/xfs/xfs_icache.c | 117 ++++++++++++--------------------------------
> > >  1 file changed, 31 insertions(+), 86 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > index a6780942034fc..74032316ce5cc 100644
> > > --- a/fs/xfs/xfs_icache.c
> > > +++ b/fs/xfs/xfs_icache.c
> > ...
> > > @@ -1341,9 +1288,8 @@ xfs_reclaim_inodes_ag(
> > >  			for (i = 0; i < nr_found; i++) {
> > >  				if (!batch[i])
> > >  					continue;
> > > -				error = xfs_reclaim_inode(batch[i], pag, flags);
> > > -				if (error && last_error != -EFSCORRUPTED)
> > > -					last_error = error;
> > > +				if (!xfs_reclaim_inode(batch[i], pag, flags))
> > > +					skipped++;
> > 
> > Just a note that I find it a little bit of a landmine that skipped is
> > bumped on trylock failure of the perag reclaim lock when the
> > xfs_reclaim_inodes() caller can now spin on that.
> 
> Intentional, because without bumping skipped on perag reclaim lock
> failure we can silently skip entire AGs when doing blocking reclaim
> and xfs_reclaim_inodes() fails to reclaim all inodes in the cache.
> 
> It's only necessary to work around fatal bugs this patch exposes
> for the brief period that this infrastructure is being torn down by
> this patchset....
> 
> > It doesn't appear to
> > be an issue with current users, though (xfs_reclaim_workers() passes
> > SYNC_TRYLOCK but not SYNC_WAIT).
> 
> xfs_reclaim_workers() is optimisitic, background reclaim, so we
> just don't care if it skips over things. We just don't want it to
> block.
> 

Sure, I'm just warning that this puts in place infrastructure that is
easily possible to misuse in a way that could lead to livelocks. If it's
torn done by the end of the series then it's not a problem. If not, we
should probably consider hardening it somehow or another after the
series so we don't leave ourselves a landmine to step on.

Brian

> > >  			}
> > >  
> > >  			*nr_to_scan -= XFS_LOOKUP_BATCH;
> > ...
> > > @@ -1380,8 +1314,18 @@ xfs_reclaim_inodes(
> > >  	int		mode)
> > >  {
> > >  	int		nr_to_scan = INT_MAX;
> > > +	int		skipped;
> > >  
> > > -	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
> > > +	xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
> > > +	if (!(mode & SYNC_WAIT))
> > > +		return 0;
> > > +
> > 
> > Any reason we fall into the loop below if SYNC_WAIT was passed but the
> > above xfs_reclaim_inodes_ag() call would have returned 0?
> 
> Same thing again. It's temporary to maintain correctness while one
> thing at time is removed from the reclaim code. This code goes away
> in the same patch that makes SYNC_WAIT go away.
> 
> > Looks reasonable other than that inefficiency:
> > 
> > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>
> 
> Thanks!
> 
> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
>