On Fri, Jun 05, 2020 at 08:53:42AM +1000, Dave Chinner wrote: > On Thu, Jun 04, 2020 at 02:08:14PM -0400, Brian Foster wrote: > > On Thu, Jun 04, 2020 at 05:45:54PM +1000, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > We no longer need to issue IO from shrinker based inode reclaim to > > > prevent spurious OOM killer invocation. This leaves only the global > > > filesystem management operations such as unmount needing to > > > writeback dirty inodes and reclaim them. > > > > > > Instead of using the reclaim pass to write dirty inodes before > > > reclaiming them, use the AIL to push all the dirty inodes before we > > > try to reclaim them. This allows us to remove all the conditional > > > SYNC_WAIT locking and the writeback code from xfs_reclaim_inode() > > > and greatly simplify the checks we need to do to reclaim an inode. > > > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > > Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > --- > > > fs/xfs/xfs_icache.c | 117 ++++++++++++-------------------------------- > > > 1 file changed, 31 insertions(+), 86 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c > > > index a6780942034fc..74032316ce5cc 100644 > > > --- a/fs/xfs/xfs_icache.c > > > +++ b/fs/xfs/xfs_icache.c > > ... > > > @@ -1341,9 +1288,8 @@ xfs_reclaim_inodes_ag( > > > for (i = 0; i < nr_found; i++) { > > > if (!batch[i]) > > > continue; > > > - error = xfs_reclaim_inode(batch[i], pag, flags); > > > - if (error && last_error != -EFSCORRUPTED) > > > - last_error = error; > > > + if (!xfs_reclaim_inode(batch[i], pag, flags)) > > > + skipped++; > > > > Just a note that I find it a little bit of a landmine that skipped is > > bumped on trylock failure of the perag reclaim lock when the > > xfs_reclaim_inodes() caller can now spin on that. > > Intentional, because without bumping skipped on perag reclaim lock > failure we can silently skip entire AGs when doing blocking reclaim > and xfs_reclaim_inodes() fails to reclaim all inodes in the cache. > > It's only necessary to work around fatal bugs this patch exposes > for the brief period that this infrastructure is being torn down by > this patchset.... > > > It doesn't appear to > > be an issue with current users, though (xfs_reclaim_workers() passes > > SYNC_TRYLOCK but not SYNC_WAIT). > > xfs_reclaim_workers() is optimisitic, background reclaim, so we > just don't care if it skips over things. We just don't want it to > block. > Sure, I'm just warning that this puts in place infrastructure that is easily possible to misuse in a way that could lead to livelocks. If it's torn done by the end of the series then it's not a problem. If not, we should probably consider hardening it somehow or another after the series so we don't leave ourselves a landmine to step on. Brian > > > } > > > > > > *nr_to_scan -= XFS_LOOKUP_BATCH; > > ... > > > @@ -1380,8 +1314,18 @@ xfs_reclaim_inodes( > > > int mode) > > > { > > > int nr_to_scan = INT_MAX; > > > + int skipped; > > > > > > - return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan); > > > + xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan); > > > + if (!(mode & SYNC_WAIT)) > > > + return 0; > > > + > > > > Any reason we fall into the loop below if SYNC_WAIT was passed but the > > above xfs_reclaim_inodes_ag() call would have returned 0? > > Same thing again. It's temporary to maintain correctness while one > thing at time is removed from the reclaim code. This code goes away > in the same patch that makes SYNC_WAIT go away. > > > Looks reasonable other than that inefficiency: > > > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > > Thanks! > > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >