On Wed, 2010-09-22 at 16:44 +1000, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > Memory reclaim via shrinkers has a terrible habit of having N+M > concurrent shrinker executions (N = num CPUs, M = num kswapds) all > trying to shrink the same cache. When the cache they are all working > on is protected by a single spinlock, massive contention an > slowdowns occur. > > Wrap the per-ag inode caches with a reclaim mutex to serialise > reclaim access to the AG. This will block concurrent reclaim in each > AG but still allow reclaim to scan multiple AGs concurrently. Allow > shrinkers to move on to the next AG if it can't get the lock, and if > we can't get any AG, then start blocking on locks. > > To prevent reclaimers from continually scanning the same inodes in > each AG, add a cursor that tracks where the last reclaim got up to > and start from that point on the next reclaim. This should avoid > only ever scanning a small number of inodes at the satart of each AG > and not making progress. If we have a non-shrinker based reclaim > pass, ignore the cursor and reset it to zero once we are done. > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> One small comment about the implied meaning of "trylock" below. But not a big deal, so... Reviewed-by: Alex Elder <aelder@xxxxxxx> > --- > fs/xfs/linux-2.6/xfs_sync.c | 24 ++++++++++++++++++++++++ > fs/xfs/xfs_ag.h | 2 ++ > fs/xfs/xfs_mount.c | 1 + > 3 files changed, 27 insertions(+), 0 deletions(-) > > diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c > index ea44b1d..7b06399 100644 > --- a/fs/xfs/linux-2.6/xfs_sync.c > +++ b/fs/xfs/linux-2.6/xfs_sync.c . . . > @@ -840,6 +842,17 @@ xfs_reclaim_inodes_ag( > > ag = pag->pag_agno + 1; > > + if (!mutex_trylock(&pag->pag_ici_reclaim_lock)) { > + if (trylock) { > + trylock++; > + continue; > + } > + mutex_lock(&pag->pag_ici_reclaim_lock); > + } > + It isn't all that obvious here that "trylock" also carries the meaning "called via the inode shrinker", which is why we're using the cursor in this case. > + if (trylock) > + first_index = pag->pag_ici_reclaim_cursor; > + > do { > struct xfs_inode *batch[XFS_LOOKUP_BATCH]; > int i; _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs