Re: [PATCH 1/5] xfs: track AGs with reclaimable inodes in per-ag radix tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2010-07-15 at 10:38 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=16348
> 
> When the filesystem grows to a large number of allocation groups,
> the summing of recalimable inodes gets expensive. In many cases,
> most AGs won't have any reclaimable inodes and so we are wasting CPU
> time aggregating over these AGs. This is particularly important for
> the inode shrinker that gets called frequently under memory
> pressure.
> 
> To avoid the overhead, track AGs with reclaimable inodes in the
> per-ag radix tree so that we can find all the AGs with reclaimable
> inodes via a simple gang tag lookup. This involves setting the tag
> when the first reclaimable inode is tracked in the AG, and removing
> the tag when the last reclaimable inode is removed from the tree.
> Then the summation process becomes a loop walking the radix tree
> summing AGs with the reclaim tag set.
> 
> This significantly reduces the overhead of scanning - a 6400 AG
> filesystea now only uses about 25% of a cpu in kswapd while slab reclaim
> progresses instead of being permanently stuck at 100% CPU and making little
> progress. Clean filesystems filesystems will see no overhead and the
> overhead only increases linearly with the number of dirty AGs.

Using the same tag represent a perag with reclaimable
inodes as the tag representing an inode is reclaimable
is nicely consistent...

I have a few comments below for your consideration
but they are minor (and don't even require a response).

Overall this looks good.

Reviewed-by: Alex Elder <aelder@xxxxxxx>

> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/linux-2.6/xfs_sync.c  |   71 +++++++++++++++++++++++++++++++++++++----
>  fs/xfs/linux-2.6/xfs_trace.h |    3 ++
>  2 files changed, 67 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
> index 56fed91..51da595 100644
> --- a/fs/xfs/linux-2.6/xfs_sync.c
> +++ b/fs/xfs/linux-2.6/xfs_sync.c
> @@ -133,6 +133,41 @@ restart:
>  	return last_error;
>  }
>  
> +/*
> + * Select the next per-ag structure to iterate during the walk. The reclaim
> + * walk is optimised only to walk AGs with reclaimable inodes in them.
> + */
> +static struct xfs_perag *
> +xfs_inode_ag_iter_next_pag(
> +	struct xfs_mount	*mp,
> +	xfs_agnumber_t		*first,
> +	int			tag)
> +{
> +	struct xfs_perag	*pag = NULL;
> +
> +	if (tag == XFS_ICI_RECLAIM_TAG) {
> +		int found;
> +		int ref;
> +
> +		spin_lock(&mp->m_perag_lock);
> +		found = radix_tree_gang_lookup_tag(&mp->m_perag_tree,
> +				(void **)&pag, *first, 1, tag);
> +		if (found <= 0) {
> +			spin_unlock(&mp->m_perag_lock);
> +			return NULL;
> +		}
> +		*first = pag->pag_agno + 1;

Maybe move this below, just before the return.  I.e.:
	if (pag)
		*first = pag->pag_agno + 1;

To me it's slightly clearer that we're just setting
up to search next time with the perag following the
one returned.

> +		/* open coded pag reference increment */

By open-coding here you miss the assertions in xfs_perag_get().
(Though a common inline encapsulating them would have to be
in a header because the two functions reside in different files.)

> +		ref = atomic_inc_return(&pag->pag_ref);
> +		spin_unlock(&mp->m_perag_lock);
> +		trace_xfs_perag_get_reclaim(mp, pag->pag_agno, ref, _RET_IP_);
> +	} else {
> +		pag = xfs_perag_get(mp, *first);
> +		(*first)++;
> +	}
> +	return pag;
> +}
> +
>  int
>  xfs_inode_ag_iterator(
>  	struct xfs_mount	*mp,

. . .

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs


[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux