Re: [PATCH 2/2] xfs: reduce free inode accounting overhead

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 20, 2020 at 07:48:40AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Shaokun Zhang reported that XFs was using substantial CPU time in
> percpu_count_sum() when running a single threaded benchmark on
> a high CPU count (128p) machine from xfs_mod_ifree(). The issue
> is that the filesystem is empty when the benchmark runs, so inode
> allocation is running with a very low inode free count.
> 
> With the percpu counter batching, this means comparisons when the
> counter is less that 128 * 256 = 32768 use the slow path of adding
> up all the counters across the CPUs, and this is expensive on high
> CPU count machines.
> 
> The summing in xfs_mod_ifree() is only used to fire an assert if an
> underrun occurs. The error is ignored by the higher level code.
> Hence this is really just debug code and we don't need to run it
> on production kernels, nor do we need such debug checks to return
> error values just to trigger an assert.
> 
> Finally, xfs_mod_icount/xfs_mod_ifree are only called from
> xfs_trans_unreserve_and_mod_sb(), so get rid of them and just
> directly call the percpu_counter_add/percpu_counter_compare
> functions. The compare functions are now run only on debug builds as
> they are internal to ASSERT() checks and so only compiled in when
> ASSERTs are active (CONFIG_XFS_DEBUG=y or CONFIG_XFS_WARN=y).
> 
> Reported-by: Shaokun Zhang <zhangshaokun@xxxxxxxxxxxxx>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Seems like a reasonable substitution/ASSERT reduction to me,
Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

--D

> ---
>  fs/xfs/xfs_mount.c | 33 ---------------------------------
>  fs/xfs/xfs_mount.h |  2 --
>  fs/xfs/xfs_trans.c | 17 +++++++++++++----
>  3 files changed, 13 insertions(+), 39 deletions(-)
> 
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index bb91f04266b9a..d5dcf98698600 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1189,39 +1189,6 @@ xfs_log_sbcount(xfs_mount_t *mp)
>  	return xfs_sync_sb(mp, true);
>  }
>  
> -/*
> - * Deltas for the inode count are +/-64, hence we use a large batch size
> - * of 128 so we don't need to take the counter lock on every update.
> - */
> -#define XFS_ICOUNT_BATCH	128
> -int
> -xfs_mod_icount(
> -	struct xfs_mount	*mp,
> -	int64_t			delta)
> -{
> -	percpu_counter_add_batch(&mp->m_icount, delta, XFS_ICOUNT_BATCH);
> -	if (__percpu_counter_compare(&mp->m_icount, 0, XFS_ICOUNT_BATCH) < 0) {
> -		ASSERT(0);
> -		percpu_counter_add(&mp->m_icount, -delta);
> -		return -EINVAL;
> -	}
> -	return 0;
> -}
> -
> -int
> -xfs_mod_ifree(
> -	struct xfs_mount	*mp,
> -	int64_t			delta)
> -{
> -	percpu_counter_add(&mp->m_ifree, delta);
> -	if (percpu_counter_compare(&mp->m_ifree, 0) < 0) {
> -		ASSERT(0);
> -		percpu_counter_add(&mp->m_ifree, -delta);
> -		return -EINVAL;
> -	}
> -	return 0;
> -}
> -
>  /*
>   * Deltas for the block count can vary from 1 to very large, but lock contention
>   * only occurs on frequent small block count updates such as in the delayed
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index aba5a15792792..4835581f3eb00 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -392,8 +392,6 @@ extern int	xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
>  				     xfs_agnumber_t *maxagi);
>  extern void	xfs_unmountfs(xfs_mount_t *);
>  
> -extern int	xfs_mod_icount(struct xfs_mount *mp, int64_t delta);
> -extern int	xfs_mod_ifree(struct xfs_mount *mp, int64_t delta);
>  extern int	xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta,
>  				 bool reserved);
>  extern int	xfs_mod_frextents(struct xfs_mount *mp, int64_t delta);
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 4522ceaaf57ba..b055a5ab53465 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -545,7 +545,12 @@ xfs_trans_apply_sb_deltas(
>   * used block counts are not updated in the on disk superblock. In this case,
>   * XFS_TRANS_SB_DIRTY will not be set when the transaction is updated but we
>   * still need to update the incore superblock with the changes.
> + *
> + * Deltas for the inode count are +/-64, hence we use a large batch size of 128
> + * so we don't need to take the counter lock on every update.
>   */
> +#define XFS_ICOUNT_BATCH	128
> +
>  void
>  xfs_trans_unreserve_and_mod_sb(
>  	struct xfs_trans	*tp)
> @@ -585,13 +590,17 @@ xfs_trans_unreserve_and_mod_sb(
>  	}
>  
>  	if (idelta) {
> -		error = xfs_mod_icount(mp, idelta);
> -		ASSERT(!error);
> +		percpu_counter_add_batch(&mp->m_icount, idelta,
> +					 XFS_ICOUNT_BATCH);
> +		if (idelta < 0)
> +			ASSERT(__percpu_counter_compare(&mp->m_icount, 0,
> +							XFS_ICOUNT_BATCH) >= 0);
>  	}
>  
>  	if (ifreedelta) {
> -		error = xfs_mod_ifree(mp, ifreedelta);
> -		ASSERT(!error);
> +		percpu_counter_add(&mp->m_ifree, ifreedelta);
> +		if (ifreedelta < 0)
> +			ASSERT(percpu_counter_compare(&mp->m_ifree, 0) >= 0);
>  	}
>  
>  	if (rtxdelta == 0 && !(tp->t_flags & XFS_TRANS_SB_DIRTY))
> -- 
> 2.26.2.761.g0e0b3e54be
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux