Re: [PATCH 7/9] xfs: ignore stale buffers when scanning the buffer cache

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 20 Jun 2023 13:24:10 +1000

On Thu, May 25, 2023 at 05:44:48PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> After an online repair, we need to invalidate buffers representing the
> blocks from the old metadata that we're replacing.  It's possible that
> parts of a tree that were previously cached in memory are no longer
> accessible due to media failure or other corruption on interior nodes,
> so repair figures out the old blocks from the reverse mapping data and
> scans the buffer cache directly.
> 
> Unfortunately, the current buffer cache code triggers asserts if the
> rhashtable lookup finds a non-stale buffer of a different length than
> the key we searched for.  For regular operation this is desirable, but
> for this repair procedure, we don't care since we're going to forcibly
> stale the buffer anyway.  Add an internal lookup flag to avoid the
> assert.
> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/scrub/reap.c |    2 +-
>  fs/xfs/xfs_buf.c    |    5 ++++-
>  fs/xfs/xfs_buf.h    |   10 ++++++++++
>  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c
> index 30e61315feb0..ca75c22481d2 100644
> --- a/fs/xfs/scrub/reap.c
> +++ b/fs/xfs/scrub/reap.c
> @@ -149,7 +149,7 @@ xrep_block_reap_binval(
>  	 */
>  	error = xfs_buf_incore(sc->mp->m_ddev_targp,
>  			XFS_FSB_TO_DADDR(sc->mp, fsbno),
> -			XFS_FSB_TO_BB(sc->mp, 1), 0, &bp);
> +			XFS_FSB_TO_BB(sc->mp, 1), XBF_BCACHE_SCAN, &bp);

Can't say I'm a big fan of XBF_BCACHE_SCAN as a name - it tells me
nothing about what the incore lookup is actually doing. The actual
lookup action that is being performed is "find any match" rather
than "find exact match". XBF_ANY_MATCH would be a better name, IMO.

>  	if (error)
>  		return;
>  
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 15d1e5a7c2d3..b31e6d09a056 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -481,7 +481,8 @@ _xfs_buf_obj_cmp(
>  		 * reallocating a busy extent. Skip this buffer and
>  		 * continue searching for an exact match.
>  		 */
> -		ASSERT(bp->b_flags & XBF_STALE);
> +		if (!(map->bm_flags & XBM_IGNORE_LENGTH_MISMATCH))
> +			ASSERT(bp->b_flags & XBF_STALE);

And this becomes XBM_ANY_MATCH, too.

>  		return 1;
>  	}
>  	return 0;
> @@ -682,6 +683,8 @@ xfs_buf_get_map(
>  	int			error;
>  	int			i;
>  
> +	if (flags & XBF_BCACHE_SCAN)
> +		cmap.bm_flags |= XBM_IGNORE_LENGTH_MISMATCH;
>  	for (i = 0; i < nmaps; i++)
>  		cmap.bm_len += map[i].bm_len;
>  
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 549c60942208..d6e8c3bab9f6 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -44,6 +44,11 @@ struct xfs_buf;
>  #define _XBF_DELWRI_Q	 (1u << 22)/* buffer on a delwri queue */
>  
>  /* flags used only as arguments to access routines */
> +/*
> + * We're scanning the buffer cache; do not warn about lookup mismatches.

The code using the flag isn't doing this - it's trying to invalidate
any existing buffer at the location given. It simply wants any
buffer at that address to be returned...

> + * Only online repair should use this.
> + */
> +#define XBF_BCACHE_SCAN	 (1u << 28)
>  #define XBF_INCORE	 (1u << 29)/* lookup only, return if found in cache */
>  #define XBF_TRYLOCK	 (1u << 30)/* lock requested, but do not wait */
>  #define XBF_UNMAPPED	 (1u << 31)/* do not map the buffer */
> @@ -67,6 +72,7 @@ typedef unsigned int xfs_buf_flags_t;
>  	{ _XBF_KMEM,		"KMEM" }, \
>  	{ _XBF_DELWRI_Q,	"DELWRI_Q" }, \
>  	/* The following interface flags should never be set */ \
> +	{ XBF_BCACHE_SCAN,	"BCACHE_SCAN" }, \
>  	{ XBF_INCORE,		"INCORE" }, \
>  	{ XBF_TRYLOCK,		"TRYLOCK" }, \
>  	{ XBF_UNMAPPED,		"UNMAPPED" }
> @@ -114,8 +120,12 @@ typedef struct xfs_buftarg {
>  struct xfs_buf_map {
>  	xfs_daddr_t		bm_bn;	/* block number for I/O */
>  	int			bm_len;	/* size of I/O */
> +	unsigned int		bm_flags;
>  };
>  
> +/* Don't complain about live buffers with the wrong length during lookup. */
> +#define XBM_IGNORE_LENGTH_MISMATCH	(1U << 0)

Which makes me wonder now: can we have two cached buffers at the
same address with different lengths during a repair?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx