On Thu, May 25, 2023 at 05:44:48PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > After an online repair, we need to invalidate buffers representing the > blocks from the old metadata that we're replacing. It's possible that > parts of a tree that were previously cached in memory are no longer > accessible due to media failure or other corruption on interior nodes, > so repair figures out the old blocks from the reverse mapping data and > scans the buffer cache directly. > > Unfortunately, the current buffer cache code triggers asserts if the > rhashtable lookup finds a non-stale buffer of a different length than > the key we searched for. For regular operation this is desirable, but > for this repair procedure, we don't care since we're going to forcibly > stale the buffer anyway. Add an internal lookup flag to avoid the > assert. > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > --- > fs/xfs/scrub/reap.c | 2 +- > fs/xfs/xfs_buf.c | 5 ++++- > fs/xfs/xfs_buf.h | 10 ++++++++++ > 3 files changed, 15 insertions(+), 2 deletions(-) > > > diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c > index 30e61315feb0..ca75c22481d2 100644 > --- a/fs/xfs/scrub/reap.c > +++ b/fs/xfs/scrub/reap.c > @@ -149,7 +149,7 @@ xrep_block_reap_binval( > */ > error = xfs_buf_incore(sc->mp->m_ddev_targp, > XFS_FSB_TO_DADDR(sc->mp, fsbno), > - XFS_FSB_TO_BB(sc->mp, 1), 0, &bp); > + XFS_FSB_TO_BB(sc->mp, 1), XBF_BCACHE_SCAN, &bp); Can't say I'm a big fan of XBF_BCACHE_SCAN as a name - it tells me nothing about what the incore lookup is actually doing. The actual lookup action that is being performed is "find any match" rather than "find exact match". XBF_ANY_MATCH would be a better name, IMO. > if (error) > return; > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index 15d1e5a7c2d3..b31e6d09a056 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -481,7 +481,8 @@ _xfs_buf_obj_cmp( > * reallocating a busy extent. Skip this buffer and > * continue searching for an exact match. > */ > - ASSERT(bp->b_flags & XBF_STALE); > + if (!(map->bm_flags & XBM_IGNORE_LENGTH_MISMATCH)) > + ASSERT(bp->b_flags & XBF_STALE); And this becomes XBM_ANY_MATCH, too. > return 1; > } > return 0; > @@ -682,6 +683,8 @@ xfs_buf_get_map( > int error; > int i; > > + if (flags & XBF_BCACHE_SCAN) > + cmap.bm_flags |= XBM_IGNORE_LENGTH_MISMATCH; > for (i = 0; i < nmaps; i++) > cmap.bm_len += map[i].bm_len; > > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h > index 549c60942208..d6e8c3bab9f6 100644 > --- a/fs/xfs/xfs_buf.h > +++ b/fs/xfs/xfs_buf.h > @@ -44,6 +44,11 @@ struct xfs_buf; > #define _XBF_DELWRI_Q (1u << 22)/* buffer on a delwri queue */ > > /* flags used only as arguments to access routines */ > +/* > + * We're scanning the buffer cache; do not warn about lookup mismatches. The code using the flag isn't doing this - it's trying to invalidate any existing buffer at the location given. It simply wants any buffer at that address to be returned... > + * Only online repair should use this. > + */ > +#define XBF_BCACHE_SCAN (1u << 28) > #define XBF_INCORE (1u << 29)/* lookup only, return if found in cache */ > #define XBF_TRYLOCK (1u << 30)/* lock requested, but do not wait */ > #define XBF_UNMAPPED (1u << 31)/* do not map the buffer */ > @@ -67,6 +72,7 @@ typedef unsigned int xfs_buf_flags_t; > { _XBF_KMEM, "KMEM" }, \ > { _XBF_DELWRI_Q, "DELWRI_Q" }, \ > /* The following interface flags should never be set */ \ > + { XBF_BCACHE_SCAN, "BCACHE_SCAN" }, \ > { XBF_INCORE, "INCORE" }, \ > { XBF_TRYLOCK, "TRYLOCK" }, \ > { XBF_UNMAPPED, "UNMAPPED" } > @@ -114,8 +120,12 @@ typedef struct xfs_buftarg { > struct xfs_buf_map { > xfs_daddr_t bm_bn; /* block number for I/O */ > int bm_len; /* size of I/O */ > + unsigned int bm_flags; > }; > > +/* Don't complain about live buffers with the wrong length during lookup. */ > +#define XBM_IGNORE_LENGTH_MISMATCH (1U << 0) Which makes me wonder now: can we have two cached buffers at the same address with different lengths during a repair? Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx