On Tue, Jun 20, 2023 at 03:47:08PM +1000, Dave Chinner wrote: > On Thu, May 25, 2023 at 05:45:03PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > When we're freeing extents that have been set in a bitmap, break the > > bitmap extent into multiple sub-extents organized by fate, and reap the > > extents. This enables us to dispose of old resources more efficiently > > than doing them block by block. > > > > While we're at it, rename the reaping functions to make it clear that > > they're reaping per-AG extents. > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > --- > > ..... > > +xreap_agextent_binval( > > + struct xreap_state *rs, > > + xfs_agblock_t agbno, > > + xfs_extlen_t *aglenp) > > { > > - struct xfs_buf *bp = NULL; > > - int error; > > + struct xfs_scrub *sc = rs->sc; > > + struct xfs_perag *pag = sc->sa.pag; > > + struct xfs_mount *mp = sc->mp; > > + xfs_agnumber_t agno = sc->sa.pag->pag_agno; > > + xfs_agblock_t agbno_next = agbno + *aglenp; > > + xfs_agblock_t bno = agbno; > > > > /* > > - * If there's an incore buffer for exactly this block, invalidate it. > > * Avoid invalidating AG headers and post-EOFS blocks because we never > > * own those. > > */ > > - if (!xfs_verify_fsbno(sc->mp, fsbno)) > > + if (!xfs_verify_agbno(pag, agbno) || > > + !xfs_verify_agbno(pag, agbno_next - 1)) > > return; > > > > /* > > - * We assume that the lack of any other known owners means that the > > - * buffer can be locked without risk of deadlocking. > > + * If there are incore buffers for these blocks, invalidate them. We > > + * assume that the lack of any other known owners means that the buffer > > + * can be locked without risk of deadlocking. The buffer cache cannot > > + * detect aliasing, so employ nested loops to scan for incore buffers > > + * of any plausible size. > > */ > > - error = xfs_buf_incore(sc->mp->m_ddev_targp, > > - XFS_FSB_TO_DADDR(sc->mp, fsbno), > > - XFS_FSB_TO_BB(sc->mp, 1), XBF_BCACHE_SCAN, &bp); > > - if (error) > > - return; > > - > > - xfs_trans_bjoin(sc->tp, bp); > > - xfs_trans_binval(sc->tp, bp); > > + while (bno < agbno_next) { > > + xfs_agblock_t fsbcount; > > + xfs_agblock_t max_fsbs; > > + > > + /* > > + * Max buffer size is the max remote xattr buffer size, which > > + * is one fs block larger than 64k. > > + */ > > + max_fsbs = min_t(xfs_agblock_t, agbno_next - bno, > > + xfs_attr3_rmt_blocks(mp, XFS_XATTR_SIZE_MAX)); > > + > > + for (fsbcount = 1; fsbcount < max_fsbs; fsbcount++) { > > + struct xfs_buf *bp = NULL; > > + xfs_daddr_t daddr; > > + int error; > > + > > + daddr = XFS_AGB_TO_DADDR(mp, agno, bno); > > + error = xfs_buf_incore(mp->m_ddev_targp, daddr, > > + XFS_FSB_TO_BB(mp, fsbcount), > > + XBF_BCACHE_SCAN, &bp); > > + if (error) > > + continue; > > + > > + xfs_trans_bjoin(sc->tp, bp); > > + xfs_trans_binval(sc->tp, bp); > > + rs->invalidated++; > > Hmmm. O(N^2) brute force lookup walk to find any buffer at that > specific daddr? That's going to have an impact on running systems > by hammering the perag hash lock. > > I didn't know this was being done before I suggested XBF_ANY_MATCH, > but now I'm wondering how can we even have multiple buffers in the > cache at a given address? The whole point of the ASSERT() in the > match function is to indicate this should not ever happen. > > i.e. xfs_buf_find_insert() uses rhashtable_lookup_get_insert_fast(), > which will return an existing buffer only if it has a match length. > The compare function used at insert time will throw an assert fail > if any other buffer exists at that address that has a mismatched > length that is not stale. There is no way to avoid that ASSERT check > on insert. > > Hence, AFAICT, the only way we can get multiple buffers into the > cache at the same daddr with different lengths is for all the > existing buffers at that daddr to all be stale at insert time. In > which case, why do we need to invalidate buffers that are already > stale (i.e. been invalidated)? > > What am I missing here? i.e. this appears to cater for behaviour The same braino that I mentioned in my previous replies -- I failed to state that we're looking for buffers within a specific range of fsblocks that do /not/ start at the same addr. > that doesn't currently exist in the buffer cache, and I'm not sure > we even want to allow to occur in the buffer cache given that it > generally indicates in-memory metadata corruption..... > > Hmmmm. What if the buffer was already stale? The lookup will then Urk. Yes, that results in the buffer's stale state being cleared and then the buffer gets cancelled *again*. It's unnecessary and we can skip the second invalidation. I think I'll change the name back to its former name XBF_LIVESCAN, which means "only return non-stale buffers, and don't complain about finding something with the same daddr but a different length from what the caller asked for". --D > clear all the stale state from it, leave it with no valid contents > which we then invalidate again and log a new buffer cancel item for > it. What are the implications of that? (My brain is too full of > stuff trying to understand what this code is doing to be able to > think this through right now.) > > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx