On Sun, Jul 09, 2023 at 04:32:16PM -0700, Darrick J. Wong wrote: > On Mon, Jul 10, 2023 at 09:15:06AM +1000, Dave Chinner wrote: > > On Wed, Jul 05, 2023 at 04:17:36PM -0700, Darrick J. Wong wrote: > > > Concrete example: > > > > > > So let's pretend that we have an xfs with fs blocksize 4k and dir > > > blocksize 8k. Let's suppose the directory's data fork maps fsblock 16, > > > resulting in a buffer (daddr 128, length 16). Let us further suppose > > > the inobt then allocates fsblock 17, resulting in a buffer (daddr 136, > > > length 8). These are crosslinked. > > > > RIght, that's an intersection of {128,16} and {136,8}. The search > > region for buffer invalidation is {128, 8} {128, 16} and {136, 8}. > > They are the only possible buffers that can be cached in that region > > for a 4kB block size filesystem, as there can be no metadata buffers > > starting at daddrs 129-135 or 137-143... > > <nod> There's nothing in this flag that *prevents* someone from spending > a lot of time pounding on the buffer cache with a per-daddr scan. But > note that the next patch only ever calls it with fsblock-aligned daddr > and length values. So in the end, the only xfs_buf queries will indded > be for {128, 8} {128, 16} and {136, 8} like you said. > > See the next patch for the actual usage. Sorry that I've been unclear > about all this and compounding it with not very good examples. It's > been a /very/ long time since I actually touched this code (this rewrite > has been waiting for merge since ... 2019?) and I'm basically coming in > cold. :( > > Ultimately I make invalidation scan code look like this: > > /* Buffer cache scan context. */ > struct xrep_bufscan { > /* Disk address for the buffers we want to scan. */ > xfs_daddr_t daddr; > > /* Maximum number of sectors to scan. */ > xfs_daddr_t max_sectors; > > /* Each round, increment the search length by this number of sectors. */ > xfs_daddr_t daddr_step; > > /* Internal scan state; initialize to zero. */ > xfs_daddr_t __sector_count; > }; > > /* > * Return an incore buffer from a sector scan, or NULL if there are no buffers > * left to return. > */ > struct xfs_buf * > xrep_bufscan_advance( > struct xfs_mount *mp, > struct xrep_bufscan *scan) > { > scan->__sector_count += scan->daddr_step; > while (scan->__sector_count <= scan->max_sectors) { > struct xfs_buf *bp = NULL; > int error; > > error = xfs_buf_incore(mp->m_ddev_targp, scan->daddr, > scan->__sector_count, XBF_LIVESCAN, &bp); > if (!error) > return bp; > > scan->__sector_count += scan->daddr_step; > } > > return NULL; > } ..... Yup, that's much better, will do for now. I suspect this could be smarter with a custom rhashtable walker; all the buffers at a given daddr should have the same key (i.e. daddr) so should hash to the same list, so we should be able to walk an entire list in one pass doing a bjoin/binval callback on each non-stale buffer that matches the key.... But that can be done later, it's not necessary for correctness and this looks a lot better than the original code. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx