On Wed, Apr 10, 2024 at 08:12:16AM -0700, Christoph Hellwig wrote: > > + /* > > + * If the caller wants us to do a scrub-by-handle and the file used to > > + * call the ioctl is not the same file, load the incore inode and pin > > + * it across all the scrubv actions to avoid repeated UNTRUSTED > > + * lookups. The reference is not passed to deeper layers of scrub > > + * because each scrubber gets to decide its own strategy for getting an > > + * inode. > > + */ > > + if (vhead->svh_ino && vhead->svh_ino != ip_in->i_ino) > > + handle_ip = xchk_scrubv_open_by_handle(mp, vhead); > > Oh. So we read the inode, keep a reference to it, but still hit the > inode cache every time. A little non-onvious and not perfect for > performance, but based on your numbers probably good enough. > > Curious: what is the reason the scrubbers want/need different ways to > get at the inode? I don't remember the exact reason why we don't pass this ip into xfs_scrub_metadata, but iirc the inode scrub setup functions react differently (from the bmap/dir/attr/symlink scrubbers) when iget failures occur. Also this way xfs_scrub_metadata owns the refcount to whatever inode it picks up, and can do whatever it wants with that reference. > > + /* > > + * If we're holding the only reference to an inode opened via handle > > + * and the scan was clean, mark it dontcache so that we don't pollute > > + * the cache. > > + */ > > + if (handle_ip) { > > + if (set_dontcache && > > + atomic_read(&VFS_I(handle_ip)->i_count) == 1) > > + d_mark_dontcache(VFS_I(handle_ip)); > > + xfs_irele(handle_ip); > > + } > > This looks a little weird to me. Can't we simply use XFS_IGET_DONTCACHE > at iget time and then clear I_DONTCACHE here if we want to keep the > inode around? Not anymore, because other threads can mess around with the dontcache state (yay fsdax access path changes!!) while we are scrubbing the inode. > Given that we only set the uncached flag from > XFS_IGET_DONTCACHE on a cache miss, we won't have set > DCACHE_DONTCACHE anywhere (and don't really care about the dentries to > start with). > > But why do we care about keeping the inodes with errors in memory > here, but not elsewhere? We actually, do, but it's not obvious... > Maybe this can be explained in an expanded comment. ...because this bit here is basically the same as xchk_irele, but we don't have a xfs_scrub object to pass in, so it's opencoded. I could pull this logic out into: void xfs_scrub_irele(struct xfs_inode *ip) { if (atomic_read(&VFS_I(ip)->i_count) == 1) { /* * If this is the last reference to the inode and the caller * permits it, set DONTCACHE to avoid thrashing. */ d_mark_dontcache(VFS_I(ip)); } xfs_irele(ip); } and change xchk_irele to: void xchk_irele( struct xfs_scrub *sc, struct xfs_inode *ip) { if (sc->tp) { spin_lock(&VFS_I(ip)->i_lock); VFS_I(ip)->i_state &= ~I_DONTCACHE; spin_unlock(&VFS_I(ip)->i_lock); xfs_irele(ip); return; } xfs_scrub_irele(ip); }