Re: [PATCH 4/4] xfs: only iget the file once when doing vectored scrub-by-handle

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Tue, 16 Apr 2024 15:51:19 -0700

On Tue, Apr 16, 2024 at 03:31:48PM -0700, Darrick J. Wong wrote:
> On Mon, Apr 15, 2024 at 10:35:26PM -0700, Christoph Hellwig wrote:
> > >  out_free:
> > > +	/*
> > > +	 * If we're holding the only reference to an inode opened via handle,
> > > +	 * mark it dontcache so that we don't pollute the cache.
> > > +	 */
> > > +	if (handle_ip) {
> > > +		if (atomic_read(&VFS_I(handle_ip)->i_count) == 1)
> > > +			d_mark_dontcache(VFS_I(handle_ip));
> > > +		xfs_irele(handle_ip);
> > 
> > This still feels the wrong way around vs just using IGET_UNCACHED?
> > 
> > Or did I miss the killer argument why that can't work?
> 
> No, I missed the killer argument.  In fact, I think the fsdax
> reorganization into d_mark_dontcache makes this counterproductive.
> 
> Initially, I had thought that we'd be doing users a favor by only
> marking inodes dontcache at the end of a scrub operation, and only if
> there's only one reference to that inode.  This was more or less true
> back when I_DONTCACHE was an XFS iflag and the only thing it did was
> change the outcome of xfs_fs_drop_inode to 1.  If there are dentries
> pointing to the inode when scrub finishes, the inode will have positive
> i_count and stay around in cache until dentry reclaim.
> 
> But now we have d_mark_dontcache, which cause the inode *and* the
> dentries attached to it all to be marked I_DONTCACHE, which means that
> we drop the dentries ASAP, which drops the inode ASAP.
> 
> This is bad if scrub found problems with the inode, because now they can
> be scheduled for inactivation, which can cause inodegc to trip on it and
> shut down the filesystem.
> 
> Even if the inode isn't bad, this is still suboptimal because phases 3-7
> each initiate inode scans.  Dropping the inode immediately during phase
> 3 is silly because phase 5 will reload it and drop it immediately, etc.
> It's fine to mark the inodes dontcache, but if there have been accesses
> to the file that set up dentries, we should keep them.
> 
> I validated this by setting up ftrace to capture xfs_iget_recycle*
> tracepoints and ran xfs/285 for 30 seconds.  With current djwong-wtf I
> saw ~30,000 recycle events.  I then dropped the d_mark_dontcache calls
> and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
> 30 seconds.
> 
> So I think I want to change xchk_irele to clear I_DONTCACHE if we're in
> transaction context or if corruption was found; and to use
> XFS_IGET_DONTCACHE instead of the i_count-based d_mark_dontcache calls.

Actually we don't even need to clear DONTCACHE on sick inodes; commit
7975e465af6b4 ("xfs: drop IDONTCACHE on inodes when we mark them sick")
already took care of that.

--D

> --D
>