Re: [PATCH 3/3] xfs: retain the AGI when we can't iget an inode to scrub the core

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 15, 2022 at 03:08:16PM +1100, Dave Chinner wrote:
> On Sun, Oct 02, 2022 at 11:20:29AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > xchk_get_inode is not quite the right function to be calling from the
> > inode scrubber setup function.  The common get_inode function either
> > gets an inode and installs it in the scrub context, or it returns an
> > error code explaining what happened.  This is acceptable for most file
> > scrubbers because it is not in their scope to fix corruptions in the
> > inode core and fork areas that cause iget to fail.
> > 
> > Dealing with these problems is within the scope of the inode scrubber,
> > however.  If iget fails with EFSCORRUPTED, we need to xchk_inode to flag
> > that as corruption.  Since we can't get our hands on an incore inode, we
> > need to hold the AGI to prevent inode allocation activity so that
> > nothing changes in the inode metadata.
> > 
> > Looking ahead to the inode core repair patches, we will also need to
> > hold the AGI buffer into xrep_inode so that we can make modifications to
> > the xfs_dinode structure without any other thread swooping in to
> > allocate or free the inode.
> > 
> > Adapt the xchk_get_inode into xchk_setup_inode since this is a one-off
> > use case where the error codes we check for are a little different, and
> > the return state is much different from the common function.
> 
> The code look fine, but...
> 
> ... doesn't this mean that xchk_setup_inode() and xchk_get_inode()
> now are almost identical apart from the xchk_prepare_iscrub() bits?

Yes, they're /nearly/ identical in the helper functions they call, but
they're not so similar in intent and how they handle @error values:

xchk_setup_inode prepares to check or repair an inode record, so it must
continue the scrub operation even if the inode/inobt verifiers cause
xfs_iget to return EFSCORRUPTED.  This is done by attaching the locked
AGI buffer to the scrub transaction and returning 0 to move on to the
actual scrub.  (Later, the online inode repair code will also want the
xfs_imap structure so that it can reset the ondisk xfs_dinode
structure.)

xchk_get_inode retrieves an inode on behalf of a scrubber that operates
on an incore inode -- data/attr/cow forks, directories, xattrs,
symlinks, parent pointers, etc.  If the inode/inobt verifiers fail and
xfs_iget returns EFSCORRUPTED, we want to exit to userspace (because the
caller should be fix the inode first) and drop everything we acquired
along the way.

A behavior common to both functions is that it's possible that xfs_scrub
asked for a scrub-by-handle concurrent with the inode being freed or the
passed-in inumber is invalid.  In this case, we call xfs_imap to see if
the inobt index thinks the inode is allocated, and return ENOENT
("nothing to check here") to userspace if this is not the case.  The
imap lookup is why both functions call xchk_iget_agi.

> This kinda looks like a lot of duplicated but subtly different code
> - does xchk_get_inode() still need all that complexity if we are now
> doing it in xchk_setup_inode()?  If it does, why does
> xchk_setup_inode() need to duplicate the code?

So yes, we do need the complexity of both functions because the
postconditions of the two functions are rather different.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux