On Thu, Jun 03, 2021 at 02:21:07PM +1000, Dave Chinner wrote:
> On Wed, Jun 02, 2021 at 08:12:41PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > While running some fuzz tests on inode metadata, I noticed that the
> > filesystem health report (as provided by xfs_spaceman) failed to report
> > the file corruption even when spaceman was run immediately after running
> > xfs_scrub to detect the corruption.  That isn't the intended behavior;
> > one ought to be able to run scrub to detect errors in the ondisk
> > metadata and to access those reports for some time after the scrub.
> >
> > After running the same sequence through an instrumented kernel, I
> > discovered the reason why -- scrub igets the file, scans it, marks it
> > sick, and ireleases the inode.  When the VFS lets go of the incore
> > inode, it moves to RECLAIMABLE state.  If spaceman igets the incore
> > inode before it moves to RECLAIM state, iget reinitializes the VFS
> > state, clears the sick and checked masks, and hands back the inode.  At
> > this point, the caller has the exact same incore inode, but with all
> > the health state erased.
> >
> > In other words, we're erasing the incore inode's health state flags
> > when we've decided NOT to sever the link between the incore inode and
> > the ondisk inode.  This is wrong, so we need to remove the lines that
> > zero the fields from xfs_iget_cache_hit.
> >
> > As a precaution, we add the same lines into xfs_reclaim_inode just
> > after we sever the link between the incore and ondisk inodes.  Strictly
> > speaking this isn't necessary, because once an inode has gone through
> > reclaim it must go through xfs_inode_alloc (which also zeroes the
> > state), and xfs_iget is careful to check for mismatches between the
> > inode it pulls out of the radix tree and the one it wants.
> >
> > Fixes: 6772c1f11206 ("xfs: track metadata health status")
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_icache.c |    5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
>
> Looks fine.
>
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Though I do wonder how long such state will hang around, because
> once the inode is IRECLAIMABLE and clean it is only a matter of
> seconds before the background inode reclaimer will free it...

A future patchset will add the ability to make the per-AG health status
remember that we were forced to reclaim a sick inode:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting

This series used to be the "fixes first" part of that code, but since it
directly intersects with the deferred inactivation changes, I moved it up
to try to fix the problem sooner rather than later.

Anyway, thanks for the review.

--D

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
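
For readers following along, here is a small user-space model of the behavior
described in the patch above.  It is a sketch only, not the actual
fs/xfs/xfs_icache.c hunk: the struct and function names (model_inode,
cache_hit_buggy, cache_hit_fixed, reclaim) are invented for the example, and
only the idea -- health flags survive a cache hit and are cleared when the
incore inode is torn down -- is taken from the commit message.

/*
 * Minimal user-space model of the bug described above (NOT the kernel
 * patch itself): the i_sick/i_checked health flags must survive an
 * iget cache hit and only be cleared once the incore inode is reclaimed.
 */
#include <assert.h>
#include <stdio.h>

struct model_inode {
	unsigned int	i_sick;		/* health: metadata known to be bad */
	unsigned int	i_checked;	/* health: metadata scrub has examined */
};

/* Buggy behavior: a cache hit on a reclaimable inode wiped the state. */
static void cache_hit_buggy(struct model_inode *ip)
{
	ip->i_sick = 0;
	ip->i_checked = 0;
}

/* Fixed behavior: a cache hit leaves the health state alone. */
static void cache_hit_fixed(struct model_inode *ip)
{
	/* reinitialize VFS state here, but do not touch i_sick/i_checked */
	(void)ip;
}

/* The flags are zeroed only when the incore inode is finally reclaimed. */
static void reclaim(struct model_inode *ip)
{
	ip->i_sick = 0;
	ip->i_checked = 0;
}

int main(void)
{
	struct model_inode ino = { .i_sick = 0x1, .i_checked = 0x1 };

	cache_hit_fixed(&ino);
	assert(ino.i_sick == 0x1);	/* spaceman can still see the report */

	reclaim(&ino);
	assert(ino.i_sick == 0);	/* state gone only after reclaim */

	(void)cache_hit_buggy;
	printf("health state survives cache hits, cleared at reclaim\n");
	return 0;
}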