Re: [PATCH 3/7] xfs: fix a bug in the online fsck directory leaf1 bestcount check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 16, 2021 at 11:25:49AM -0800, Darrick J. Wong wrote:
> On Thu, Dec 16, 2021 at 04:05:37PM +1100, Dave Chinner wrote:
> > On Wed, Dec 15, 2021 at 05:09:32PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > 
> > > When xfs_scrub encounters a directory with a leaf1 block, it tries to
> > > validate that the leaf1 block's bestcount (aka the best free count of
> > > each directory data block) is the correct size.  Previously, this author
> > > believed that comparing bestcount to the directory isize (since
> > > directory data blocks are under isize, and leaf/bestfree blocks are
> > > above it) was sufficient.
> > > 
> > > Unfortunately during testing of online repair, it was discovered that it
> > > is possible to create a directory with a hole between the last directory
> > > block and isize.
> > 
> > We have xfs_da3_swap_lastblock() that can leave an -empty- da block
> > between the last referenced block and isize, but that's not a "hole"
> > in the file. If you don't mean xfs_da3_swap_lastblock(), then can
> > you clarify what you mean by a "hole" here and explain to me how the
> > situation it occurs in comes about?
> 
> I don't actually know how it comes about.  I wrote a test that sets up
> fsstress to expand and contract directories and races xfs_scrub -n, and
> noticed that I'd periodically get complaints about directories (usually
> $SCRATCH_MNT/p$CPU) where the last block(s) before i_size were actually
> holes.

Is that test getting to ENOSPC at all?

> I began reading the dir2 code to try to figure out how this came about
> (clearly we're not updating i_size somewhere) but then took the shortcut
> of seeing if xfs_repair or xfs_check complained about this situation.
> Neither of them did, and I found a couple more directories in a similar
> situation on my crash test dummy machine, and concluded "Wellllp, I
> guess this is part of the ondisk format!" and committed the patch.
> 
> Also, I thought xfs_da3_swap_lastblock only operates on leaf and da
> btree blocks, not the blocks containing directory entries?

Ah, right you are. I noticed xfs_da_shrink_inode() being called from
leaf_to_block() and thought it might be swapping the leaf with the
last data block that we probably just removed. Looking at the code,
that is not going to happend AFAICT...

> I /think/
> the actual explanation is that something goes wrong in
> xfs_dir2_shrink_inode (maybe?) such that the mapping goes away but
> i_disk_size doesn't get updated?  Not sure how /that/ can happen,
> though...

Actually, the ENOSPC case in xfs_dir2_shrink_inode is the likely
case. If we can't free the block because bunmapi gets ENOSPC due
to xfs_dir_rename() being called without a block reservation, it'll
just get left there as an empty data block. If all the other dir
data blocks around it get removed properly, it could eventually end
up between the last valid entry and isize....

There are lots of weird corner cases around ENOSPC in the directory
code, perhaps this is just another of them...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux