On Fri, Dec 17, 2021 at 08:17:48AM +1100, Dave Chinner wrote:
> On Thu, Dec 16, 2021 at 11:25:49AM -0800, Darrick J. Wong wrote:
> > On Thu, Dec 16, 2021 at 04:05:37PM +1100, Dave Chinner wrote:
> > > On Wed, Dec 15, 2021 at 05:09:32PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > >
> > > > When xfs_scrub encounters a directory with a leaf1 block, it tries to
> > > > validate that the leaf1 block's bestcount (aka the best free count of
> > > > each directory data block) is the correct size.  Previously, this author
> > > > believed that comparing bestcount to the directory isize (since
> > > > directory data blocks are under isize, and leaf/bestfree blocks are
> > > > above it) was sufficient.
> > > >
> > > > Unfortunately during testing of online repair, it was discovered that it
> > > > is possible to create a directory with a hole between the last directory
> > > > block and isize.
> > >
> > > We have xfs_da3_swap_lastblock() that can leave an -empty- da block
> > > between the last referenced block and isize, but that's not a "hole"
> > > in the file.  If you don't mean xfs_da3_swap_lastblock(), then can
> > > you clarify what you mean by a "hole" here and explain to me how the
> > > situation it occurs in comes about?
> >
> > I don't actually know how it comes about.  I wrote a test that sets up
> > fsstress to expand and contract directories and races xfs_scrub -n, and
> > noticed that I'd periodically get complaints about directories (usually
> > $SCRATCH_MNT/p$CPU) where the last block(s) before i_size were actually
> > holes.
>
> Is that test getting to ENOSPC at all?

Yes.  That particular VM has a generous 8GB of SCRATCH_DEV to make the
repairs more interesting.

> > I began reading the dir2 code to try to figure out how this came about
> > (clearly we're not updating i_size somewhere) but then took the shortcut
> > of seeing if xfs_repair or xfs_check complained about this situation.
> > Neither of them did, and I found a couple more directories in a similar
> > situation on my crash test dummy machine, and concluded "Wellllp, I
> > guess this is part of the ondisk format!" and committed the patch.
> >
> > Also, I thought xfs_da3_swap_lastblock only operates on leaf and da
> > btree blocks, not the blocks containing directory entries?
>
> Ah, right you are.  I noticed xfs_da_shrink_inode() being called from
> leaf_to_block() and thought it might be swapping the leaf with the
> last data block that we probably just removed.  Looking at the code,
> that is not going to happen AFAICT...
>
> > I /think/ the actual explanation is that something goes wrong in
> > xfs_dir2_shrink_inode (maybe?) such that the mapping goes away but
> > i_disk_size doesn't get updated?  Not sure how /that/ can happen,
> > though...
>
> Actually, the ENOSPC case in xfs_dir2_shrink_inode is the likely
> case.  If we can't free the block because bunmapi gets ENOSPC due
> to xfs_dir_rename() being called without a block reservation, it'll
> just get left there as an empty data block.  If all the other dir
> data blocks around it get removed properly, it could eventually end
> up between the last valid entry and isize....
>
> There are lots of weird corner cases around ENOSPC in the directory
> code, perhaps this is just another of them...

<nod>  The next time I reproduce it, I'll send you a metadump.

--D

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx