On Tue, Mar 29, 2022 at 10:52:04AM +0530, Chandan Babu R wrote:
> On 25 Mar 2022 at 03:44, Dave Chinner wrote:
> > On Mon, Mar 21, 2022 at 10:47:46AM +0530, Chandan Babu R wrote:
> >> The maximum file size that can be represented by the data fork extent
> >> counter in the worst case occurs when all extents are 1 block in length
> >> and each block is 1KB in size.
> >>
> >> With XFS_MAX_EXTCNT_DATA_FORK_SMALL representing the maximum extent
> >> count and with 1KB sized blocks, a file can reach up to,
> >> (2^31) * 1KB = 2TB
> >>
> >> This is much larger than the theoretical maximum size of a directory,
> >> i.e. 32GB * 3 = 96GB.
> >>
> >> Since a directory's inode can never overflow its data fork extent
> >> counter, this commit replaces checking the return value of
> >> xfs_iext_count_may_overflow() with calls to ASSERT(error == 0).
> >
> > I'd really prefer that we don't add noise like this to a bunch of
> > call sites. If directories can't overflow the extent count in
> > normal operation, then why are we even calling
> > xfs_iext_count_may_overflow() in these paths? i.e. an overflow would
> > be a sign of an inode corruption, and we should have flagged that
> > long before we do an operation that might overflow the extent count.
> >
> > So, really, I think you should document the directory size
> > constraints at the site where we define all the large extent count
> > values in xfs_format.h, remove the xfs_iext_count_may_overflow()
> > checks from the directory code and replace them with a simple inode
> > verifier check that we haven't got more than 100GB worth of
> > individual extents in the data fork for directory inodes....
>
> I don't think that we could trivially verify if the extents in a
> directory's data fork add up to more than 96GB.

dip->di_nextents tells us how many extents there are in the data
fork, and we know what the block size of the filesystem is, so it
should be pretty easy to calculate a maximum extent count for 96GB
of space. i.e. the absolute maximum valid directory data fork extent
count is (96GB / block size).

> xfs_dinode->di_size tracks the size of XFS_DIR2_DATA_SPACE. This also
> includes holes that could be created by freeing directory entries in a
> single directory block. Also, there is no easy method to determine the
> space occupied by the XFS_DIR2_LEAF_SPACE and XFS_DIR2_FREE_SPACE
> segments of a directory.

Sure there is. We do this sort of calculation for things like
transaction reservations via definitions like XFS_DA_NODE_MAXDEPTH.
That tells us immediately how many blocks can be in the
XFS_DIR2_LEAF_SPACE segment....

We also know the maximum number of individual directory blocks in
the 32GB segment (fixed at 32GB / dir block size), so the free space
array is also a fixed size at (32GB / dir block size / free space
entries per block).

It's easy to just use (96GB / block size), and that will catch most
corruptions with no risk of a false positive detection, but we could
quite easily refine this to something like:

	(data (32GB) +
	 leaf btree blocks (XFS_DA_NODE_MAXDEPTH) +
	 freesp (32GB / free space records per block))
		frags / filesystem block size

> Maybe the following can be added to xfs_dinode_verify(),
>
> if (S_ISDIR(mode) && ((xfs_dinode->di_size + 2 * 32GB) > 96GB))
>	return __this_address

That doesn't validate that the on-disk or in-memory di_nextents
value is within the known valid range. We can do that directly (as
per above), so we shouldn't need a heuristic like this.
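i.e. something like this untested sketch - written from memory, and
xfs_dir_max_dfork_extents() is just a name made up for illustration:

	/*
	 * Maximum valid data fork extent count for a directory inode.
	 * A directory is limited to three 32GB segments, so even when
	 * fragmented into single block extents its data fork can never
	 * index more than 96GB worth of blocks.
	 */
	static xfs_extnum_t
	xfs_dir_max_dfork_extents(
		struct xfs_mount	*mp)
	{
		return (96ULL << 30) >> mp->m_sb.sb_blocklog;
	}

Then, in xfs_dinode_verify(), once the mode and the data fork extent
count have been pulled out of the dinode:

	if (S_ISDIR(mode) && nextents > xfs_dir_max_dfork_extents(mp))
		return __this_address;

That implements the coarse (96GB / block size) bound; tightening it
to the refined calculation above only changes what the helper
returns, not the shape of the verifier check.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx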