On Thu, May 09, 2024 at 01:02:50PM -0700, Darrick J. Wong wrote:
> Thinking about this further, I think building the merkle tree becomes a
> lot more difficult than the current design.  At first I thought of
> reserving a static partition in the attr fork address range, but got
> bogged down in figuring out how big the static partition has to be.
>
> Downthread I realized that the maximum size of a merkle tree is actually
> ULONG_MAX blocks, which means that on a 64-bit machine there effectively
> is no limit.

Do we care about using up the limit?  Remember that in ext4/f2fs the
merkle tree is stored in what XFS calls the data fork, so the file data
plus the merkle tree have to fit into the size limit, be that S64_MAX or
a lower limit imposed by the page cache.

And besides being the limit imposed by the currently most common
implementation (I haven't checked btrfs as the only other one), that
does seem like a pretty reasonable one.

> That led me to the idea of dynamic partitioning, where we find a sparse
> part of the attr fork fileoff range and use that.  That burns a lot less
> address range but means that we cannot elide merkle tree blocks that
> contain entirely hash(zeroes), because elided blocks become sparse holes
> in the attr fork, and xfs_bmap_first_unused can still find those holes.

xfs_bmap_first_unused currently finds them.  It should not, as its
callers are limited to 32-bit addressing.  I'll send a patch to make
that clear.

> Setting even /that/ aside, how would we allocate/map the range?

IFF we stick to a static range (which I think still makes sense), that
range would be statically reserved and should exist if the VERITY bit is
set on the inode, and the size is calculated from the file size.  If not
we'd indeed need to record the mapping somewhere, and an attr would be
the right place.  It still feels like going down a rabbit hole for no
obvious benefit to me.