On Wed, Apr 05, 2023 at 06:16:00PM +0000, Eric Biggers wrote:
> On Wed, Apr 05, 2023 at 09:38:47AM -0700, Darrick J. Wong wrote:
> > > The merkle tree pages are dropped after verification. When page is
> > > dropped xfs_buf is marked as verified. If fs-verity wants to
> > > verify again it will get the same verified buffer. If buffer is
> > > evicted it won't have verified state.
> > >
> > > So, with enough memory pressure buffers will be dropped and need to
> > > be reverified.
> >
> > Please excuse me if this was discussed and rejected long ago, but
> > perhaps fsverity should try to hang on to the merkle tree pages that
> > this function returns for as long as possible until reclaim comes for
> > them?
> >
> > With the merkle tree page lifetimes extended, you then don't need to
> > attach the xfs_buf to page->private, nor does xfs have to extend the
> > buffer cache to stash XBF_VERITY_CHECKED.
>
> Well, all the other filesystems that support fsverity (ext4, f2fs, and btrfs)
> just cache the Merkle tree pages in the inode's page cache. It's an approach
> that I know some people aren't a fan of, but it's efficient and it works.

Which puts pages beyond EOF in the page cache. Given that XFS also
allows persistent block allocation beyond EOF, having both data in
the page cache and blocks beyond EOF that contain unrelated
information is a Real Bad Idea.

Just because putting metadata in the file data address space works
for one filesystem, it doesn't mean it's a good idea or that it works
for every filesystem.

> We could certainly think about moving to a design where fs/verity/ asks the
> filesystem to just *read* a Merkle tree block, without adding it to a cache, and
> then fs/verity/ implements the caching itself. That would require some large
> changes to each filesystem, though, unless we were to double-cache the Merkle
> tree blocks which would be inefficient.

No, that's unnecessary.

All we need is for fsverity to require filesystems to pass it byte
addressable data buffers that are externally reference counted. The
filesystem can take a page reference before mapping the page and
passing the kaddr to fsverity, then unmap and drop the reference
when the merkle tree walk is done as per Andrey's new drop callout.

fsverity doesn't need to care what the buffer is made from, how it
is cached, what its life cycle is, etc. The caching mechanism and
reference counting are entirely controlled by the filesystem callout
implementations, and fsverity only needs to deal with memory buffers
that are guaranteed to live for the entire walk of the merkle
tree....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
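
A minimal userspace sketch of the callout contract described above, to make the ownership split concrete. Every name here (merkle_buf, merkle_ops, read_block, drop_block, toy_fs, verity_walk) is hypothetical and is not the actual fsverity or XFS interface; a plain refcount stands in for a page or xfs_buf reference, and a byte sum stands in for the hash check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBLOCKS    4
#define BLOCK_SIZE 64
#define MAX_DEPTH  8

/* Hypothetical filesystem-side cached block (stand-in for a page or xfs_buf). */
struct fs_block {
	int refcount;                   /* managed by the filesystem only */
	unsigned char data[BLOCK_SIZE];
};

struct toy_fs {
	struct fs_block blocks[NBLOCKS];
};

/* All the verifier sees: an address, a length, and an opaque cookie. */
struct merkle_buf {
	const void *kaddr;              /* valid until drop_block() is called */
	size_t len;
	void *fs_private;               /* never interpreted by the verifier */
};

/* Hypothetical callouts the filesystem hands to the verifier. */
struct merkle_ops {
	int (*read_block)(void *fs, size_t index, struct merkle_buf *out);
	void (*drop_block)(struct merkle_buf *buf);
};

static void toy_fs_init(struct toy_fs *t)
{
	memset(t, 0, sizeof(*t));
	for (size_t i = 0; i < NBLOCKS; i++)
		memset(t->blocks[i].data, (int)(i + 1), BLOCK_SIZE);
}

/* Filesystem side: pin the block, then expose it as plain memory. */
static int toy_read_block(void *fs, size_t index, struct merkle_buf *out)
{
	struct toy_fs *t = fs;

	if (index >= NBLOCKS)
		return -1;
	t->blocks[index].refcount++;
	out->kaddr = t->blocks[index].data;
	out->len = BLOCK_SIZE;
	out->fs_private = &t->blocks[index];
	return 0;
}

/* Filesystem side: unpin; the address must not be used afterwards. */
static void toy_drop_block(struct merkle_buf *buf)
{
	struct fs_block *b = buf->fs_private;

	assert(b->refcount > 0);
	b->refcount--;
	buf->kaddr = NULL;
	buf->fs_private = NULL;
}

/*
 * Verifier side: hold every buffer for the whole walk, then drop them all.
 * Returns the byte sum over the walked blocks, or -1 on a read failure.
 */
static long verity_walk(const struct merkle_ops *ops, void *fs,
			const size_t *path, size_t depth)
{
	struct merkle_buf held[MAX_DEPTH];
	size_t n;
	long sum = 0;

	if (depth > MAX_DEPTH)
		return -1;
	for (n = 0; n < depth; n++) {
		if (ops->read_block(fs, path[n], &held[n]) != 0) {
			sum = -1;
			break;
		}
		const unsigned char *p = held[n].kaddr;
		for (size_t i = 0; i < held[n].len; i++)
			sum += p[i];
	}
	while (n > 0)
		ops->drop_block(&held[--n]);
	return sum;
}
```

The walker never touches the refcount or the cache; it only uses the address between read_block and drop_block, so how the block is cached stays a filesystem decision.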