On Tue, Jun 02, 2020 at 11:49:36AM -0400, Chris Mason wrote:
> On the btrfs side, I’m storing the fsverity data in the btree, so I’m merkle
> block size agnostic. Since our rollout is going to be x86, we’ll end up
> using the 4k size internally for the current code base.
>
> My recommendation to simplify the merkle tree code would be to just put it
> in slab objects instead pages and leverage recent MM changes to make reclaim
> work well. There’s probably still more to do on that front, but it’s a long
> standing todo item for Josef to shift the btrfs metadata out of the page
> cache, where we have exactly the same problems for exactly the same reasons.

Do you have an idea for how to do that without introducing much extra overhead
to ext4 and f2fs with Merkle tree block size == PAGE_SIZE? Currently they just
cache the Merkle tree pages in the inode's page cache. We don't *have* to do it
that way, but anything that adds additional overhead (e.g. reading data into
pagecache, then copying it into slab allocations, then freeing the pagecache
pages) would be undesirable. We need to keep the overhead minimal.

- Eric