Hi Johannes,

> Wouldn't it make more sense to decouple filesystems from "paginess",
> as David puts it, now instead? Avoid the risk of doing it twice, avoid
> the more questionable churn inside mm code, avoid the confusing
> proximity to the page and its API in the long-term...

Let me seize that opening.

I've been working on doing this for network filesystems - at least those
that want to buy in. If you look here:

	https://lore.kernel.org/ceph-devel/162687506932.276387.14456718890524355509.stgit@xxxxxxxxxxxxxxxxxxxxxx/T/#m23428c315a77d8c5206b9646bf74c8ef18d4d38c

the current state of which is here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-folio-regions

I've been looking at abstracting anything to do with pages out of the
netfs and putting that stuff into a helper library. The library handles
all the caching stuff and just presents the filesystem with requests to
read into/write from an iov_iter. The filesystem doesn't then see pages
at all.

The motivation behind this is to make content encryption and compression
transparent and automatically available to all participating filesystems
- with the requirement that the data stored in the local disk cache
(i.e. fscache) is *also* encrypted.

I have content encryption working for basic read and write on afs and
Jeff Layton is looking at how to make it work with ceph - but it's very
much a work in progress and things like truncate and mmap don't yet work
with it.

Anyway, the library, as I'm currently writing it, maintains a list of
byte-range dirty regions on each inode, where a dirty region may span
multiple folios and a folio may be contributory to multiple regions. The
fact that pages are involved is really then merely an implementation
detail.

Content encryption/compression blocks may be any power-of-2 size, from 2
bytes to megabytes, and this need bear no relation to page size. The
library calls the crypto hooks for each crypto block in the chunk[*] to
be crypted.
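As a rough illustration only of the arithmetic described above (this is
not code from the netfs library, and every name in it is hypothetical),
a byte-range dirty region can be rounded outwards to power-of-2 crypto
block boundaries and, separately, mapped onto the page-sized units it
touches. The sketch assumes half-open ranges [start, end) and fixed-size
pages rather than variable-size folios:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: a dirty byte range [start, end) on an inode. */
struct dirty_region {
	uint64_t start;	/* first dirty byte */
	uint64_t end;	/* one past the last dirty byte */
};

/*
 * Round a region outwards to crypto block boundaries.  bsize_log2 is
 * log2 of the crypto block size, e.g. 1 for 2-byte blocks or 16 for
 * 64KiB blocks.  It is independent of the page size.
 */
static struct dirty_region crypt_span(struct dirty_region r,
				      unsigned int bsize_log2)
{
	uint64_t mask = ((uint64_t)1 << bsize_log2) - 1;
	struct dirty_region out;

	assert(r.start < r.end);
	out.start = r.start & ~mask;		/* round down */
	out.end = (r.end + mask) & ~mask;	/* round up */
	return out;
}

/* Number of crypto blocks in a span already aligned by crypt_span(). */
static uint64_t crypt_block_count(struct dirty_region span,
				  unsigned int bsize_log2)
{
	return (span.end - span.start) >> bsize_log2;
}

/* Index of the first and last page-sized unit a region touches. */
static void page_span(struct dirty_region r, unsigned int page_shift,
		      uint64_t *first, uint64_t *last)
{
	assert(r.start < r.end);
	*first = r.start >> page_shift;
	*last = (r.end - 1) >> page_shift;
}
```

With a dirty region of bytes 5000-8999 and 64KiB crypto blocks, the span
to be crypted is bytes 0-65535 (one block), even though the region only
touches pages 1 and 2 when pages are 4KiB. That is the case where the
page being poked sits in the middle of a larger block.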

[*] Terminology is such fun. I have to deal with pages, crypto blocks,
    object layout blocks, I/O blocks (rsize/wsize settings), regions.

In fact ->readpage(), ->writepage() and ->launder_page() are difficult
when I may be required to deal with blocks larger than the size of a
page. The page being poked may be in the middle of a block, so I'm
endeavouring to work around that.

Using the regions should allow me to 'launder' an inode before
invalidating the pages attached to it, and the dirty region objects can
act instead of the dirty, writeback and fscache flags on a page.

I've been building this on top of Willy's folio patchset, and so I've
paused for the moment whilst I wait to see what becomes of that. If
folios don't get in or get renamed, I have a load of reworking to do.

Does this sound like something you'd be interested in looking at more
generally than just network filesystems?

David