Hi Ilya, Xiubo, Greg,

I'm trying to finish my patches to make ceph work with netfslib and I'm
wondering if snap handling on inodes can be made easier to work with.  Also,
I think there may be a bug in the interaction between ceph_queue_cap_snap()
and writable mmaps.

What I would like to do is to make page/folio->private point at the
ceph_cap_snap struct instead of pointing to ceph_snap_context.  This would
make it easier to fish the metadata details out in ceph when netfslib asks it
to perform a write operation.

Netfslib has the capability to pass a netfs_group struct through the API, and
I currently have this subclassed by ceph_snap_context, but that doesn't
directly carry sufficient information as I presume that's a global thing and
not an inode-specific thing.

However, it looks like capsnaps don't always exist, even on dirty inodes...

So what I'm thinking is:

 (1) Make struct ceph_cap_snap a subclass of netfs_group.  This would allow
     netfslib to manipulate them and attach them to dirty pages and do
     selective writeback.

 (2) Always keep a ceph_cap_snap on a dirty inode.  It can be treated
     specially when it's the only snap and at the head.

 (3) Offload some of the fields from ceph_inode_info into ceph_cap_snap
     (e.g. truncate_size and truncate_seq) and update them directly there.

 (4) On entry to any sort of write routine, see if we need a new capsnap for
     that inode and, if so, create one.  This would include ->write_iter(),
     ->page_mkwrite(), ->setattr() and possibly ->setxattr().

 (5) In queue_realm_cap_snaps(), mark the capsnap as being obsolete and call
     unmap_mapping_pages() on each inode to force ->page_mkwrite() to be
     called[!] on further modification.

     queue_realm_cap_snaps() doesn't then need to create a new capsnap; this
     can be left to the various write routines.

     [!] This would fix the aforementioned potential bug whereby someone can
     continue writing to the inode through a writable mmap even though a new
     snap has happened.

 (6) ceph_writepages() calls netfs_writepages_group() to flush out pages with
     the matching group, stepping through the capsnap list on the inode.

Any thoughts on whether this would work?

If I can do this, I can reduce get_oldest_context() to almost nothing and
don't need the ceph_writeback_ctl struct anymore (I think).

Thanks,
David
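
P.S. To make (1), (4) and (5) a bit more concrete, here is a userspace toy of the shape I have in mind.  It is only a sketch: every name in it (capsnap_sketch, inode_sketch, group_to_capsnap(), capsnap_for_write(), mark_snapshot()) is made up for illustration, the netfs_group here is a stand-in rather than the real netfslib struct, and locking, refcount handling and the unmap_mapping_pages() call are all left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for netfslib's group object; the real struct will differ. */
struct netfs_group {
	int ref;
};

/* Hypothetical capsnap that "subclasses" netfs_group by embedding it (1). */
struct capsnap_sketch {
	struct netfs_group group;
	unsigned long long snap_seq;      /* realm sequence at creation */
	unsigned long long truncate_size; /* moved here from the inode (3) */
	unsigned int truncate_seq;
	bool obsolete;                    /* set when a new snap happens (5) */
	struct capsnap_sketch *older;     /* next-older capsnap on the inode */
};

/* Hypothetical per-inode state: newest capsnap first. */
struct inode_sketch {
	struct capsnap_sketch *head;
	unsigned long long realm_seq;
};

/* Recover the capsnap from the group pointer netfslib would hand back. */
struct capsnap_sketch *group_to_capsnap(struct netfs_group *g)
{
	return (struct capsnap_sketch *)
		((char *)g - offsetof(struct capsnap_sketch, group));
}

/* (5): a snapshot happened; just mark the head obsolete, create nothing. */
void mark_snapshot(struct inode_sketch *inode)
{
	inode->realm_seq++;
	if (inode->head)
		inode->head->obsolete = true;
}

/* (4): called on entry to a write path; reuse or create the head capsnap. */
struct capsnap_sketch *capsnap_for_write(struct inode_sketch *inode)
{
	struct capsnap_sketch *cs = inode->head;

	if (cs && !cs->obsolete)
		return cs;

	cs = calloc(1, sizeof(*cs));
	if (!cs)
		return NULL;
	cs->group.ref = 1;
	cs->snap_seq = inode->realm_seq;
	if (inode->head) {
		/* Carry the truncate state forward from the obsolete capsnap. */
		cs->truncate_size = inode->head->truncate_size;
		cs->truncate_seq = inode->head->truncate_seq;
	}
	cs->older = inode->head;
	inode->head = cs;
	return cs;
}
```

In this toy, a write path would call capsnap_for_write() and attach &cs->group to the dirtied folio; writeback would then walk the ->older chain, flushing one group at a time as in (6).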