On Thu, Aug 12, 2021 at 04:39:40PM +0100, Matthew Wilcox wrote:
> I agree with David; we want something lower-level for swap to call into.
> I'd suggest aops->swap_rw and an implementation might well look
> something like:
>
> static ssize_t ext4_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
> {
> 	return iomap_dio_rw(iocb, iter, &ext4_iomap_ops, NULL, 0);
> }

Yes, that might make sense and would also replace the awkward IOCB_SWAP
flag for the write side.

For file systems like ext4 and xfs that have an in-memory block mapping
tree this would be way better than the current version and also support
swap on say multi-device file systems properly.  We'd just need to be
careful to read the extent information in at extent_activate time, by
doing xfs_iread_extents for XFS or the equivalents in other file systems.
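
To make the ordering concern concrete, here is a rough userspace sketch of
the idea, not kernel code.  Every name in it (struct swap_file, struct
swap_aops, toyfs_*, swap_read, swap_write) is a hypothetical stand-in
invented for illustration; only the shape is taken from the discussion
above: generic swap code calls a per-file-system swap_rw hook, and the
file system loads its extent map at activation time so that the I/O path
only ever consults in-memory mappings.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-ins; none of these are real kernel types. */
struct extent {
	size_t file_off;	/* offset within the swap file */
	size_t dev_off;		/* offset on the backing "device" */
	size_t len;
};

struct swap_file;

struct swap_aops {
	int (*swap_activate)(struct swap_file *sf);
	ssize_t (*swap_rw)(struct swap_file *sf, void *buf, size_t len,
			   size_t off, int write);
};

struct swap_file {
	const struct swap_aops *aops;
	unsigned char *dev;		/* backing storage */
	size_t dev_size;
	const struct extent *map;	/* NULL until activation */
	size_t nr_extents;
};

/* Pretend this table lives on disk and must be read in once. */
static const struct extent toyfs_on_disk_extents[] = {
	{ 0, 64, 16 },	/* file bytes 0..15 at device offset 64 */
	{ 16, 8, 16 },	/* file bytes 16..31 at device offset 8 */
};

/* Activation: pull the whole extent map into memory up front. */
int toyfs_swap_activate(struct swap_file *sf)
{
	sf->map = toyfs_on_disk_extents;
	sf->nr_extents = sizeof(toyfs_on_disk_extents) /
			 sizeof(toyfs_on_disk_extents[0]);
	return 0;
}

/* I/O path: only in-memory lookups, never a metadata read. */
ssize_t toyfs_swap_rw(struct swap_file *sf, void *buf, size_t len,
		      size_t off, int write)
{
	unsigned char *p = buf;
	size_t done = 0;

	if (!sf->map)
		return -1;	/* extent map was never loaded */

	while (done < len) {
		const struct extent *e = NULL;
		size_t pos = off + done;
		size_t i, in_ext, chunk;

		for (i = 0; i < sf->nr_extents; i++) {
			const struct extent *c = &sf->map[i];

			if (pos >= c->file_off && pos < c->file_off + c->len) {
				e = c;
				break;
			}
		}
		if (!e)
			break;	/* past the mapped range: short transfer */
		if (e->dev_off + e->len > sf->dev_size)
			return -1;	/* mapping points outside the device */

		in_ext = pos - e->file_off;
		chunk = e->len - in_ext;
		if (chunk > len - done)
			chunk = len - done;

		if (write)
			memcpy(sf->dev + e->dev_off + in_ext, p + done, chunk);
		else
			memcpy(p + done, sf->dev + e->dev_off + in_ext, chunk);
		done += chunk;
	}
	return (ssize_t)done;
}

const struct swap_aops toyfs_aops = {
	.swap_activate	= toyfs_swap_activate,
	.swap_rw	= toyfs_swap_rw,
};

/* Generic swap side: knows nothing about extents, just calls the hook. */
ssize_t swap_write(struct swap_file *sf, const void *buf, size_t len,
		   size_t off)
{
	return sf->aops->swap_rw(sf, (void *)buf, len, off, 1);
}

ssize_t swap_read(struct swap_file *sf, void *buf, size_t len, size_t off)
{
	return sf->aops->swap_rw(sf, buf, len, off, 0);
}
```

In this toy model a swap_read() before activation fails outright, while
after toyfs_swap_activate() a transfer that straddles both extents is
split across the two device ranges without touching "on-disk" metadata.
A real implementation would presumably go through iomap as in the quoted
snippet; the sketch only shows why the extent list has to be in memory
before the first swap I/O is issued.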