Re: [PATCH 2/2] mm: Make swap_readpage() for SWP_FS_OPS use ->direct_IO() not ->readpage()

Christoph Hellwig <hch@xxxxxx> · Fri, 13 Aug 2021 08:54:26 +0200

On Thu, Aug 12, 2021 at 10:48:18AM -0700, Darrick J. Wong wrote:
> On Thu, Aug 12, 2021 at 07:02:33PM +0200, Christoph Hellwig wrote:
> > On Thu, Aug 12, 2021 at 04:39:40PM +0100, Matthew Wilcox wrote:
> > > I agree with David; we want something lower-level for swap to call into.
> > > I'd suggest aops->swap_rw and an implementation might well look
> > > something like:
> > > 
> > > static ssize_t ext4_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
> > > {
> > > 	return iomap_dio_rw(iocb, iter, &ext4_iomap_ops, NULL, 0);
> > > }
> > 
> > Yes, that might make sense and would also replace the awkward IOCB_SWAP
> > flag for the write side.
> > 
> > For file systems like ext4 and xfs that have an in-memory block mapping
> > tree this would be way better than the current version and also support
> > swap on say multi-device file systems properly.  We'd just need to be
> > careful to read the extent information in at swap_activate time,
> > by doing xfs_iread_extents for XFS or the equivalents in other file
> > systems.
> 
> You'd still want to walk the extent map at activation time to reject
> swapfiles with holes, shared extents, etc., right?

Yes.  While the direct I/O code could do allocation at swap I/O time,
that is probably not a good idea due to the memory requirements.
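
To make the activation-time walk concrete, here is a rough userspace
sketch of the kind of check discussed above: reject a swapfile whose
extent list has holes or shared extents.  Everything in it (struct
extent, enum extent_kind, swapfile_check_extents) is hypothetical and
made up for illustration; it is not a kernel interface.  Accepting
unwritten extents is also an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; not a kernel structure. */
enum extent_kind { EXT_MAPPED, EXT_UNWRITTEN, EXT_HOLE, EXT_DELALLOC };

struct extent {
	uint64_t file_off;	/* byte offset of the extent in the file */
	uint64_t len;		/* extent length in bytes */
	enum extent_kind kind;
	bool shared;		/* blocks shared with another file */
};

/*
 * Walk a sorted extent list and return 0 if [0, file_size) is fully
 * covered by allocated, unshared extents, or -1 otherwise.
 * Assumption: mapped and unwritten extents are both acceptable.
 */
int swapfile_check_extents(const struct extent *ext, size_t n,
			   uint64_t file_size)
{
	uint64_t next = 0;	/* first byte not yet covered */

	for (size_t i = 0; i < n && next < file_size; i++) {
		if (ext[i].file_off != next)
			return -1;	/* gap in the mapping: a hole */
		if (ext[i].len == 0)
			return -1;	/* malformed record */
		if (ext[i].kind != EXT_MAPPED && ext[i].kind != EXT_UNWRITTEN)
			return -1;	/* explicit hole or delayed allocation */
		if (ext[i].shared)
			return -1;	/* shared extent */
		next += ext[i].len;
	}

	/* The list must reach the end of the file. */
	return next >= file_size ? 0 : -1;
}
```

With a check like this passing at activation, the swap I/O path would
never have to allocate blocks, which is the point of doing the walk up
front.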