Hi Neil, On Thu, Dec 16, 2021 at 7:07 PM NeilBrown <neilb@xxxxxxx> wrote: > > swap-over-NFS currently has a variety of problems. > > swap writes call generic_write_checks(), which always fails on a swap > file, so it completely fails. > Even without this, various deadlocks are possible - largely due to > improvements in NFS memory allocation (using NOFS instead of ATOMIC) > which weren't tested against swap-out. > > NFS is the only filesystem that has supported fs-based swap IO, and it > hasn't worked for several releases, so now is a convenient time to clean > up the swap-via-filesystem interfaces - we cannot break anything ! > > So the first few patches here clean up and improve various parts of the > swap-via-filesystem code. ->activate_swap() is given a cleaner > interface, a new ->swap_rw is introduced instead of burdening > ->direct_IO, etc. > > Current swap-to-filesystem code only ever submits single-page reads and > writes. These patches change that to allow multi-page IO when adjacent > requests are submitted. Writes are also changed to be async rather than > sync. This substantially speeds up write throughput for swap-over-NFS. > > Some of the NFS patches can land independently of the MM patches. A few > require the MM patches to land first. Thanks for fixing swap-over-NFS! Looks like it passes all the swap-related xfstests except for generic/357 on NFS v4.2. This test checks that we get -EINVAL on a reflinked swapfile, but I'm not sure if there is a way to check for that on the client side but if you have any ideas it would be nice to get that test passing while you're at it! Anna > > Thanks, > NeilBrown > > > --- > > NeilBrown (18): > Structural cleanup for filesystem-based swap > MM: create new mm/swap.h header file. > MM: use ->swap_rw for reads from SWP_FS_OPS swap-space > MM: perform async writes to SWP_FS_OPS swap-space > MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space > MM: submit multipage reads for SWP_FS_OPS swap-space > MM: submit multipage write for SWP_FS_OPS swap-space > MM: Add AS_CAN_DIO mapping flag > NFS: rename nfs_direct_IO and use as ->swap_rw > NFS: swap IO handling is slightly different for O_DIRECT IO > SUNRPC/call_alloc: async tasks mustn't block waiting for memory > SUNRPC/auth: async tasks mustn't block waiting for memory > SUNRPC/xprt: async tasks mustn't block waiting for memory > SUNRPC: remove scheduling boost for "SWAPPER" tasks. > NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS > SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC > NFSv4: keep state manager thread active if swap is enabled > NFS: swap-out must always use STABLE writes. > > > drivers/block/loop.c | 4 +- > fs/fcntl.c | 5 +- > fs/inode.c | 3 + > fs/nfs/direct.c | 56 ++++++---- > fs/nfs/file.c | 25 +++-- > fs/nfs/inode.c | 1 + > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 20 ++++ > fs/nfs/nfs4state.c | 39 ++++++- > fs/nfs/read.c | 4 - > fs/nfs/write.c | 2 + > fs/open.c | 2 +- > fs/overlayfs/file.c | 10 +- > include/linux/fs.h | 2 +- > include/linux/nfs_fs.h | 11 +- > include/linux/nfs_xdr.h | 2 + > include/linux/pagemap.h | 3 +- > include/linux/sunrpc/auth.h | 1 + > include/linux/sunrpc/sched.h | 1 - > include/linux/swap.h | 121 -------------------- > include/linux/writeback.h | 7 ++ > include/trace/events/sunrpc.h | 1 - > mm/madvise.c | 9 +- > mm/memory.c | 3 +- > mm/mincore.c | 1 + > mm/page_alloc.c | 1 + > mm/page_io.c | 189 ++++++++++++++++++++++++++------ > mm/shmem.c | 1 + > mm/swap.h | 140 +++++++++++++++++++++++ > mm/swap_state.c | 32 ++++-- > mm/swapfile.c | 6 + > mm/util.c | 1 + > mm/vmscan.c | 31 +++++- > net/sunrpc/auth.c | 8 +- > net/sunrpc/auth_gss/auth_gss.c | 6 +- > net/sunrpc/auth_unix.c | 10 +- > net/sunrpc/clnt.c | 7 +- > net/sunrpc/sched.c | 29 +++-- > net/sunrpc/xprt.c | 19 ++-- > net/sunrpc/xprtrdma/transport.c | 10 +- > net/sunrpc/xprtsock.c | 8 ++ > 41 files changed, 558 insertions(+), 274 deletions(-) > create mode 100644 mm/swap.h > > -- > Signature >