Re: [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3]

On Mon, 2021-02-15 at 15:44 +0000, David Howells wrote:
> Here's a set of patches to do two things:
> 
>  (1) Add a helper library to handle the new VM readahead interface.  This
>      is intended to be used unconditionally by the filesystem (whether or
>      not caching is enabled) and provides a common framework for doing
>      caching, transparent huge pages and, in the future, possibly fscrypt
>      and read bandwidth maximisation.  It also allows the netfs and the
>      cache to align, expand and slice up a read request from the VM in
>      various ways; the netfs need only provide a function to read a stretch
>      of data to the pagecache and the helper takes care of the rest.
> 
>  (2) Add an alternative fscache/cachefiles I/O API that uses the kiocb
>      facility to do async DIO to transfer data to/from the netfs's pages,
>      rather than using readpage with wait queue snooping on one side and
>      vfs_write() on the other.  It also uses less memory, since it doesn't
>      do buffered I/O on the backing file.
> 
>      Note that this uses SEEK_HOLE/SEEK_DATA to locate the data available
>      to be read from the cache.  Whilst this is an improvement on the
>      bmap interface, it still has a problem with regard to a modern
>      extent-based filesystem inserting or removing bridging blocks of
>      zeros.  Fixing that requires a much greater overhaul.
> 
> This is a step towards overhauling the fscache API.  The change is opt-in
> on the part of the network filesystem.  A netfs should not try to mix the
> old and the new API because of conflicting ways of handling pages and the
> PG_fscache page flag and because it would be mixing DIO with buffered I/O.
> Further, the helper library can't be used with the old API.
> 
> This does not change any of the fscache cookie handling APIs or the way
> invalidation is done.
> 
> In the near term, I intend to deprecate and remove the old I/O API
> (fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(),
> fscache_write_page() and fscache_uncache_page()) and eventually replace
> most of fscache/cachefiles with something simpler and easier to follow.
> 
> The patchset contains five parts:
> 
>  (1) Some helper patches, including provision of an ITER_XARRAY iov
>      iterator and a function to do readahead expansion.
> 
>  (2) Patches to add the netfs helper library.
> 
>  (3) A patch to add the fscache/cachefiles kiocb API.
> 
>  (4) Patches to add support in AFS for this.
> 
>  (5) Patches from Jeff Layton to add support in Ceph for this.
> 
> Dave Wysochanski also has patches for NFS for this, though they're not
> included on this branch as there's an issue with PNFS.
> 
> With this, AFS without a cache passes all expected xfstests; with a cache,
> there's an extra failure, but that's also there before these patches.
> Fixing that probably requires a greater overhaul.  Ceph and NFS also pass
> the expected tests.
> 
> These patches can also be found at:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib
> 
> For diffing reference, the tag for the 9th Feb pull request is
> fscache-ioapi-20210203 and can be found in the same repository.
> 
> 
> 
> Changes
> =======
> 
>  (v3) Rolled in the bug fixes.
> 
>       Adjusted the functions that unlock and wait for PG_fscache according
>       to Linus's suggestion.
> 
>       Hold a ref on a page when PG_fscache is set as per Linus's
>       suggestion.
> 
>       Dropped NFS support and added Ceph support.
> 
>  (v2) Fixed some bugs and added NFS support.
> 
> 
> References
> ==========
> 
> These patches have been published for review before, firstly as part of a
> larger set:
> 
> Link: https://lore.kernel.org/linux-fsdevel/158861203563.340223.7585359869938129395.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> 
> Link: https://lore.kernel.org/linux-fsdevel/159465766378.1376105.11619976251039287525.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/159465784033.1376674.18106463693989811037.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/159465821598.1377938.2046362270225008168.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> 
> Link: https://lore.kernel.org/linux-fsdevel/160588455242.3465195.3214733858273019178.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> 
> Then as a cut-down set:
> 
> Link: https://lore.kernel.org/linux-fsdevel/161118128472.1232039.11746799833066425131.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> 
> Link: https://lore.kernel.org/linux-fsdevel/161161025063.2537118.2009249444682241405.stgit@xxxxxxxxxxxxxxxxxxxxxx/
> 
> 
> Proposals/information about the design have been published here:
> 
> Link: https://lore.kernel.org/lkml/24942.1573667720@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/2758811.1610621106@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/1441311.1598547738@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/160655.1611012999@xxxxxxxxxxxxxxxxxxxxxx/
> 
> And requests for information:
> 
> Link: https://lore.kernel.org/linux-fsdevel/3326.1579019665@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/4467.1579020509@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/3577430.1579705075@xxxxxxxxxxxxxxxxxxxxxx/
> 
> The NFS parts, though not included here, have been tested by someone who's
> using fscache in production:
> 
> Link: https://listman.redhat.com/archives/linux-cachefs/2020-December/msg00000.html
> 
> I've posted partial patches to try and help 9p and cifs along:
> 
> Link: https://lore.kernel.org/linux-fsdevel/1514086.1605697347@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-cifs/1794123.1605713481@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-fsdevel/241017.1612263863@xxxxxxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-cifs/270998.1612265397@xxxxxxxxxxxxxxxxxxxxxx/
> 
> David
> ---
> David Howells (27):
>       iov_iter: Add ITER_XARRAY
>       mm: Add an unlock function for PG_private_2/PG_fscache
>       mm: Implement readahead_control pageset expansion
>       vfs: Export rw_verify_area() for use by cachefiles
>       netfs: Make a netfs helper module
>       netfs, mm: Move PG_fscache helper funcs to linux/netfs.h
>       netfs, mm: Add unlock_page_fscache() and wait_on_page_fscache()
>       netfs: Provide readahead and readpage netfs helpers
>       netfs: Add tracepoints
>       netfs: Gather stats
>       netfs: Add write_begin helper
>       netfs: Define an interface to talk to a cache
>       netfs: Hold a ref on a page when PG_private_2 is set
>       fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
>       afs: Disable use of the fscache I/O routines
>       afs: Pass page into dirty region helpers to provide THP size
>       afs: Print the operation debug_id when logging an unexpected data version
>       afs: Move key to afs_read struct
>       afs: Don't truncate iter during data fetch
>       afs: Log remote unmarshalling errors
>       afs: Set up the iov_iter before calling afs_extract_data()
>       afs: Use ITER_XARRAY for writing
>       afs: Wait on PG_fscache before modifying/releasing a page
>       afs: Extract writeback extension into its own function
>       afs: Prepare for use of THPs
>       afs: Use the fs operation ops to handle FetchData completion
>       afs: Use new fscache read helper API
> 
> Jeff Layton (6):
>       ceph: disable old fscache readpage handling
>       ceph: rework PageFsCache handling
>       ceph: fix fscache invalidation
>       ceph: convert readpage to fscache read helper
>       ceph: plug write_begin into read helper
>       ceph: convert ceph_readpages to ceph_readahead
> 
> 
>  fs/Kconfig                    |    1 +
>  fs/Makefile                   |    1 +
>  fs/afs/Kconfig                |    1 +
>  fs/afs/dir.c                  |  225 ++++---
>  fs/afs/file.c                 |  470 ++++---------
>  fs/afs/fs_operation.c         |    4 +-
>  fs/afs/fsclient.c             |  108 +--
>  fs/afs/inode.c                |    7 +-
>  fs/afs/internal.h             |   58 +-
>  fs/afs/rxrpc.c                |  150 ++---
>  fs/afs/write.c                |  610 +++++++++--------
>  fs/afs/yfsclient.c            |   82 +--
>  fs/cachefiles/Makefile        |    1 +
>  fs/cachefiles/interface.c     |    5 +-
>  fs/cachefiles/internal.h      |    9 +
>  fs/cachefiles/rdwr2.c         |  412 ++++++++++++
>  fs/ceph/Kconfig               |    1 +
>  fs/ceph/addr.c                |  535 ++++++---------
>  fs/ceph/cache.c               |  125 ----
>  fs/ceph/cache.h               |  101 +--
>  fs/ceph/caps.c                |   10 +-
>  fs/ceph/inode.c               |    1 +
>  fs/ceph/super.h               |    1 +
>  fs/fscache/Kconfig            |    1 +
>  fs/fscache/Makefile           |    3 +-
>  fs/fscache/internal.h         |    3 +
>  fs/fscache/page.c             |    2 +-
>  fs/fscache/page2.c            |  117 ++++
>  fs/fscache/stats.c            |    1 +
>  fs/internal.h                 |    5 -
>  fs/netfs/Kconfig              |   23 +
>  fs/netfs/Makefile             |    5 +
>  fs/netfs/internal.h           |   97 +++
>  fs/netfs/read_helper.c        | 1169 +++++++++++++++++++++++++++++++++
>  fs/netfs/stats.c              |   59 ++
>  fs/read_write.c               |    1 +
>  include/linux/fs.h            |    1 +
>  include/linux/fscache-cache.h |    4 +
>  include/linux/fscache.h       |   40 +-
>  include/linux/netfs.h         |  195 ++++++
>  include/linux/pagemap.h       |    3 +
>  include/net/af_rxrpc.h        |    2 +-
>  include/trace/events/afs.h    |   74 +--
>  include/trace/events/netfs.h  |  201 ++++++
>  mm/filemap.c                  |   20 +
>  mm/readahead.c                |   70 ++
>  net/rxrpc/recvmsg.c           |    9 +-
>  47 files changed, 3473 insertions(+), 1550 deletions(-)
>  create mode 100644 fs/cachefiles/rdwr2.c
>  create mode 100644 fs/fscache/page2.c
>  create mode 100644 fs/netfs/Kconfig
>  create mode 100644 fs/netfs/Makefile
>  create mode 100644 fs/netfs/internal.h
>  create mode 100644 fs/netfs/read_helper.c
>  create mode 100644 fs/netfs/stats.c
>  create mode 100644 include/linux/netfs.h
>  create mode 100644 include/trace/events/netfs.h
> 
> 
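
As an aside on point (2) of the cover letter: the SEEK_HOLE/SEEK_DATA
probing used to find what is present in the cache can be illustrated from
userspace with lseek(2).  This is only a rough sketch, not the cachefiles
code.  The helper names data_bytes_in_fd() and data_bytes_in_file() are
made up for illustration, and the fallback SEEK_DATA/SEEK_HOLE values are
the Linux ones and are an assumption on any other system.

```c
#define _GNU_SOURCE             /* exposes SEEK_DATA/SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for the case where the feature macro came too late (Linux values). */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: add up the bytes that the backing filesystem
 * reports as data (i.e. not holes) in an open file.
 * Returns the byte count, or -1 on error.
 */
static off_t data_bytes_in_fd(int fd)
{
	off_t end = lseek(fd, 0, SEEK_END);
	off_t pos = 0, total = 0;

	if (end < 0)
		return -1;

	while (pos < end) {
		/* Find the start of the next data extent at or after pos. */
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)	/* only a hole remains */
				break;
			return -1;
		}

		/* Find where that extent ends; EOF counts as a hole. */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;

		assert(hole > data);	/* guarantees forward progress */
		total += hole - data;
		pos = hole;
	}
	return total;
}

/* Hypothetical convenience wrapper that takes a path. */
static off_t data_bytes_in_file(const char *path)
{
	int fd = open(path, O_RDONLY);
	off_t n;

	if (fd < 0)
		return -1;
	n = data_bytes_in_fd(fd);
	close(fd);
	return n;
}
```

The result is whatever the backing filesystem chooses to report.  As the
cover letter notes, an extent-based filesystem may insert or remove
bridging blocks of zeros, so a count like this cannot be taken as an exact
record of what was written.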

Thanks David,

I did an xfstests run on ceph with a kernel based on this and it seemed
to do fine. I'll plan to pull this into the ceph-client/testing branch
and run it through the ceph kclient test harness. There are only a few
differences from the last run we did, so I'm not expecting big changes,
but I'll keep you posted.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



