Re: RFC: Approaches to resolve netfs API interface to NFS multiple completions problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2021-03-31 at 13:49 -0400, David Wysochanski wrote:
> Trond,
> 
> I've been working on getting NFS converted to dhowells new fscache
> and
> netfs APIs and running into a problem with how NFS is designed and it
> involves the NFS pagelist.c / pgio API.  I'd appreciate it if you
> could review and give your thoughts on possible approaches.  I've
> tried to outline some of the possibilities below.  I tried coding
> option #3 and ran into some problems, and it has a serialization
> limitation.  At this point I'm leaning towards option 2, so I'll
> probably try that approach if you don't have time for review or have
> strong thoughts on it.
> 

I am not going through another redesign of the NFS code in order to
accommodate another cachefs design. If netfs needs a refactoring or
redesign of the I/O code then it will be immediately NACKed.

Why does netfs need to know these details about the NFS code anyway?

> Thanks.
> 
> 
> Problem: The NFS pageio interface does not expose a max read length
> that
> we can easily use inside netfs clamp_length() function.  As a result,
> when
> issue_op() is called indicating a single netfs subrequest, this can
> be
> split into
> multiple NFS subrequests / RPCs inside guts of NFS pageio code. 
> Multiple
> NFS subrequests requests leads to multiple completions, and the netfs
> API expects a 1:1 mapping between issue_op() and
> netfs_subreq_terminated() calls.
> 
> Details of the NFS pageio API (see include/linux/nfs_page.h and
> fs/nfs/pagelist.c)
> Details of the netfs API (see include/linux/netfs.h and
> fs/netfs/read_helper.c)
> 
> The NFS pageio API 3 main calls are as follows:
> 1. nfs_pageio_init(): initialize a pageio structure (R/W IO of N
> pages)
> 2. nfs_pageio_add_request(): called for each page to add to an IO
> * Calls nfs_pageio_add_request_mirror -> __nfs_pageio_add_request
>   * __nfs_pageio_add_request may call nfs_pageio_doio() which
> actually
>     sends an RPC over the wire if page cannot be added to the request
>     ("coalesced") due to various factors.  For more details, see
>     nfs_pageio_do_add_request() and all underlying code it calls such
>     as nfs_coalesce_size() and subsequent pgio->pg_ops->pg_test()
> calls
> 3. nfs_pageio_complete() - "complete" the pageio
> * calls nfs_pageio_complete_mirror -> nfs_pageio_doio()
> 
> The NFS pageio API thus may generate multiple over the wire RPCs
> and thus multiple completions even though at the high level only
> one call to nfs_pageio_complete() is made.
> 
> Option 1: Just use NFS pageio API as is, and deal with possible
> multiple
> completions.
> - Inconsistent with netfs design intent
> - Need to keep track of the RPC completion status, and for example,
> if one completes with success and one an error, probably call
> netfs_subreq_terminated() with the error.
> - There's no way for the caller of the NFS pageio API to know how
> many RPCs and thus completions may occur.  Thus, it's unclear how
> one would distinguish between a READ that resulted in a single RPC
> over the wire that completed as a short read, and a READ that
> resulted in multiple RPCs that would each complete separately,
> but would eventually complete
> 
> Option 2: Create a more complex 'clamp_length()' function for NFS,
> taking into account all ways NFS / pNFS code can split a read.
> + Consistent with netfs design intent
> + Multiple "split" requests would be called in parallel (see loop
> inside netfs_readahead, which repeatedly calls
> netfs_rreq_submit_slice)
> - Looks impossible without refactoring of NFS pgio API.  We need
> to prevent nfs_pageio_add_request() from calling nfs_pagio_doio(),
> and return some indication coalesce failed.  In addition, it may
> run into problems with fallback from DS to MDS for example (see
> commit d9156f9f364897e93bdd98b4ad22138de18f7c24).
> 
> Option 3: Utilize NETFS_SREQ_SHORT_READ flag as needed.
> + Consistent with netfs design intent
> - Multiple "split" requests would be serialized (see code
> paths inside netfs_subreq_terminated that check for this flag).
> - Looks impossible without some refactoring of NFS pgio API.
> * Notes: Terminate NFS pageio page based loop at the first call
> to nfs_pageio_doio().  When a READ completes, NFS calls
> netfs_subreq_terminated() with NETFS_SREQ_SHORT_READ
> and is prepared to have the rest of the subrequest be resubmitted.
> Need to somehow fail early or avoid entirely subsequent calls to
> nfs_pagio_doio() for the original request though, and handle
> error status only from the first RPC.
> 
> Option 4: Add some final completion routine to be called near
> bottom of nfs_pageio_complete() and would pass in at least
> netfs_read_subrequest(), possibly nfs_pageio_descriptor.
> + Inconsistent with netfs design intent
> - Would be a new NFS API or call on top of everything
> - Need to handle the "multiple completion with different
> status" problem (see #1).
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux