Re: [RFC PATCH] NFS: CB_OFFLOAD should return DELAY when no copy state ID matches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2025-02-13 at 13:47 -0500, Chuck Lever wrote:
> On 2/13/25 1:44 PM, Trond Myklebust wrote:
> > On Thu, 2025-02-13 at 12:53 -0500, Olga Kornievskaia wrote:
> > > On Thu, Feb 13, 2025 at 11:59 AM Trond Myklebust
> > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Thu, 2025-02-13 at 11:15 -0500, cel@xxxxxxxxxx wrote:
> > > > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > 
> > > > > The NFSv4.2 protocol requires that a client match a
> > > > > CB_OFFLOAD
> > > > > callback to a COPY reply containing the same copy state ID.
> > > > > However,
> > > > > it's possible that the order of the callback and reply
> > > > > processing
> > > > > on
> > > > > the client can cause the CB_OFFLOAD to be received and
> > > > > processed
> > > > > /before/ the client has dealt with the COPY reply.
> > > > > 
> > > > > Currently, in this case, the Linux NFS client will queue a
> > > > > fresh
> > > > > struct nfs4_copy_state in the CB_OFFLOAD handler.
> > > > > handle_async_copy() then checks for a matching
> > > > > nfs4_copy_state
> > > > > before settling down to wait for a CB_OFFLOAD reply.
> > > > > 
> > > > > But it would be simpler for the client's callback service to
> > > > > respond
> > > > > to such a CB_OFFLOAD with "I'm not ready yet" and have the
> > > > > server
> > > > > send the CB_OFFLOAD again later. This avoids the need for the
> > > > > client's CB_OFFLOAD processing to allocate an extra struct
> > > > > nfs4_copy_state -- in most cases that allocation will be
> > > > > tossed
> > > > > immediately, and it's one less memory allocation that we have
> > > > > to
> > > > > worry about accidentally leaking or accumulating over time.
> > > > 
> > > > Why can't the server just fill an appropriate entry for
> > > > csa_referring_call_lists<> in the CB_SEQUENCE operation for the
> > > > CB_OFFLOAD callback? That's the mechanism that is intended to
> > > > be
> > > > used
> > > > to avoid the above kind of race.
> > > 
> > > Let's say the linux server does implement the list but what about
> > > other implementations that will not. The client still needs an
> > > approach to handle CB_OFFLOAD/COPY reply.
> > > > 
> > 
> > There are several cases that need to be handled. Off the top of my
> > head:
> >    1. The reply to COPY hasn't yet been processed.
> >    2. The COPY is complete, and the state has been forgotten.
> >    3. The stateid presented by CB_OFFLOAD is one that was reused
> > for a
> >       second COPY request after a previous one completed.
> > 
> > The client will want to send different errors for either case
> > (NFS4ERR_DELAY in the first and third case, NFS4ERR_BAD_STATEID in
> > the
> > second).
> > With csa_referring_call_lists<>, the client can easily distinguish
> > between the cases and return the right response. Without it, the
> > client
> > might end up returning NFS4ERR_BAD_STATEID in case 3, or
> > NFS4ERR_DELAY
> > in case 2, etc...
> > 
> > So in practice, we want all servers to do the right thing if they
> > want
> > to avoid confusion over state. The client can't fix these races on
> > its
> > own.
> > 
> 
> We are currently living in a world where all NFSD-based servers do
> not
> return referring calls. I think we need to understand what the client
> should do in those cases.


My answer is: not try to fix that which cannot be fixed.
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux