Re: [RFC PATCH] NFS: CB_OFFLOAD should return DELAY when no copy state ID matches

On Thu, 2025-02-13 at 12:53 -0500, Olga Kornievskaia wrote:
> On Thu, Feb 13, 2025 at 11:59 AM Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Thu, 2025-02-13 at 11:15 -0500, cel@xxxxxxxxxx wrote:
> > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > 
> > > The NFSv4.2 protocol requires that a client match a CB_OFFLOAD
> > > callback to a COPY reply containing the same copy state ID.
> > > However, it's possible that the order of the callback and reply
> > > processing on the client can cause the CB_OFFLOAD to be received
> > > and processed /before/ the client has dealt with the COPY reply.
> > > 
> > > Currently, in this case, the Linux NFS client will queue a fresh
> > > struct nfs4_copy_state in the CB_OFFLOAD handler.
> > > handle_async_copy() then checks for a matching nfs4_copy_state
> > > before settling down to wait for a CB_OFFLOAD reply.
> > > 
> > > But it would be simpler for the client's callback service to
> > > respond to such a CB_OFFLOAD with "I'm not ready yet" and have the
> > > server send the CB_OFFLOAD again later. This avoids the need for
> > > the client's CB_OFFLOAD processing to allocate an extra struct
> > > nfs4_copy_state -- in most cases that allocation will be tossed
> > > immediately, and it's one less memory allocation that we have to
> > > worry about accidentally leaking or accumulating over time.
> > 
> > Why can't the server just fill an appropriate entry for
> > csa_referring_call_lists<> in the CB_SEQUENCE operation for the
> > CB_OFFLOAD callback? That's the mechanism that is intended to be
> > used to avoid the above kind of race.
> 
> Let's say the Linux server does implement the list, but what about
> other implementations that will not? The client still needs an
> approach to handle CB_OFFLOAD/COPY reply.

There are several cases that need to be handled. Off the top of my
head:
   1. The reply to COPY hasn't yet been processed.
   2. The COPY is complete, and the state has been forgotten.
   3. The stateid presented by CB_OFFLOAD is one that was reused for a
      second COPY request after a previous one completed.

The client will want to send a different error depending on the case
(NFS4ERR_DELAY in the first and third cases, NFS4ERR_BAD_STATEID in the
second).
With csa_referring_call_lists<>, the client can easily distinguish
between the cases and return the right response. Without it, the client
might end up returning NFS4ERR_BAD_STATEID in case 3, or NFS4ERR_DELAY
in case 2, etc...
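
To make the distinction concrete, here is a rough sketch of the decision
the client could make when the server does supply referring calls. This
is not the actual client code: struct referring_call, struct slot_state
and classify_cb_offload() are hypothetical names made up for
illustration, and the only things taken from the protocol are the error
values. It assumes the client can tell, per forward-channel slot, whether
the reply to the most recent request has been processed yet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error values as defined by the NFSv4 protocol. */
enum { NFS4_OK = 0, NFS4ERR_DELAY = 10008, NFS4ERR_BAD_STATEID = 10025 };

/* Hypothetical: one (slot, sequence) pair from csa_referring_call_lists<>. */
struct referring_call {
	uint32_t slotid;
	uint32_t seqid;
};

/* Hypothetical client-side view of a forward-channel slot. */
struct slot_state {
	uint32_t seqid;      /* sequence ID of the latest request on this slot */
	bool reply_pending;  /* true until that request's reply is processed */
};

/* Is the call named by the server still waiting for its reply? */
static bool referring_call_outstanding(const struct slot_state *slots,
				       size_t nslots,
				       const struct referring_call *rc)
{
	if (rc->slotid >= nslots)
		return false;
	return slots[rc->slotid].seqid == rc->seqid &&
	       slots[rc->slotid].reply_pending;
}

/*
 * Cases 1 and 3: the referring COPY is still outstanding, so ask the
 * server to retry later, whether or not an older copy state happens to
 * match the stateid. Case 2: nothing is outstanding and no copy state
 * matches, so the stateid is stale.
 */
static int classify_cb_offload(const struct slot_state *slots, size_t nslots,
			       const struct referring_call *calls,
			       size_t ncalls, bool copy_state_found)
{
	for (size_t i = 0; i < ncalls; i++)
		if (referring_call_outstanding(slots, nslots, &calls[i]))
			return NFS4ERR_DELAY;
	return copy_state_found ? NFS4_OK : NFS4ERR_BAD_STATEID;
}
```

With an empty referring call list (ncalls == 0), the function can only
fall back on copy_state_found, which is exactly the ambiguity described
above: it cannot tell a COPY reply that is still in flight from a
forgotten state.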

So in practice, we want all servers to do the right thing if they want
to avoid confusion over state. The client can't fix these races on its
own.

-- 
Trond Myklebust Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx



